noirello / bonsai

Simple Python 3 module for LDAP, using libldap2 and winldap C libraries.
MIT License

SizeLimitError exception #92

Closed by johnwilliams57-nhs 1 week ago

johnwilliams57-nhs commented 1 week ago

I'm doing a query that could return thousands of results, but I just want the first 250.

When I try to use search with a sizelimit, though, it just raises SizeLimitError, and the exception doesn't contain the results.

try:
    results = await ldap.conn.search(
        base="ou=organizations,dc=example,dc=com",
        scope=LDAPSearchScope.SUBTREE,
        filter_exp=f"(&(objectClass=*)(o=*{search_term}*))",
        sizelimit=250,
    )
except SizeLimitError as e:
    raise Exception("Search exceeded size limit (250) and no results were returned") from e

I've also tried using paged_search with a sizelimit of 0 (and of 250) and a page_size of 250, but that raised the same exception.

Can somebody enlighten me on how to achieve this, please?

noirello commented 1 week ago

Hi, so something like this:

    results = await ldap.conn.paged_search(
        base="ou=organizations,dc=example,dc=com",
        scope=LDAPSearchScope.SUBTREE,
        filter_exp=f"(&(objectClass=*)(o=*{search_term}*))",
        page_size=250,
    )
    for entry in results:
        ...

Does this also raise a SizeLimitError?

johnwilliams57-nhs commented 1 week ago

Hi, thanks for getting back to me!

Yes, I sometimes get the exception with paged_search.

I think my server has a size limit of 5000.

So the above works OK for queries that return fewer than 5000 results, but above that I get:

Size limit exceeded. This search operation has sent the maximum of 5000 entries to the client (0x0004 [4])

Also, if I set sizelimit to 0 and page_size to 250, isn't it going to bring thousands of entries back over the wire, then discard all but the first 250?

noirello commented 1 week ago

Yes, you can't get more results than the server-side size limit. Not even with paged search. When you try to get the 5001st entry, it will raise an exception.

Paged search fetches page_size entries from the server with a single request. By default, the paged_search method returns a special iterator object that initiates a new request to the server when it reaches the last entry of the current page. You can disable this automatic page acquiring. See details in the docs.
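To illustrate why lazy page fetching avoids pulling thousands of entries over the wire, here is a small simulation in plain Python. `FakePagedResults` is a hypothetical stand-in invented for this sketch, not bonsai's iterator; it only mimics the "request a page when the local buffer is empty" behaviour. Whether bonsai triggers the next request at exactly the same moment is an assumption to check against the docs, and an async connection would need an async equivalent of `islice`.

```python
from itertools import islice


class FakePagedResults:
    """Hypothetical stand-in for a lazily paged result iterator (not bonsai code)."""

    def __init__(self, total, page_size):
        self.total = total              # entries matching the query on the "server"
        self.page_size = page_size
        self.pages_requested = 0        # counts simulated round trips
        self._buffer = []
        self._next_index = 0

    def _fetch_page(self):
        # Simulates one request that returns at most page_size entries.
        self.pages_requested += 1
        end = min(self._next_index + self.page_size, self.total)
        self._buffer = [f"entry-{i}" for i in range(self._next_index, end)]
        self._next_index = end

    def __iter__(self):
        return self

    def __next__(self):
        if not self._buffer:
            if self._next_index >= self.total:
                raise StopIteration
            self._fetch_page()
        return self._buffer.pop(0)


results = FakePagedResults(total=12000, page_size=250)

# islice stops pulling from the iterator after 250 items,
# so the simulated second page is never requested.
first_250 = list(islice(results, 250))
print(len(first_250), results.pages_requested)
```

In this simulation only one page is requested, because iteration stops before the buffer has to be refilled.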

You might need to use virtual_list_search to get a very specific set of entries for your query.
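A rough sketch of what that could look like for "the first N entries" follows. The helper `first_n_entries` and its parameters are hypothetical names for this example. The keyword arguments (`sort_order`, `offset`, `before_count`, `after_count`, `est_list_count`) and the `(entries, control)` return value follow my reading of the bonsai docs for `virtual_list_search`, so please verify them there. It also assumes the server supports the virtual list view and server-side sort controls, which is not a given.

```python
async def first_n_entries(conn, base, scope, filter_exp, n, sort_attr):
    """Ask the server for only the first n entries of the sorted result list.

    conn is assumed to be an open bonsai asyncio connection; scope is an
    LDAPSearchScope value (or the equivalent integer).
    """
    entries, ctrl = await conn.virtual_list_search(
        base,
        scope,
        filter_exp,
        sort_order=[sort_attr],   # a virtual list view needs a sort order
        offset=1,                 # target the first entry of the sorted list
        before_count=0,           # nothing before the target
        after_count=n - 1,        # target plus n - 1 following entries = n
        est_list_count=0,         # 0: no estimate of the list size
    )
    # ctrl is the response control returned alongside the entries.
    return entries, ctrl


# Usage inside a coroutine, with names taken from the thread above:
# entries, ctrl = await first_n_entries(
#     ldap.conn, "ou=organizations,dc=example,dc=com",
#     LDAPSearchScope.SUBTREE, f"(&(objectClass=*)(o=*{search_term}*))",
#     250, "o",
# )
```

Sorting on `o` is just a guess based on the filter in the original query; any attribute the server can sort on should do.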