collective / pas.plugins.ldap

Zope (and Plone) PAS Plugin providing users and groups from LDAP directory
http://pypi.python.org/pypi/pas.plugins.ldap

Plugin creates insane amount of LDAP queries #30

Closed datakurre closed 6 years ago

datakurre commented 7 years ago

I'm sorry for the provocative title.

I wonder, would you have any unimplemented ideas for reducing the number of LDAP queries made by pas.plugins.ldap and node.ext.ldap?

The current implementation makes far too many queries (the cache only helps with repeated queries). Most of them may be due to the design of PAS (it's hard to, e.g., query details for multiple groups at once), but some also come from node.ext.ldap. For example, each DN is queried multiple times with a different set of attrs (and finally LDAPAttributesBehavior queries for "*" attrs). In short, it should be faster to make a few big queries than a lot of small ones.

I'm currently working on an in-house fork to reduce the number of queries by any means possible. Any ideas are welcome, and I'd be happy to contribute back if a generic solution emerges.
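To illustrate the "few big queries" idea, here is a minimal sketch of combining several lookups into a single OR filter. The helper names `escape_filter_value` and `batched_filter` are hypothetical and are not part of pas.plugins.ldap or node.ext.ldap; the escaping follows the LDAP filter string rules of RFC 4515.

```python
def escape_filter_value(value):
    """Escape a value for use inside an LDAP search filter (RFC 4515)."""
    replacements = {
        "\\": r"\5c",
        "*": r"\2a",
        "(": r"\28",
        ")": r"\29",
        "\x00": r"\00",
    }
    return "".join(replacements.get(ch, ch) for ch in value)


def batched_filter(attr, values):
    """Build one OR filter that matches any of the given values.

    Hypothetical helper: one search with this filter would stand in
    for len(values) separate searches.
    """
    values = list(values)
    if not values:
        raise ValueError("at least one value is required")
    clauses = "".join(
        "({}={})".format(attr, escape_filter_value(v)) for v in values
    )
    return "(|{})".format(clauses)


# Three group lookups expressed as a single filter string.
print(batched_filter("cn", ["editors", "reviewers", "site (admins)"]))
```

Whether PAS lets a plugin collect the lookups before issuing them is a separate question; the sketch only shows what the combined filter would look like.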

jensens commented 7 years ago

Yes, caching helps and it's the recommended way. Using plain node.ext.ldap is far more efficient, but even there optimizations are possible.

datakurre commented 7 years ago

My first iteration is to enforce that all queries are done with the configured attributes (with the same attrlist for the same object), so each object is only searched once.
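Roughly, the idea can be sketched like this. It is a simplified illustration and not the code from the fork: `CountingDirectory` is a hypothetical stand-in for the LDAP connection, and `NormalizedReader` is a hypothetical wrapper that always queries with the configured attrlist and narrows the result afterwards.

```python
class CountingDirectory:
    """Hypothetical stand-in for an LDAP connection; counts searches."""

    def __init__(self, entries):
        self.entries = entries
        self.searches = 0

    def read(self, dn, attrlist):
        self.searches += 1
        entry = self.entries[dn]
        return {k: v for k, v in entry.items() if k in attrlist}


class NormalizedReader:
    """Always fetch the configured attrlist, so each DN is read once."""

    def __init__(self, directory, configured_attrs):
        self.directory = directory
        self.configured_attrs = frozenset(configured_attrs)
        self._cache = {}

    def read(self, dn, attrlist=None):
        # The query itself ignores the caller's attrlist ...
        if dn not in self._cache:
            self._cache[dn] = self.directory.read(dn, self.configured_attrs)
        entry = self._cache[dn]
        # ... and the cached entry is narrowed afterwards.
        if attrlist is None:
            return dict(entry)
        return {k: v for k, v in entry.items() if k in attrlist}


dn = "uid=jane,ou=people,dc=example,dc=org"
directory = CountingDirectory({
    dn: {
        "uid": ["jane"],
        "mail": ["jane@example.org"],
        "memberOf": ["cn=editors,ou=groups,dc=example,dc=org"],
    }
})
reader = NormalizedReader(directory, ["uid", "mail", "memberOf"])

reader.read(dn, ["uid"])
reader.read(dn, ["mail", "memberOf"])
reader.read(dn)
print(directory.searches)  # three reads, one search
```

The trade-off is that the first read of every object fetches all configured attributes, even if the caller only wanted one of them.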

jensens commented 7 years ago

Sounds like a good idea. This may result in more complex queries than are needed in the end, which might slow things down while iterating, but it ensures that all data is loaded up front and avoids additional queries later.

datakurre commented 7 years ago

Yes. That may make my node.ext.ldap changes unsuitable for upstream. But with PAS, the architecture ensures that my queries cannot be too complex (it always needs all mapped properties and group memberships).

rnixx commented 7 years ago

Can you provide more details on how your LDAP is structured, especially the group mapping? And could you provide a log of the queries made?

djay commented 6 years ago

@datakurre any progress on this?

datakurre commented 6 years ago

@djay Yes and no. I did normalize and optimize the queries in my fork, but that resulted in result sets so big that memcached libraries started failing. I tried several and believed I had found the winner in https://pypi.python.org/pypi/libmc/1.1.0, but then our users started seeing random "Unauthorized" pages under higher load. We had to fall back to plone.app.ldap without groups, and are still thinking about what to do next.

Our LDAP used memberOf attributes for linking users to groups. I don't think there was anything special in its structure.

So, I'm closing this issue, because I did not manage to provide a log of the queries. "Insane" was too provocative, because when I wrote it, I was testing with Plone in debug mode, where PAS caching was not active.
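As a rough illustration of what that mapping means: with memberOf, the group DNs are stored on the user entry itself, so the memberships can be read from the same result as the user's other attributes. The helper `groups_from_entry` and the sample entry below are hypothetical; the entry is shaped like a python-ldap result (attribute name mapped to a list of bytes values), which is an assumption about the client library.

```python
def groups_from_entry(attrs):
    """Return the group DNs listed in a user entry's memberOf values.

    Attribute names are compared case-insensitively, because servers
    may return the name with different capitalization.
    """
    for name, values in attrs.items():
        if name.lower() == "memberof":
            return [
                v.decode("utf-8") if isinstance(v, bytes) else v
                for v in values
            ]
    return []


# Hypothetical user entry as a client library might return it.
user_attrs = {
    "uid": [b"jane"],
    "memberOf": [
        b"cn=editors,ou=groups,dc=example,dc=org",
        b"cn=reviewers,ou=groups,dc=example,dc=org",
    ],
}
print(groups_from_entry(user_attrs))
```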

If you have resources for logging and benchmarking, you could still try out my old forks