4teamwork / ftw.solr

Solr integration for Plone

Patch reindexObjectSecurity to optimize indexing of large trees. #124

Closed lukasgraf closed 5 years ago

lukasgraf commented 5 years ago

This re-applies the patch we originally had in the ftw.solr 1.x line.

This patch optimizes CatalogAware.reindexObjectSecurity() so that it performs substantially better for large trees of objects. The optimization relies on the fact that, when recursively reindexing security for a subtree, recursion can safely be terminated as soon as an object is encountered whose indexed security did not change.

Without this patch, and

With this patch, irrespective of the version of collective.solr,


Reasoning why terminating recursion early is safe

(copied from the respective docstring)

CMFCatalogAware.reindexObjectSecurity() needs to be recursive because changes to an object's security may affect contained subobjects.

Indexed security for objects in Plone can only be influenced by their parents via some kind of inheritance. There are exactly two inheritance mechanisms in play:

  • Acquisition of permissions via the obj's security settings (manage_access)
  • Inheritance of local roles

(An obj's security settings can be managed indirectly via workflows. This doesn't matter here, though: it's irrelevant how exactly they came to be.)

In both of these cases, only the immediate parent of a subobject is relevant for inheritance. Therefore, if an object's indexed security didn't experience any changes, neither can any of its subobjects - recursively.

Because of this, downstream propagation can be stopped as soon as an object is encountered whose indexed security didn't change.
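To make the argument concrete, here is a minimal sketch of the early-termination idea on a toy object tree. It is not the ftw.solr patch itself: `Node`, `effective_roles()` and `reindex_security()` are hypothetical names invented for this illustration, and the "index" is just an attribute on each node. The only assumption carried over from the reasoning above is that an object's indexed security depends on its own settings plus its immediate parent.

```python
class Node:
    """Toy stand-in for a content object (hypothetical, not a Plone class)."""

    def __init__(self, name, local_roles=(), inherit=True):
        self.name = name
        self.local_roles = frozenset(local_roles)
        self.inherit = inherit   # whether roles are inherited from the parent
        self.parent = None
        self.children = []
        self.indexed = None      # last value written to the imaginary index

    def add(self, child):
        child.parent = self
        self.children.append(child)
        return child

    def effective_roles(self):
        """Own roles plus, if inheriting, the parent's effective roles."""
        roles = set(self.local_roles)
        if self.inherit and self.parent is not None:
            roles |= self.parent.effective_roles()
        return frozenset(roles)


def reindex_security(node, visited):
    """Reindex node; descend only if its indexed value actually changed."""
    visited.append(node.name)
    new_value = node.effective_roles()
    if new_value == node.indexed:
        return  # unchanged here, so nothing below can have changed either
    node.indexed = new_value
    for child in node.children:
        reindex_security(child, visited)


# Build a small tree: "private" blocks inheritance, so changes above it
# cannot reach "doc".
root = Node("root", {"Reader:alice"})
folder = root.add(Node("folder"))
private = folder.add(Node("private", {"Editor:bob"}, inherit=False))
doc = private.add(Node("doc"))
sibling = folder.add(Node("sibling"))

reindex_security(root, [])  # initial full indexing

# Change local roles on the root and reindex again.
root.local_roles = frozenset({"Reader:alice", "Reader:carol"})
visited = []
reindex_security(root, visited)
print(visited)  # "doc" is never visited
```

In this sketch the second run stops at `private`, because its computed value equals what is already indexed, and `doc` keeps a correct index entry without being touched.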


In the case of workflow changes, reindexObjectSecurity() still needs to be called for every object that is directly affected by the change:

  • object switched to a different WF
  • object switched to a different WF state
  • changes in the WF state's security settings
  • object moved to/out of a placeful WF

Identifying these objects is usually easy, and you can't and must not rely on reindexObjectSecurity's recursion to pick them up. Instead, they can be determined by querying for the relevant criteria, e.g. objects that have a particular WF.

Recursion will then take care of updating security indexes for affected subobjects that don't meet the criteria for being directly affected (like subobjects with a different workflow).
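The workflow case can be sketched with a toy model as well. Everything here is hypothetical and only illustrates the reasoning: `Item`, `WF_SECURITY`, `allowed_roles()` and `all_items()` are made-up names, and walking the tree in `all_items()` stands in for what would be a catalog query in a real site. The sketch shows that recursion from the top alone misses a directly affected object sitting below an unchanged one, while querying for the directly affected objects and reindexing each of them fixes everything, including a subobject with a different workflow.

```python
class Item:
    """Toy content object (hypothetical, not a Plone class)."""

    def __init__(self, name, workflow, parent=None):
        self.name = name
        self.workflow = workflow
        self.parent = parent
        self.children = []
        self.indexed = None
        if parent is not None:
            parent.children.append(self)


# Hypothetical WF security settings: workflow -> (roles, acquire from parent?)
WF_SECURITY = {
    "open_wf": ({"Anonymous"}, False),
    "closed_wf": ({"Manager"}, False),
    "inherit_wf": (set(), True),
}


def allowed_roles(item):
    roles, acquire = WF_SECURITY[item.workflow]
    roles = set(roles)
    if acquire and item.parent is not None:
        roles |= allowed_roles(item.parent)
    return frozenset(roles)


def reindex_security(item, log):
    """Reindex item; stop descending if its indexed value is unchanged."""
    log.append(item.name)
    new_value = allowed_roles(item)
    if new_value == item.indexed:
        return
    item.indexed = new_value
    for child in item.children:
        reindex_security(child, log)


def all_items(item):
    """Stand-in for a catalog query: yields every item in the tree."""
    yield item
    for child in item.children:
        yield from all_items(child)


site = Item("site", "open_wf")
archive = Item("archive", "closed_wf", site)
dossier = Item("dossier", "open_wf", archive)
note = Item("note", "inherit_wf", dossier)
reindex_security(site, [])  # initial full indexing

# The security settings of open_wf change.
WF_SECURITY["open_wf"] = ({"Member"}, False)

# Relying on recursion from the top is not enough: it stops at "archive".
reindex_security(site, [])
stale_after_root_only = dossier.indexed != allowed_roles(dossier)

# Instead, query for the directly affected objects and reindex each one.
log = []
for item in [i for i in all_items(site) if i.workflow == "open_wf"]:
    reindex_security(item, log)
```

Here `note` has a different workflow and is not matched by the query, yet it is updated by the recursion that starts at `dossier`.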

See 4teamwork/opengever.core#4759 for some more background.

buchi commented 5 years ago

@lukasgraf can you rebase HISTORY.txt?

lukasgraf commented 5 years ago

@buchi rebased - I proposed 2.3.0 for the upcoming version, because it's a rather significant change. But we could also release your change on master as a separate version, and I'll rebase on that again. I don't mind either way.