zopefoundation / Products.CMFCore

Key framework services for the Zope Content Management Framework (CMF)
Other
6 stars 19 forks source link

Expensive reindex when local roles are deleted #140

Open ewohnlich opened 3 weeks ago

ewohnlich commented 3 weeks ago

https://github.com/zopefoundation/Products.CMFCore/blob/master/src/Products/CMFCore/MembershipTool.py#L457 This method handles local role removal in a variety of cases but I am probably interested in the process of deleting a user. CMFPlone.controlpanel.browser.usergroups_usersoverview calls this method with recursive=True. So what appears to be happening is that the entire site is traversed when a user is deleted and each of those objects calls reindexObjectSecurity. This happens even if the user has no local roles.

What I did:

Deleted a user with no local roles

What I expect to happen:

Receiver a response in a reasonable amount of time.

What actually happened:

Response took a VERY long time to be received.

What version of Python and Zope/Addons I am using:

None

Proposal

  1. I think the reindex section can be moved into this condition https://github.com/zopefoundation/Products.CMFCore/blob/master/src/Products/CMFCore/MembershipTool.py#L463 so that it only happens if there are actually local roles to delete. This saved a significant amount of time in my testing.
  2. The above still requires traversing and waking up every object on the site. Even better would be to make a catalog query to get just the content with local rules for the user. I am not sure if this is currently possible. And although the catalog is there for Plone, I'm not sure if CMFCore can assume such a catalog exists.
ewohnlich commented 3 weeks ago

It is slightly more complicated than I first understood. There is actually only one reindex call (and it's for the portal root), because the recursive calls to itself set reindex=0. However, reindexing on the portal root is still a very expensive process since reindexObjectSecurity is itself recursive. We still don't need to do call this in a lot of cases.

I think something like this may work, with my changes noted in comments

       if _checkPermission(ChangeLocalRoles, obj):
            for member_id in member_ids:
                if obj.get_local_roles_for_userid(userid=member_id):
                    obj.manage_delLocalRoles(userids=member_ids)
                    # change - move the reindex under this condition
                    if reindex and hasattr(aq_base(obj), 'reindexObjectSecurity'):
                        # reindexObjectSecurity is always recursive
                        obj.reindexObjectSecurity()
                    break

        if recursive and hasattr(aq_base(obj), 'contentValues'):
            for subobj in obj.contentValues():
                # change - keep reindex value
                self.deleteLocalRoles(subobj, member_ids, reindex, 1)
d-maurer commented 3 weeks ago
  1. I think the reindex section can be moved into this condition https://github.com/zopefoundation/Products.CMFCore/blob/master/src/Products/CMFCore/MembershipTool.py#L463 so that it only happens if there are actually local roles to delete. This saved a significant amount of time in my testing.

Be very careful! Local roles affect not only the object where they are assigned but also all descendants. This implies that deleting a local role may require reindexObjectSecurity on all descendants. Not doing this can lead to inconsistencies in the the security related indexes (usually allowed_roles_and_users).

In the use case of user deletion, the inconsistency will not have an immediate effect. But if a user should in the future be created with the same id (as the one deleted), then his searches may find objects he cannot access.

For the reason above, I am against calling reindexObjectSecurity only if local roles are assigned at this object.

Should your application require something like this, I suggest to implement your own MemberShipTool which handle local role deletion in a way acceptable for your concrete situation.

ewohnlich commented 3 weeks ago

But shouldn't that be accounted for in my proposal? It would run a reindex on any object that has local roles and all of its descendants. This is due to reindexObjectSecurity always being recursive - if you call only portal.reindexObjectSecurity() it will actually update the security indexes of brain in the catalog.

d-maurer commented 3 weeks ago

Eric Wohnlich wrote at 2024-6-18 08:34 -0700:

But shouldn't that be accounted for in my proposal? It would run a reindex on any object that has local roles and all of its descendants. This is due to reindexObjectSecurity always being recursive - if you call portal.reindexObjectSecurity it will actually update the security indexes of everything in the catalog.

Maybe. Your proposal so far has not yet been precise.

When I remember right, then reindexObjectSecurity is itself recursive. If we call it on obj (after all local roles on obj and its descendants have been updated) we do not need to to call it on descendants.

This suggests the following approach:

If a user has created a content object, then he will be listed in its "Owner" local role setting. Thus, the approach will still reindexObjectSecurity on many unnecessary objects.

We can refine the approach by passing a reindex_required down into the tree traversal. reindex_required is set at an object, if local roles have to be updated at the object. When we go up, we call reindexObjectSecurity when the local reindex_required is True and the passed in reindex_required has been false -- this is a top level object where lcal role changes may affect the object and its descendants.