hypothesis / h

Annotate with anyone, anywhere.
https://hypothes.is/
BSD 2-Clause "Simplified" License
2.96k stars 427 forks source link

Base security predicates on users memberships #9074

Closed seanh closed 2 days ago

seanh commented 1 week ago

Problem: four groups-related security predicate functions are triggering DB queries that unnecessarily fetch all of a group's memberships and members from the DB. For example:

@requires(authenticated_user, group_found)
def group_has_user_as_owner(identity, context):
    return any(
        owner.id == identity.user.id
        for owner in context.group.get_members(GroupMembershipRoles.OWNER)
    )

The call to context.group.get_members() (models.Group.get_members()) will iterate over the models.Group.memberships relationship which will load all of the group's memberships from the DB. Because of eager-loading on the memberships relationship this will also load all of the group's members (i.e. the models.User objects) from the DB as well, though it will at least avoid doing a separate query for each member.

The ideal solution would be to do a smaller DB query to check whether a given user is an owner of a given group without loading all the group's members or memberships, something like this:

select(GroupMembership)
.where(GroupMembership.group == group)
.where(GroupMembership.user == user)
.where(GroupMembership.roles.contains([GroupMembershipRoles.OWNER])

But this can't easily be done because neither Group.get_members() or the security predicate functions that call it have access to the DB session. I've tried a few different ways of refactoring the code to make it possible to do a query like the above but ran into difficulties, see: https://github.com/hypothesis/h/pull/9064#discussion_r1838434001

So this commit implements an alternative approach: instead of iterating over the group's memberships iterate over the user's memberships looking for one with a matching group and role. This should be much more efficient because users typically have far fewer memberships than groups do, see: https://hypothes-is.slack.com/archives/C4K6M7P5E/p1731507180110099?thread_ts=1729159093.383279&cid=C4K6M7P5E

The LongLivedUser.groups field is removed and replaced with LongLivedUser.memberships because predicates need to make decisions based on a user's roles and groups don't have roles, memberships do.

LongLivedUser.from_model(user) has to be changed to create a LongLivedUser with a list of LongLivedMembership's based on user.groups which is a little more complicated so I also extended the unittest for this method.

The group_has_user_as_owner(), group_has_user_as_admin(), group_has_user_as_moderator() and group_has_user_as_member() security predicates are then rewritten based on identity.user.memberships (the new LongLivedUser.memberships attribute) instead of calling Group.get_members().

The Group.get_members() method is removed because it's no longer used.

Testing