hypothesis / product-backlog

Where new feature ideas and current bugs for the Hypothesis product live
118 stars 7 forks source link

API search should only return readable annotations #95

Closed chdorner closed 7 years ago

chdorner commented 7 years ago

This is part of the larger work (#92) which tracks the complete read permission implementation.

This part is to make sure that we return only readable annotations when searching.


Memex currently adds an auth filter on every query, this is implemented by adding a terms filter on the permissions.read field with the effective principals of the current request (and group:__world__). See here for the code.

We could continue doing exactly that and making sure that the read permissions in the index are set correctly. But this means that if the group's readable_by value changes, we would need to reindex all of the annotations. Which is not a feasable option.

As described in #92, the access control of an annotation is split up into two parts, the access control of the annotation itself with the value of the shared field, and the access control of the group with its readable_by value. These two parts live in different parts of the code, the annotation's access control lives in memex, while the groups access control is desine in h. We can use this way of thinking as the solution to this problem.

If we change the AuthFilter inside memex to only filter out private annotations that are not owned by the current user, we handle the annotation-part of the access control inside memex. This means that we will have to add the shared field to the search index.

And to apply the group access control to search queries we can hook a search filter into memex (as we already do for nipsa), which will add a terms filter on the group field and filter by all groupids the user is allowed to see. This list contains all the groups the user is a member off, as well as every group in the database that has readable_by set to world. A word of caution here is that this will probably work until we have a significant number of publisher groups in our database (possibly a few hundred, or thousands).

With these two changes we are only returning readable annotations from ElasticSearch back to Python and then back to the client. In addition, we can remove the permission field from the search index, because queries only use the (new) shared field, and the existing group field.


Done when:

chdorner commented 7 years ago

fyi @ajpeddakotla, I'm pausing this to get #106 done, because we will have to reindex all annotations at least once to get this done.

chdorner commented 7 years ago

This has been long done and deployed, closing the issue.