Open klemay opened 5 years ago
Some notes on this:
We don't have access to the request
object proper deep down in the streamer
app, but I believe we have some options here. As noted in https://github.com/hypothesis/h/issues/5486, there are spots in the streamer
code that are already instantiating services directly (versus using the request.find_service
approach). The vast majority of services don't need the whole request
object. Most of them need the db session for access to the DB. We do have access to the session
in streamer
.
Unfortunately, the AnnotationJSONPresentation
service takes a boatload of init
params: https://github.com/hypothesis/h/blob/c4cd4025fa346605e84bbead7618e1613eb5d96d/h/services/annotation_json_presentation.py#L16 , each of which will need to be addressed.
I've walked through the services needed by AnnotationJSONPresentation
, and most of them can be instantiated within the streamer
code using the session
. The most challenging exception is the request.has_permission
method, which gets bandied around here. That one will take some more sleuthing.
My first-swag approach here would be along these prongs:
AnnotationJSONPresentation
service and the various formatters
used to render annotations. This part of the app is deeply designed and may benefit from some simplification. The fact that the AnnotationJSONPresentation
service takes so many init args is a little smelly.AnnotationJSONPresentation
, the most immediately significant one being has_permission
. Another spot is the links
service, which requires some path/URL stuff off of request
. An option would be not to return links
with the realtime response, though really that's a symptom of the same problem we're trying to solve here overall...streamer
's annotation responses to use AnnotationJSONPresentation
in a similar manner to h.views.api.annotations
This will require some coordination with client code as the response would change—however, I believe these changes would be additive to the current realtime response, as at the end of the day, both the API and the streamer are using the same presenter—it's just that the API's version viaa AnnotationJSONPresentation
provides more formatters
to the presenter and, as such, more JSON data is glommed on.
It's possible to do this, but it's not possible to do it well and quickly for a few reasons:
I have a path for fixing all of this, but it's not small, and I wanted to find out what the priority of all that is. Seeing as this is a spike, I think that's all that's required.
My suggestions for fixing this (as refactoring steps, not necessarily as PRs):
formatters
and the presenter
and the service
into a single service directory (they are all private to it)user is None
check in the flagged service is_moderator
check instead of doing it two different waysThe final call in websocket could then look something like this:
# Precache what you have to here:
presenter = presentation_service.get_presenter_for(annotations=[annotation], users=...)
# Iterate quickly here
for user in users:
serial = presenter.present(annotation, user)
You could add convenience methods for other common cases.
create()
call outside itExample PR: https://github.com/hypothesis/h/pull/6316
This works, but it's slower than it was before we started the optimisation work (i.e. this + optimisation = slower than before). I don't know if this will create a problem, as we rely on the fact that relatively few sockets are interested in any given message. So you might get away with it. You might also get really badly killed if we got sufficient users on a single page (a large active course or popular website). This is relatively unlikely I'd say, as any rate of annotations which might cause us trouble would probably too much for people to look at, but who knows?
If we are expecting 10x or 100x the people, this is going to be a concern.
History & notes can be found here: https://github.com/hypothesis/h/issues/5486