suspicious systemic caching inefficiency related to user web searches?

rahulbot commented 2 months ago

We need to start paying closer attention to the performance of our search system. The first thing I noticed: I think (1) total attention, (2) attention over time, (3) language, (4) domains, (5) TLDs, and (6) sample stories are all currently served by the same news-search-api endpoint under the hood, because each of them ends up calling the overview query endpoint. Evidence: see the news-search-api source and the number of times _overview_query is called in the mediacloud-search-api client.

We are caching the results in Django, but when a user hits search I think the browser fires off ~6(!) requests in parallel (browser->Django->ES) that all ask for those overview results at the same time, and nothing is cached yet the first time they search. I think this means each user-generated query from the website causes far more work than it needs to.

Potential fixes (if I'm right):

Combined requests: Change the JS widgets to make one async call and use the results to render multiple widgets. However, I don't think we want to do this because it kind of breaks our underlying cross-platform search model (if we ever get back to that).
Progressive load: Change JS widgets to make first attention-over-time call and then start the others once results come back (and are cached). This might also improve user experience by returning one thing quickly.
API refactor: I don't think there is a strong reason news-search-api combines all those into one call (just a historical decision), so we could refactor that to make different calls that don't overlap.
other ideas?
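
To make the cold-cache problem and the "progressive load" option concrete, here is a minimal sketch in Python (the real widgets are JS, so this only models the request ordering). Everything in it is hypothetical: OverviewBackend, the widget names, and the in-memory dict standing in for the Django cache are illustrations, not the actual web-search or news-search-api code.

```python
import asyncio

# Hypothetical widget names; each one needs the same overview result.
WIDGETS = ["total", "over_time", "languages", "domains", "tlds", "sample"]


class OverviewBackend:
    """Stand-in for the cached overview query (not the real API)."""

    def __init__(self):
        self.calls = 0   # how many times the expensive query actually ran
        self.cache = {}  # plays the role of the Django cache

    async def overview(self, query):
        if query in self.cache:
            return self.cache[query]
        self.calls += 1
        await asyncio.sleep(0.01)  # simulated round trip to ES
        result = {"query": query}  # placeholder payload
        self.cache[query] = result
        return result


async def load_parallel(backend, query):
    # All widgets fire at once: none of them finds a cached result.
    await asyncio.gather(*(backend.overview(query) for _ in WIDGETS))
    return backend.calls


async def load_progressive(backend, query):
    # First widget warms the cache, then the rest start and hit it.
    await backend.overview(query)
    await asyncio.gather(*(backend.overview(query) for _ in WIDGETS[1:]))
    return backend.calls


parallel_calls = asyncio.run(load_parallel(OverviewBackend(), "climate"))
progressive_calls = asyncio.run(load_progressive(OverviewBackend(), "climate"))
print(parallel_calls, progressive_calls)
```

In this toy model the parallel load runs the expensive query once per widget, while the progressive load runs it once in total, at the cost of delaying the other five widgets until the first result is back.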

rahulbot commented 2 months ago

In short, this fix is necessary, but not sufficient.

More detail: I dug into the fix and understand why it isn't working. Right now the caching is done by mc-providers, keyed on function name, method args, and method kwargs. This is smart for cross-platform search. However, for both the Media Cloud and Wayback Machine providers, the various methods call the same function under the hood, so the caching isn't speeding things up: the provider doesn't know that (for instance) count and sample both call the same thing. I'll consider alternatives and move the issue to mc-providers.
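
A rough Python sketch of the mismatch described above, under stated assumptions: the Provider class, its _cached helper, and the placeholder payload are all hypothetical and are not the mc-providers implementation. It compares caching keyed on the public method name with caching the shared underlying call, which is one possible alternative and not necessarily the one that was adopted.

```python
class Provider:
    """Hypothetical provider whose count() and sample() share one backend call."""

    def __init__(self, cache_shared_call):
        self.backend_calls = 0
        self.store = {}
        self.cache_shared_call = cache_shared_call

    def _cached(self, name, fn, *args):
        # Cache key built from the function name plus its arguments.
        key = (name, args)
        if key not in self.store:
            self.store[key] = fn(*args)
        return self.store[key]

    def _overview_query(self, query):
        # The expensive call that both public methods rely on.
        self.backend_calls += 1
        return {"total": 3, "matches": ["a", "b", "c"]}  # placeholder data

    def _overview(self, query):
        if self.cache_shared_call:
            # Alternative: cache at the level of the shared call.
            return self._cached("_overview_query", self._overview_query, query)
        return self._overview_query(query)

    def count(self, query):
        return self._cached("count", lambda q: self._overview(q)["total"], query)

    def sample(self, query):
        return self._cached("sample", lambda q: self._overview(q)["matches"], query)


method_level = Provider(cache_shared_call=False)
method_level.count("climate")
method_level.sample("climate")

shared_level = Provider(cache_shared_call=True)
shared_level.count("climate")
shared_level.sample("climate")

print(method_level.backend_calls, shared_level.backend_calls)
```

With method-level keys, count and sample each miss the cache and each trigger the backend call. Caching the shared call lets the second method reuse the first one's result.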

rahulbot commented 2 months ago

Just-pushed changes (cache-related) make this way faster for most queries.

mediacloud / web-search

suspicious systemic caching inefficiency related to user web searches? #645