tfnribeiro closed this 1 week ago
Doing everything in the DB and also adding indexes if needed should definitely be the way to go :)
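As a sketch of the index idea: if the query filters by user and orders by a priority score, a composite index covering both columns would let the database serve the `ORDER BY ... LIMIT` without a full scan. The model and column names below (`Bookmark`, `user_id`, `priority`) are assumptions for illustration, not the actual schema:

```python
from sqlalchemy import create_engine, Column, Integer, Float, Index, inspect
from sqlalchemy.orm import declarative_base

Base = declarative_base()

class Bookmark(Base):
    __tablename__ = "bookmark"
    id = Column(Integer, primary_key=True)
    user_id = Column(Integer, nullable=False)   # hypothetical column
    priority = Column(Float, nullable=False)    # hypothetical column

# Composite index: equality filter column first (user_id),
# then the sort column (priority).
Index("ix_bookmark_user_priority", Bookmark.user_id, Bookmark.priority)

engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

# Confirm the index was created alongside the table.
names = [ix["name"] for ix in inspect(engine).get_indexes("bookmark")]
print(names)
```

Whether this particular index helps would of course depend on the real filter and sort columns in the query.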
Yes, I will look into it - I will try to write the query in SQLAlchemy and see how it improves the query speed.
I have worked on this further and have reduced the query time from:
[Thu Oct 24 11:05:16.150729 2024] [wsgi:error] [pid 9:tid 140594214651584] [remote 172.19.0.1:42276] ### INFO: all_bookmarks_priority_to_study
took: 14.7286 seconds, total: 1802
to:
[Thu Oct 24 11:08:21.888770 2024] [wsgi:error] [pid 9:tid 139678095840960] [remote 172.19.0.1:47454] ### INFO: all_bookmarks_priority_to_study
took: 1.7743 seconds, total: 131
I did this by limiting the number of queried bookmarks to the count requested from the endpoint, so at most it can be limit * 2 (because we fetch both the top scheduled and the top unscheduled words), and then doing a final sort.
This seems enough for our purposes, I'd say - what do you think?
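A minimal sketch of the approach described above, assuming a simplified `Bookmark` model with `scheduled` and `priority` columns (the real schema and priority computation will differ): each branch is limited and ordered in the database, so at most `limit * 2` rows ever reach Python, and only the final merge-sort happens in application code.

```python
from sqlalchemy import create_engine, Column, Integer, Boolean, Float
from sqlalchemy.orm import declarative_base, Session

Base = declarative_base()

class Bookmark(Base):
    __tablename__ = "bookmark"
    id = Column(Integer, primary_key=True)
    scheduled = Column(Boolean, nullable=False)  # hypothetical column
    priority = Column(Float, nullable=False)     # hypothetical column

def top_bookmarks_to_study(session, limit):
    # Top `limit` scheduled bookmarks, ordered and limited in the DB.
    scheduled = (
        session.query(Bookmark)
        .filter(Bookmark.scheduled == True)
        .order_by(Bookmark.priority.desc())
        .limit(limit)
        .all()
    )
    # Top `limit` unscheduled bookmarks, same treatment.
    unscheduled = (
        session.query(Bookmark)
        .filter(Bookmark.scheduled == False)
        .order_by(Bookmark.priority.desc())
        .limit(limit)
        .all()
    )
    # Final sort over at most limit * 2 rows in Python.
    merged = sorted(scheduled + unscheduled, key=lambda b: b.priority, reverse=True)
    return merged[:limit]

engine = create_engine("sqlite://")
Base.metadata.create_all(engine)
with Session(engine) as s:
    s.add_all([Bookmark(scheduled=(i % 2 == 0), priority=float(i)) for i in range(10)])
    s.commit()
    top = [b.priority for b in top_bookmarks_to_study(s, 3)]
    print(top)
```

The point is that the expensive part (scanning and ranking all bookmarks) stays in the database, and Python only ever sorts a bounded, small list.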
One order of magnitude improvement - nobody can complain about that :)
When we changed the scheduled bookmarks, I didn't look too much into the performance of
top_bookmarks_to_study
but I was testing with a user with about 1900 total bookmarks, and it takes around 30 seconds for us to render the first exercise in the web, which made me wonder how many "fit to study" bookmarks we are working with. It does seem that our average user has 105 bookmarks, but there are quite a few users (> 300) that have more than 200 bookmarks.
This makes me think that it might be better to limit and order the query already in the database, rather than doing it in Python. I would expect that to improve the overall query speed.