WikiEducationFoundation / WikiEduDashboard

Wiki Education Foundation's Wikipedia course dashboard system
https://dashboard.wikiedu.org
MIT License

[Data rearchitecture] Try to speed up article course cache updates #6037

Closed gabina closed 6 days ago

gabina commented 6 days ago

What this PR does

After running some course updates on the data-rearchitecture instance for a heavy course (Pharmacology_at_Bar_Ilan_University_-_2024), I noticed that updating the caches for every article course takes ~40 minutes. According to production, the Pharmacology at Bar Ilan University - 2024 course has 63.4K articles courses. With that many records, it's understandable that the ArticlesCourses.update_all_caches_from_timeslices(@course.articles_courses) step takes a long time: updating the caches for a single article course is not a heavy task, but we need to retrieve all of its timeslices, so 63.4K articles courses means (at least) 63.4K queries.
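To make the cost concrete, here is a rough, self-contained sketch (not code from this PR or from the dashboard) that contrasts one lookup per article course with lookups done in batches. `FakeTimesliceStore`, the `Timeslice` struct, its `character_sum` field, and both helper methods are hypothetical stand-ins for illustration; the store only counts how many "queries" it receives.

```ruby
require 'set'

# Hypothetical stand-in for a timeslice row; field names are assumptions.
Timeslice = Struct.new(:article_id, :character_sum)

# Hypothetical in-memory "table" that counts how many lookups it serves.
class FakeTimesliceStore
  attr_reader :query_count

  def initialize(rows)
    @rows = rows
    @query_count = 0
  end

  # One lookup per article course (the pattern described above).
  def for_article(article_id)
    @query_count += 1
    @rows.select { |row| row.article_id == article_id }
  end

  # One lookup for a whole batch of article ids.
  def for_articles(article_ids)
    @query_count += 1
    wanted = article_ids.to_set
    @rows.select { |row| wanted.include?(row.article_id) }
  end
end

# Issues as many lookups as there are article ids.
def character_sums_one_by_one(store, article_ids)
  article_ids.to_h { |id| [id, store.for_article(id).sum(&:character_sum)] }
end

# Issues one lookup per batch, then aggregates the rows in memory.
def character_sums_batched(store, article_ids, batch_size: 1000)
  sums = Hash.new(0)
  article_ids.each_slice(batch_size) do |batch|
    store.for_articles(batch).each { |row| sums[row.article_id] += row.character_sum }
  end
  article_ids.to_h { |id| [id, sums[id]] }
end

# Usage: five articles with two timeslices each.
rows = (1..5).flat_map { |id| [Timeslice.new(id, 10), Timeslice.new(id, 5)] }
ids = (1..5).to_a
one_by_one_store = FakeTimesliceStore.new(rows)
batched_store = FakeTimesliceStore.new(rows)
one_by_one_result = character_sums_one_by_one(one_by_one_store, ids)
batched_result = character_sums_batched(batched_store, ids, batch_size: 2)
```

Both helpers return the same totals, but the first issues one lookup per article id while the second issues one per batch, so the number of round trips scales with the number of batches rather than with the number of articles courses.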

To try to speed up the cache update for articles courses, I'm trying the following strategy:

Open questions and concerns

As I'm not entirely sure this is going to work, I commented out some lines of code instead of removing them. After testing this on my instance with heavy courses, I'll clean up the code.