Open ConnorNelson opened 1 year ago
We have achieved a more than 2x speedup with 8b7920f23cfe3a890adcf6ad7d6817fa0e57e112 and d2e0cd627b55adae209c99fe3033975023ae0ea9.
In [51]: dojo = Dojos.from_id("cse365-s2023").first()
...: %time solves(None, dojo=dojo).count()
...: %time solves_old(None, dojo=dojo).count()
CPU times: user 10.6 ms, sys: 3.95 ms, total: 14.6 ms
Wall time: 1.2 s
CPU times: user 10 ms, sys: 0 ns, total: 10 ms
Wall time: 2.75 s
Unfortunately, this has also resulting in a <5% degradation in performance to computing solve times for Dojos which do not have visibility requirements.
I would still like to see this be improved. Ideally (and I think reasonably?) we should get solve queries to <100ms in the worst case.
For computing the profile, we might need to refactor our solves calculations to not perform a solves check on each dojo individually and instead compute it all together. However, this might be tricky since we are getting ranking information, which means we can't just filter for the user; we need to compute everyone's scores and find the user's rank.
SELECT SQL_NO_CACHE COUNT(*)
FROM submissions INNER JOIN solves ON submissions.id = solves.id INNER JOIN dojo_challenges ON dojo_challenges.challenge_id = solves.challenge_id LEFT OUTER JOIN dojo_users ON dojo_users.user_id = solves.user_id AND dojo_users.dojo_id = dojo_challenges.dojo_id AND dojo_users.type != "admin" INNER JOIN dojos ON dojos.dojo_id = dojo_challenges.dojo_id AND (dojos.official = 1 OR dojo_users.user_id IS NOT NULL) LEFT OUTER JOIN dojo_challenge_visibilities ON dojo_challenge_visibilities.dojo_id = dojo_challenges.dojo_id AND dojo_challenge_visibilities.module_index = dojo_challenges.module_index AND dojo_challenge_visibilities.challenge_index = dojo_challenges.challenge_index
WHERE 62725971 = dojo_challenges.dojo_id;
1 row in set (0.869 sec)
SELECT SQL_NO_CACHE COUNT(*)
FROM solves INNER JOIN dojo_challenges ON dojo_challenges.challenge_id = solves.challenge_id LEFT OUTER JOIN dojo_users ON dojo_users.user_id = solves.user_id AND dojo_users.dojo_id = dojo_challenges.dojo_id AND dojo_users.type != "admin" INNER JOIN dojos ON dojos.dojo_id = dojo_challenges.dojo_id AND (dojos.official = 1 OR dojo_users.user_id IS NOT NULL) LEFT OUTER JOIN dojo_challenge_visibilities ON dojo_challenge_visibilities.dojo_id = dojo_challenges.dojo_id AND dojo_challenge_visibilities.module_index = dojo_challenges.module_index AND dojo_challenge_visibilities.challenge_index = dojo_challenges.challenge_index
WHERE 62725971 = dojo_challenges.dojo_id;
1 row in set (0.482 sec)
The former joins in the submissions table, while the latter does not. We only need the submissions table to determine when the challenge was solved.
Unfortunately, we cannot just include submissions
instead of solves
:
SELECT SQL_NO_CACHE COUNT(*)
FROM submissions INNER JOIN dojo_challenges ON dojo_challenges.challenge_id = submissions.challenge_id LEFT OUTER JOIN dojo_users ON dojo_users.user_id = submissions.user_id AND dojo_users.dojo_id = dojo_challenges.dojo_id AND dojo_users.type != "admin" INNER JOIN dojos ON dojos.dojo_id = dojo_challenges.dojo_id AND (dojos.official = 1 OR dojo_users.user_id IS NOT NULL) LEFT OUTER JOIN dojo_challenge_visibilities ON dojo_challenge_visibilities.dojo_id = dojo_challenges.dojo_id AND dojo_challenge_visibilities.module_index = dojo_challenges.module_index AND dojo_challenge_visibilities.challenge_index = dojo_challenges.challenge_index
WHERE 62725971 = dojo_challenges.dojo_id AND submissions.type = "correct";
1 row in set (0.943 sec)
Probably, we are missing some indexes somewhere. This has the net effect of profiles timing out, and is probably also the source of why stats were slow and needed to be disabled.