btbonval closed this 9 years ago
All MIT courses with a `department_id` also have a `school_id`, although the two were meant to be mutually exclusive since `school_id` is on its way out. In fact, no MIT courses are registered without a department. This is probably the source of the double counting.
For MIT, the `file_count` values add up to 3,867 files across all courses. This is almost exactly half of the 7,772 reported on the scoreboard. Unsure why it is not exactly half.
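For reference, a quick check of how far off "half" is, using only the two numbers quoted above. The variable names are made up for illustration and nothing here comes from the karmaworld code.

```python
# Figures quoted in this thread; names are illustrative only.
course_total = 3867        # sum of file_count over all MIT courses
scoreboard_total = 7772    # MIT's count as shown on the scoreboard

doubled = 2 * course_total             # what exact double counting would give
excess = scoreboard_total - doubled    # amount not explained by doubling

print(doubled, excess)
```

So pure doubling would account for 7,734 of the 7,772, leaving 38 files unexplained.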
Just for confirmation: the cached `file_count` on course is being used, but that is updated just before school counts are updated. So this looks okay.
https://github.com/FinalsClub/karmaworld/blob/master/karmaworld/apps/courses/tasks.py#L15-L17
https://github.com/FinalsClub/karmaworld/blob/master/karmaworld/apps/courses/models.py#L314-L317
Need to iterate over a union of courses pulled from school and department, not over a list. That should remove duplicates and correct the school cache counts.
Changing to a set removed the duplicates and successfully reduced MIT's scoreboard count on my virtual machine. Adding up all values on the scoreboard yielded the exact total of notes on the front page, so the numbers are now consistent. Looks like this is the correct fix.
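A minimal sketch of the list-versus-set difference, in plain Python rather than the actual Django models. `Course`, `count_files_list`, `count_files_union` and the sample data are all hypothetical stand-ins, not the code in `models.py`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)  # frozen makes instances hashable, so they can go in a set
class Course:
    # Hypothetical stand-in for the real Course model.
    id: int
    file_count: int


def count_files_list(school_courses, department_courses):
    # Buggy variant: a course reachable both ways is summed twice.
    combined = list(school_courses) + list(department_courses)
    return sum(course.file_count for course in combined)


def count_files_union(school_courses, department_courses):
    # Fixed variant: the set union keeps each course exactly once.
    combined = set(school_courses) | set(department_courses)
    return sum(course.file_count for course in combined)


# Made-up data: courses 1 and 2 are reachable via both school and department.
a, b, c = Course(1, 10), Course(2, 5), Course(3, 7)
via_school = [a, b, c]
via_department = [a, b]

inflated = count_files_list(via_school, via_department)
correct = count_files_union(via_school, via_department)
print(inflated, correct)
```

With real querysets the same idea would presumably rely on model instances comparing equal by primary key, which is an assumption here rather than something checked against the repo.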
On production, there are 5,024 notes in the database, but the scoreboard says MIT alone has 7,000-some.
The scoreboard makes use of cached counts (which could be a problem). https://github.com/FinalsClub/karmaworld/blob/master/karmaworld/apps/courses/views.py#L133-L134
Update note count is called as a task, which runs this code for the school note counts: https://github.com/FinalsClub/karmaworld/blob/note-editing-merge-more/karmaworld/apps/courses/models.py#L91-L102
The caches are updated once every 24 hours: https://github.com/FinalsClub/karmaworld/blob/master/karmaworld/settings/prod.py#L96-L99
Best guess is that the cache updater is double counting notes, which might happen from separately counting notes for schools and notes for departments.