zeeguu / api

API for tracking a learner's progress when reading materials in a foreign language and recommending further personalized exercises and readings.
https://zeeguu.org
MIT License

Script for recomputing FK scores for languages with new constants #250

Closed gustavkrist closed 1 month ago

gustavkrist commented 1 month ago

I've added a script to recompute FK scores for all the languages that just had new constants added, similar to https://github.com/zeeguu/api/blob/master/tools/old/recompute_fk_difficulties_for_polish.py.
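For readers unfamiliar with the term, here is a minimal sketch of what a per-language readability score could look like, assuming "FK" refers to the Flesch family of formulas. The English constants below are the published Flesch Reading Ease values; the `FleschConstants` class and the function name are hypothetical and not taken from the Zeeguu code base, and other languages would need their own constants.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FleschConstants:
    """Per-language constants (hypothetical container, for illustration only)."""
    base: float
    sentence_weight: float  # multiplies the average words per sentence
    syllable_weight: float  # multiplies the average syllables per word


# Published constants for the English Flesch Reading Ease formula.
ENGLISH = FleschConstants(base=206.835, sentence_weight=1.015, syllable_weight=84.6)


def flesch_reading_ease(words: int, sentences: int, syllables: int,
                        constants: FleschConstants = ENGLISH) -> float:
    """Return a reading-ease score; higher means easier text."""
    if words <= 0 or sentences <= 0:
        raise ValueError("words and sentences must be positive")
    return (constants.base
            - constants.sentence_weight * (words / sentences)
            - constants.syllable_weight * (syllables / words))
```

Changing the constants for a language changes every stored score for that language, which is why a one-off recomputation script is needed.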

mircealungu commented 1 month ago

Looks good to me. @tfnribeiro, can you also have a quick look?

tfnribeiro commented 1 month ago

I also think it's good!

I would just add a checkpoint, so that we commit every 1000 articles or so and don't end up with one really large transaction. One more thing: it might be better not to print every article, as that will be a lot of prints. Maybe printing can sit behind a flag, so that when we are only updating a few articles we can still see how they are changing.
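The suggestion can be sketched roughly as follows. This is only an illustration using the standard-library `sqlite3` module rather than the project's actual database layer; the `article` table, its columns, the constant names, and the `compute_score` callback are all assumptions made for the example.

```python
import sqlite3
from typing import Callable

CHECKPOINT_STEP = 1000       # hypothetical: commit after this many articles
PRINT_EACH_ARTICLE = False   # hypothetical: set to True to see every change


def recompute_all(conn: sqlite3.Connection,
                  compute_score: Callable[[str], float],
                  step: int = CHECKPOINT_STEP,
                  verbose: bool = PRINT_EACH_ARTICLE) -> int:
    """Recompute every score, committing every `step` rows.

    Returns the number of commits issued (checkpoints plus the final one).
    """
    # Read the ids up front so no cursor stays open across commits.
    ids = [row[0] for row in conn.execute("SELECT id FROM article ORDER BY id")]
    commits = 0
    for count, article_id in enumerate(ids, start=1):
        (content,) = conn.execute(
            "SELECT content FROM article WHERE id = ?", (article_id,)
        ).fetchone()
        new_score = compute_score(content)
        conn.execute(
            "UPDATE article SET fk_difficulty = ? WHERE id = ?",
            (new_score, article_id),
        )
        if verbose:
            print(f"article {article_id}: new score {new_score:.2f}")
        if count % step == 0:
            conn.commit()  # checkpoint: keeps each transaction small
            commits += 1
    conn.commit()  # flush whatever is left after the last checkpoint
    return commits + 1
```

With 2500 articles and a step of 1000, this would commit after articles 1000 and 2000 and once more at the end, so no single transaction holds more than 1000 pending updates.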

tfnribeiro commented 1 month ago

In my DB with the latest dump it takes about 20 minutes to run.

tfnribeiro commented 1 month ago

I tried running it in my environment, and the process seems to get killed about a third of the way through.

Checkpointing in between seems to allow the process to continue past that point.

gustavkrist commented 1 month ago

It ran very fast on the dump I got; I did not consider the scale of the full production database. I'll add checkpoints if needed, but it sounds like you've already done so.

tfnribeiro commented 1 month ago

Alright, I will push a commit that adds the checkpoint commits. I just checked that the script completes when checkpointing, taking 45 minutes in total.

tfnribeiro commented 1 month ago

I have pushed the changes. Essentially, I removed the `with` block (it seems to close automatically once you commit) and added two constants that can be edited: one to turn on the prints and one for the checkpoint step.