While looking at ways to implement #735, I discovered an opportunity for optimization in the registry code that handles loading of vocabularies. For some reason (probably my mistake) the registry loads each vocabulary multiple times, once per language. This wastes both time and memory.
This PR adjusts the code slightly so that each vocabulary is always loaded just once. This has been the intention ever since the introduction of multilingual vocabularies (#559, PR #600 etc.) and especially PR #610, which implemented vocabularies that are shared between projects.
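To illustrate the idea of loading a shared vocabulary only once, here is a minimal sketch. It is not the actual registry code from this PR: `Vocabulary`, `VocabRegistry`, `get_vocab` and `load_from_disk` are hypothetical names made up for the example. The point is only that the cache is keyed by vocabulary ID alone, so projects in different languages end up sharing the same object.

```python
from dataclasses import dataclass, field


@dataclass
class Vocabulary:
    """Hypothetical multilingual vocabulary: one object serves all languages."""
    vocab_id: str
    labels: dict = field(default_factory=dict)  # language code -> list of labels


class VocabRegistry:
    """Hypothetical registry that caches vocabularies by ID only."""

    def __init__(self, loader):
        self._loader = loader  # callable: vocab_id -> Vocabulary
        self._vocabs = {}      # keyed by vocab_id, not by (vocab_id, language)
        self.load_count = 0    # counts real loads, for demonstration only

    def get_vocab(self, vocab_id):
        # Load on first request; every later request reuses the cached object.
        if vocab_id not in self._vocabs:
            self._vocabs[vocab_id] = self._loader(vocab_id)
            self.load_count += 1
        return self._vocabs[vocab_id]


def load_from_disk(vocab_id):
    # Stand-in for an expensive load; the labels are placeholder data.
    return Vocabulary(vocab_id, {"fi": ["kissa"], "sv": ["katt"], "en": ["cat"]})


registry = VocabRegistry(load_from_disk)
# Three projects (fi, sv, en) all ask for the same vocabulary.
vocabs = [registry.get_vocab("yso") for _lang in ("fi", "sv", "en")]
```

If the cache were keyed by `(vocab_id, language)` instead, the same three requests would trigger three separate loads, which is the kind of duplication described above.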
I benchmarked this with an installation where I have three Finto AI MLLM projects (languages fi, sv, en) that all use the YSO vocabulary, but in different languages. I ran the command
The idea here is to use ProductionConfig, which causes all projects to be loaded on startup instead of on demand. This means that the vocabulary is loaded as well.
Before
(showing selected stats)
User time (seconds): 13.04
System time (seconds): 6.13
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:12.27
Maximum resident set size (kbytes): 539600
After
User time (seconds): 12.82
System time (seconds): 7.26
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:11.66
Maximum resident set size (kbytes): 428940
So there's a slight speedup (about 0.6 seconds of wall clock time), and the maximum resident set size drops by about 110MB. Not bad for a patch that also reduces the amount of code by 3 lines.