NatLibFi / Annif

Annif is a multi-algorithm automated subject indexing tool for libraries, archives and museums.
https://annif.org

Share vocabulary objects between projects #603

Closed: osma closed this issue 2 years ago

osma commented 2 years ago

Currently, each AnnifProject creates its own vocabulary (an AnnifVocabulary object, which wraps a SubjectIndex), even though many projects may use the same vocabulary. This is now even more likely since vocabularies are multilingual (#600). The duplication wastes RAM, and also CPU time, because the same subject index has to be recreated for every project that uses it.

Instead, the vocabularies could be moved under AnnifRegistry, which currently only stores projects. They could then be loaded lazily as needed through a get_vocab method, similar to the existing get_vocab in annif.vocab.
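
A rough sketch of the idea, not actual Annif code: the `Registry`, `Vocabulary` and `Project` classes below are simplified, hypothetical stand-ins for AnnifRegistry, AnnifVocabulary and AnnifProject, and the constructor arguments are assumptions made for illustration. It only shows the registry caching vocabularies by id and creating each one on first request.

```python
class Vocabulary:
    """Hypothetical stand-in for AnnifVocabulary; counts how often one is built."""

    instances_created = 0

    def __init__(self, vocab_id):
        self.vocab_id = vocab_id
        # In the real tool this is where the expensive subject index would be built.
        Vocabulary.instances_created += 1


class Registry:
    """Hypothetical stand-in for AnnifRegistry: stores projects and vocabularies."""

    def __init__(self):
        self._projects = {}
        self._vocabs = {}

    def add_project(self, project):
        self._projects[project.project_id] = project

    def get_vocab(self, vocab_id):
        # Lazy loading: create the vocabulary on first request, then reuse it.
        if vocab_id not in self._vocabs:
            self._vocabs[vocab_id] = Vocabulary(vocab_id)
        return self._vocabs[vocab_id]


class Project:
    """Hypothetical stand-in for AnnifProject; it no longer owns its vocabulary."""

    def __init__(self, project_id, vocab_id, registry):
        self.project_id = project_id
        self.vocab_id = vocab_id
        self.registry = registry
        registry.add_project(self)

    @property
    def vocab(self):
        # Delegate to the registry so projects with the same vocab_id share one object.
        return self.registry.get_vocab(self.vocab_id)
```

With this arrangement, two projects configured with the same vocabulary id would get the same object from `project.vocab`, and nothing is built until a project first asks for it.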