CentreForDigitalHumanities / I-analyzer

The great textmining tool that obviates all others

https://ianalyzer.hum.uu.nl

MIT License

Word models: loading for every request inefficient #900

Open BeritJanssen opened 1 year ago

BeritJanssen commented 1 year ago

Currently, we reload the word models from source files on every request. This may hurt performance. We could instead load the models the first time the word_models path is requested for a given corpus and then keep the loaded object in memory, so that later requests can query it directly. Options to do so:

use the current_app context (see this thread)
use a separate service (perhaps even one per corpus) which is only in charge of computing similarities of word models
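A minimal sketch of the "load on first request, then persist" idea, written without any framework so it does not depend on Flask's current_app. Everything named here (`ModelRegistry`, `load_word_models`, the shape of the returned object) is hypothetical and only illustrates the pattern; it is not the existing I-analyzer code.

```python
import threading
from typing import Any, Callable, Dict


class ModelRegistry:
    """Keep loaded word models in memory, one entry per corpus (hypothetical helper)."""

    def __init__(self, loader: Callable[[str], Any]):
        self._loader = loader              # function that reads models from source files
        self._models: Dict[str, Any] = {}  # corpus name -> loaded models
        self._lock = threading.Lock()

    def get(self, corpus: str) -> Any:
        # Fast path: models for this corpus were already loaded.
        try:
            return self._models[corpus]
        except KeyError:
            pass
        # Slow path: load under a lock so concurrent first requests
        # do not load the same corpus twice.
        with self._lock:
            if corpus not in self._models:
                self._models[corpus] = self._loader(corpus)
            return self._models[corpus]


# Stand-in loader (assumption): the real one would read the model files from disk.
load_calls = []

def load_word_models(corpus: str) -> dict:
    load_calls.append(corpus)
    return {'corpus': corpus, 'vectors': {}}


registry = ModelRegistry(load_word_models)
```

A view would then call `registry.get(corpus_name)` instead of loading the files itself. Two caveats with this sketch: the registry lives per process, so each worker process holds its own copy of the models, and the single lock means that loading one corpus blocks the first load of any other corpus.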

@jgonggrijp , @JeltevanBoheemen , @oktaal any other ideas or suggestions?

jgonggrijp commented 1 year ago

Nope, I think you already covered the options.

oktaal commented 1 year ago

Maybe caching could make this problem less severe, but the options you listed are more structural solutions.
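One possible reading of "caching" here is memoizing the results of similarity queries, so that repeated requests for the same term skip the computation. A rough sketch using the standard library's `functools.lru_cache`; the `MODELS` dictionary, the `most_similar` function, and the toy vectors are all made up for illustration.

```python
import math
from functools import lru_cache

# Toy stand-in for loaded models (assumption): corpus -> word -> vector.
MODELS = {
    'demo': {
        'king': (1.0, 0.0),
        'queen': (0.8, 0.6),
        'apple': (0.0, 1.0),
    }
}


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


@lru_cache(maxsize=1024)
def most_similar(corpus: str, term: str, n: int = 2):
    """Return the n words most similar to term, as an immutable tuple."""
    vectors = MODELS[corpus]
    target = vectors[term]
    scored = [(word, cosine(target, vec))
              for word, vec in vectors.items() if word != term]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    # A tuple keeps callers from mutating the cached value.
    return tuple(scored[:n])
```

This only helps when the same query recurs, and it does not remove the cost of loading the models for a query that misses the cache, which is why it would reduce the problem rather than solve it.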

lukavdplas commented 1 year ago

As a short-term solution, it may also be worth reviewing the code (e.g. #901). The word model views were not developed for models with such a significant loading time, so they are not necessarily efficient.

Anyway, I just wanted to point out that we are planning to convert the backend to Django, so perhaps hold off on a solution that is very Flask-specific.

lukavdplas commented 11 months ago

#1161 would also be an alternative.