Closed: rinigus closed this issue 6 years ago
Caching the unigram count sum removed the corresponding call from the flame graph and reduced one test benchmark from 95 s to 74 s.
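The idea behind that commit can be sketched as follows: keep a running total of all unigram counts instead of summing the table on every probability query. This is a minimal illustration with hypothetical names, not Presage's actual classes.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical sketch: maintain the total unigram count incrementally so
// probability queries don't have to re-sum the whole table each time.
class UnigramCounts {
public:
    void add(const std::string &word, long n) {
        counts_[word] += n;
        total_ += n;  // keep the cached sum in step with every update
    }

    long total() const { return total_; }  // O(1) instead of O(#words)

    double probability(const std::string &word) const {
        auto it = counts_.find(word);
        if (it == counts_.end() || total_ == 0) return 0.0;
        return static_cast<double>(it->second) / static_cast<double>(total_);
    }

private:
    std::map<std::string, long> counts_;
    long total_ = 0;  // cached sum of all unigram counts
};
```

The trade-off is the usual one for cached aggregates: every write path must update the cache, but the hot read path drops a full table scan.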
Commit https://github.com/rinigus/presage/commit/090d5e77d28e06a95a1def47692fa152ae458818
Using Marisa and a counts file, I managed to get the test run down to 25 s. About 15 s of that is spent in the SQLite learning predictor, with a large share probably going to flushing the SQLite journal.
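If journal flushing turns out to dominate, a common mitigation is to batch the learning updates into a single transaction and relax the journal settings. This is only a sketch of the technique, not what Presage currently does; the table and column names are illustrative, and whether the relaxed settings are acceptable depends on durability requirements.

```sql
-- Hypothetical tuning for a learning-predictor database.
PRAGMA journal_mode = WAL;    -- write-ahead log: fewer fsyncs than a rollback journal
PRAGMA synchronous = NORMAL;  -- sync less aggressively than the default FULL

BEGIN;                        -- batch many count updates into one transaction
UPDATE _1_gram SET count = count + 1 WHERE word = 'hello';
-- ... more n-gram count updates ...
COMMIT;                       -- one journal flush for the whole batch
```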
Closing this for now. Let's reopen if the learning predictor becomes an issue or some other problem resurfaces.
Presage has to do a lot of work and can become relatively slow. Most of the time is spent querying the database, so optimizing data access should bring major improvements.
While several language models exist that are optimized for computing the probability of a full n-word sequence, Presage needs to retrieve n-gram counts via prefix search, which makes it tricky.
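The access pattern in question can be sketched like this: given the first words of an n-gram, enumerate all stored n-grams sharing that prefix together with their counts. A sorted `std::map` stands in here for the trie or database; this is an illustration of the pattern, not Presage's implementation.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// N-grams stored as space-joined keys mapped to counts (illustrative).
using NgramCounts = std::map<std::string, long>;

// Collect all (ngram, count) pairs whose key starts with `prefix`.
// Because the map is sorted lexicographically, lower_bound jumps to the
// first candidate and iteration stops at the first non-matching key.
std::vector<std::pair<std::string, long>>
prefix_counts(const NgramCounts &counts, const std::string &prefix) {
    std::vector<std::pair<std::string, long>> out;
    for (auto it = counts.lower_bound(prefix);
         it != counts.end() &&
         it->first.compare(0, prefix.size(), prefix) == 0;
         ++it) {
        out.emplace_back(it->first, it->second);
    }
    return out;
}
```

A trie such as Marisa supports the same predictive (prefix) search directly, which is why it fits this workload better than a model keyed by whole sequences.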
The current `perf` flame graph is attached.