Add an option to make a warm-up of lower/upper KB databases at startup

kermitt2 / entity-fishing

A machine learning tool for fishing entities

Apache License 2.0

249 stars 24 forks

Hi Patrice,

I'm not sure whether we should preload the whole databases (too much memory involved) or just a subset, such as the N most frequently used entries. Looking at the code of com.scienceminer.nerd.utilities.WikipediaLabelIDF, I can see that you already have the occurrence count stored in the LabelDatabase, so for this one it should be easy.

But I don't know how to proceed for the others (PageDb, etc.).

Maybe a persistent Ehcache with an LFU policy could be associated with every KBDatabase, so that the N most frequent entries (just the keys anyway) could be stored and then retrieved at startup?
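
To make the idea a bit more concrete, here is a rough sketch of the "remember the N most frequent keys and replay them at startup" part. It is only an illustration under assumptions: `HotKeyTracker` and its methods are hypothetical names, not existing entity-fishing classes, keys are assumed to be strings without line breaks, and it uses a plain JDK frequency counter written to a text file as a stand-in for a persistent Ehcache with LFU eviction (the counter here is unbounded, so it is not a real LFU cache).

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;
import java.util.function.Function;
import java.util.stream.Collectors;

/** Hypothetical helper: counts key lookups and persists the N most frequent keys. */
final class HotKeyTracker {
    // Access count per key; unbounded, unlike a real LFU cache.
    private final Map<String, LongAdder> hits = new ConcurrentHashMap<>();

    /** Call on every database lookup to record one access to the key. */
    void record(String key) {
        hits.computeIfAbsent(key, k -> new LongAdder()).increment();
    }

    /** Returns up to n keys, most frequently accessed first. */
    List<String> topKeys(int n) {
        return hits.entrySet().stream()
                .sorted((a, b) -> Long.compare(b.getValue().sum(), a.getValue().sum()))
                .limit(n)
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }

    /** Writes the top n keys, one per line, so they survive a restart. */
    void save(Path file, int n) throws IOException {
        Files.write(file, topKeys(n), StandardCharsets.UTF_8);
    }

    /**
     * Reads the saved keys and passes each one to the lookup function
     * (assumed to be the database's own get), which pulls the entry into memory.
     * Returns the number of keys for which the lookup found a value.
     */
    static int warmUp(Path file, Function<String, ?> lookup) throws IOException {
        if (!Files.exists(file)) {
            return 0; // first start: nothing recorded yet
        }
        int loaded = 0;
        for (String key : Files.readAllLines(file, StandardCharsets.UTF_8)) {
            if (lookup.apply(key) != null) {
                loaded++;
            }
        }
        return loaded;
    }
}
```

Each KBDatabase could own one such tracker, call `record` from its retrieval method, call `save` at shutdown, and call `warmUp` with its own getter when the warm-up option is enabled. Only the keys are stored, so the file stays small whatever the size of the values.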

Just some thoughts

Best regards

Olivier

Add an option to make a warm-up of lower/upper KB databases at startup #143