morfologik / morfologik-stemming

Tools for finite state automata construction and dictionary-based morphological dictionaries. Includes Polish stemming dictionary.
BSD 3-Clause "New" or "Revised" License
187 stars, 44 forks

DictionaryLookup thread safe #108

Closed (kosaa closed 4 years ago)

kosaa commented 4 years ago

Would it take a lot of work to make DictionaryLookup thread-safe?

I created an MVP project, and my lookup method takes 60% of the CPU time because it has to be synchronized.

Or maybe my code isn't optimal and there is a better way to get the list of WordData objects.

private List<WordData> lookup(String word) {
    List<WordData> list = new ArrayList<>();
    synchronized (lock) {
        for (WordData wordData : dictionaryLookup.lookup(word)) {
            list.add(wordData.clone());
        }
    }
    return list;
}

dweiss commented 4 years ago

Where is the above snippet of code from?

DictionaryLookup isn't thread-safe, but it is lightweight. You should create one DictionaryLookup per thread (or even on demand); this should be cheap. The underlying FST (Dictionary) is thread-safe and can be loaded once and shared (provided it is safely published, if you're running in a non-locking environment).
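
One way to follow this advice is to keep a separate lookup object per thread instead of guarding a shared one with a lock. The sketch below is a minimal illustration using only the JDK's `ThreadLocal`. `PerThread` is a hypothetical helper name, not part of morfologik, and the factory passed to it is assumed to build a fresh `DictionaryLookup` from a single shared `Dictionary`.

```java
import java.util.function.Supplier;

/**
 * Holds one instance of a non-thread-safe helper per thread.
 * Hypothetical helper for illustration; not part of morfologik.
 */
final class PerThread<T> {
    private final ThreadLocal<T> local;

    PerThread(Supplier<? extends T> factory) {
        // The factory runs lazily, on each thread's first call to get().
        this.local = ThreadLocal.withInitial(factory);
    }

    /** Returns the calling thread's own instance, creating it on first use. */
    T get() {
        return local.get();
    }
}
```

Assuming a `Dictionary` has been loaded once and stored in a field named `dictionary`, the holder could be created as `new PerThread<>(() -> new DictionaryLookup(dictionary))`. The question's method would then call `lookups.get().lookup(word)` without the `synchronized` block. The `clone()` calls from the original snippet are probably still worth keeping if the results are retained after the next lookup on the same thread.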