christian-westbrook / interactive-sentence-predictor

An interactive sentence predictor that generates likely continuations of partial sentences.
0 stars 0 forks source link

Exploring ways to use HashMaps more efficiently #6

Open rfishe02 opened 6 years ago

rfishe02 commented 6 years ago

I developed an approach that uses a HashMap and a Priority Queue to improve the lookup speed of n-grams in the application.

I hashed the first half of the n-gram, and built a priority queue for the latter half of the n-gram.

For bigrams, it looks like this:

"to" --> GramFreq{ String val; int freq; }

We can look up the initial n-gram fairly quickly, then quickly select the highest n-gram from the priority queue. I think that this approach is an improvement over the initial idea I had, where we would use a linked list to store the latter half of the n-grams.

It seems fast, but it takes a while to preprocess the data and load it into the application.

christian-westbrook commented 6 years ago

I’d like to comment that it is acceptable to spend more time in the preprocessing stage if it improves system performance at runtime without affecting accuracy. I am in support of testing this method or storage against the current system and comparing runtime performance.