Closed by woodbri 8 years ago
Issue: the Lexicon entries can be multiple words but the current Tokenizer does not take that into account.
Potential solutions:
Item 2 can be optimized by grouping Lexicon phrases by the first 1-2 characters of the phrase and cycling through only those on each token pass.
Closed with push 811c997..b593965. Still needs more testing with a lexicon, but it works with an empty lexicon.