atilika / kuromoji

Kuromoji is a self-contained and very easy to use Japanese morphological analyzer designed for search
Apache License 2.0

Fix bug with overflow bits in patricia trie #127

Open emmanuellegedin opened 5 years ago


I noticed an edge case that causes the put method of PatriciaTrie to throw an exception. It occurs when the trie receives both a string and the same string followed by one or more U+FFFF characters.

For example:

PatriciaTrie<String> trie = new PatriciaTrie<>();
trie.put("a", "0");
trie.put("a\uFFFF", "1");

causes an exception.

This PR fixes the issue.
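
For readers unfamiliar with why U+FFFF in particular could matter here, the following is a minimal sketch and not Kuromoji's actual code. It assumes a bit-level key reader in which bits past the end of a key (the "overflow bits") read as 1; the class and method names (OverflowBitsDemo, isSet, firstDifferingBit) are hypothetical. Under that assumption, a key and the same key followed by U+FFFF (sixteen 1 bits) yield identical bit sequences, so a trie that branches on the first differing bit has nothing to branch on.

```java
// Illustrative sketch only; this is not Kuromoji's PatriciaTrie implementation.
final class OverflowBitsDemo {

    // Assumption for this sketch: bits past the end of the key read as 1.
    static boolean isSet(int bit, String key) {
        if (bit >= key.length() * Character.SIZE) {
            return true; // overflow bit
        }
        char c = key.charAt(bit / Character.SIZE);
        int mask = 0x8000 >>> (bit % Character.SIZE); // most significant bit first
        return (c & mask) != 0;
    }

    // Index of the first bit at which the two keys differ,
    // or -1 if they agree on every bit below maxBits.
    static int firstDifferingBit(String a, String b, int maxBits) {
        for (int bit = 0; bit < maxBits; bit++) {
            if (isSet(bit, a) != isSet(bit, b)) {
                return bit;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // 'b' starts with a 0 bit, the overflow bit of "a" reads as 1.
        System.out.println(firstDifferingBit("a", "ab", 32));      // prints 16
        // Every bit of the trailing character is 1, same as the overflow bits.
        System.out.println(firstDifferingBit("a", "a\uFFFF", 32)); // prints -1
    }
}
```

Whether this matches the exact failure path in PatriciaTrie is not stated in the description above; the sketch only shows how such a collision can arise in a bit-indexed structure.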