takawitter / trie4j

PATRICIA, Double Array, LOUDS Trie implementations for Java
Apache License 2.0

ArrayIndexOutOfBoundsException on lookup in MapDoubleArray #31

amake closed this issue 7 years ago

amake commented 7 years ago

Hi. I am using trie4j to store natural-language dictionary entries. With a particular dictionary index and a particular lookup word I am getting an ArrayIndexOutOfBoundsException on MapDoubleArray.get(). Sample code:

MapTrie<Object> mpt = new MapPatriciaTrie<>();
Files.lines(Paths.get("dicttest-Harrap_s_Business_fran_ais_ang.idx.txt"))
        .forEach(word -> mpt.insert(word, null));
MapDoubleArray<Object> da = new MapDoubleArray<>(mpt);
System.out.println(mpt.get("you")); // OK
System.out.println(da.get("you")); // Throws exception

The test file loaded above is available here: https://gist.github.com/amake/ba10166a7bf59ffac180defcecbec8d4

The lookup word "you" is not contained in the index, so I expect both get() calls to return null. However, the second one throws this exception:

Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 128198
    at org.trie4j.doublearray.DoubleArray.getNodeId(DoubleArray.java:212)
    at org.trie4j.doublearray.DoubleArray.getTermId(DoubleArray.java:220)
    at org.trie4j.AbstractTermIdMapTrie.get(AbstractTermIdMapTrie.java:154)
    at test.main(Test.java:13)

This occurs with version 0.9.4.
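
For readers unfamiliar with double arrays, the failure mode suggested by the stack trace can be illustrated with a toy sketch. This is not trie4j's code: the class name, the array layout, and the `code` mapping are all hypothetical, and the real cause of this bug may differ. It only shows the general technique, where a transition is computed as `base[node] + code(c)` and a lookup character that never appeared at build time can push that index past the end of the arrays unless it is range-checked.

```java
class DoubleArraySketch {
    // Toy arrays for the two words "ab" and "ac"; hypothetical layout, not trie4j's.
    private final int[] base = {1, 0, 1, 0, 0};
    private final int[] check = {-1, -1, 0, 2, 2};
    private final boolean[] terminal = {false, false, false, true, true};

    // Maps 'a' to 1, 'b' to 2, and so on; other characters fall outside 1..26.
    private static int code(char c) {
        return c - 'a' + 1;
    }

    // Returns the child reached from node via c, or -1 if no such transition exists.
    int next(int node, char c) {
        int candidate = base[node] + code(c);
        // Without this range check, check[candidate] would throw
        // ArrayIndexOutOfBoundsException for characters such as 'y'.
        if (candidate < 0 || candidate >= check.length) {
            return -1;
        }
        return check[candidate] == node ? candidate : -1;
    }

    boolean contains(String word) {
        int node = 0; // the root is node 0
        for (int i = 0; i < word.length(); i++) {
            node = next(node, word.charAt(i));
            if (node < 0) {
                return false;
            }
        }
        return terminal[node];
    }

    public static void main(String[] args) {
        DoubleArraySketch da = new DoubleArraySketch();
        System.out.println(da.contains("ab"));  // true
        System.out.println(da.contains("you")); // false, thanks to the range check
    }
}
```

In this toy, looking up "you" computes index 26 in arrays of length 5, so removing the range check in `next` would raise the same kind of exception as reported above.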

takawitter commented 7 years ago

Thanks, @amake! I confirmed the issue and reproduced the problem with the following data extracted from yours:

seu o
yen

This issue will be fixed in a day.

takawitter commented 7 years ago

Version 0.9.5 has been released: https://github.com/takawitter/trie4j/releases/tag/0.9.5. It may take a while to become available on the Maven Central repository.

amake commented 7 years ago

It is available for me already, and I can confirm that the issue is fixed. Thanks for the very fast response!