hankcs / AhoCorasickDoubleArrayTrie

An extremely fast implementation of the Aho-Corasick algorithm based on a Double Array Trie.
http://www.hankcs.com/program/algorithm/aho-corasick-double-array-trie.html

OOM when building DAT/ACDAT. DAT consumes less memory than a hashmap, so why can a hashmap of 100000000 docs be built, while a DAT with 10000000 docs leads to OOM? #39

Closed. gaohang closed this issue 4 years ago

gaohang commented 4 years ago

Why can a hashmap of 100000000 docs be built, while a DAT with 10000000 docs leads to OOM?