estnltk / episode-miner

GNU General Public License v2.0

Benchmark Aho-Corasick vs. naive search #1

Open AleksTk opened 8 years ago

AleksTk commented 8 years ago

https://pypi.python.org/pypi/pyahocorasick/1.0.0

AleksTk commented 8 years ago

Benchmarked naive/ahocorasick using a vocabulary of 50K items and variable text size. Ahocorasick clearly wins:

lines_in_text,method,time
100,naive,0m5.170s
100,ahocorasick,0m2.275s
1000,naive,0m29.279s
1000,ahocorasick,0m2.270s
10000,naive,4m30.309s
10000,ahocorasick,0m2.698s
100000,naive,45m40.405s
100000,ahocorasick,0m6.377
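
For anyone who wants to reproduce the comparison, here is a minimal pure-Python sketch of the two approaches. It is not the script behind the timings above and it does not use pyahocorasick: the function names (`naive_search`, `build_automaton`, `aho_search`, `benchmark`) are hypothetical, the "naive" method is assumed to mean one scan of the text per vocabulary item, and the vocabulary is assumed to contain only non-empty strings.

```python
import time
from collections import deque


def naive_search(vocabulary, text):
    """Assumed naive baseline: scan the whole text once per vocabulary item."""
    hits = []
    for word in vocabulary:
        start = text.find(word)
        while start != -1:
            hits.append((start, word))
            start = text.find(word, start + 1)  # allow overlapping matches
    return sorted(hits)


def build_automaton(vocabulary):
    """Build the trie, failure links, and output lists of an Aho-Corasick automaton."""
    goto = [{}]   # goto[state][char] -> next state
    fail = [0]    # fail[state] -> state for the longest proper suffix in the trie
    out = [[]]    # out[state] -> vocabulary items ending at this state
    for word in vocabulary:
        state = 0
        for ch in word:
            if ch not in goto[state]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        out[state].append(word)

    # Breadth-first pass: shallower states are finished before deeper ones.
    queue = deque(goto[0].values())
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            out[nxt] = out[nxt] + out[fail[nxt]]
    return goto, fail, out


def aho_search(automaton, text):
    """Find every vocabulary item in a single left-to-right pass over the text."""
    goto, fail, out = automaton
    hits = []
    state = 0
    for i, ch in enumerate(text):
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        for word in out[state]:
            hits.append((i - len(word) + 1, word))
    return sorted(hits)


def benchmark(vocabulary, text):
    """Return (naive_seconds, aho_seconds); automaton build time is included."""
    t0 = time.perf_counter()
    naive = naive_search(vocabulary, text)
    t1 = time.perf_counter()
    fast = aho_search(build_automaton(vocabulary), text)
    t2 = time.perf_counter()
    assert naive == fast  # both methods must report the same matches
    return t1 - t0, t2 - t1
```

Calling `benchmark(vocabulary, text)` with a large vocabulary and texts of increasing length should show the same trend as the table: the naive time grows with vocabulary size times text length, while the automaton reads the text once. Absolute numbers will differ from those above, since pyahocorasick is a C extension and this sketch is plain Python.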