WojciechMula / pyahocorasick

Python module (C extension and plain Python) implementing the Aho-Corasick algorithm
BSD 3-Clause "New" or "Revised" License

add a new alternative implementation #64

Closed by Guangyi-Z 7 years ago

Guangyi-Z commented 7 years ago

Feel free to check the "Background" section on this new project for the detailed reasons.

WojciechMula commented 7 years ago

Thanks. I see that the lack of Unicode support in my module forced you to write your own module. Have you seen #40? There's a workaround.

pombredanne commented 7 years ago

FWIW, I also forked a pure Python alternative here: https://github.com/nexB/license-expression/blob/master/src/license_expression/_pyahocorasick.py that deals with Unicode. I am still using the full, standard version in https://github.com/nexB/scancode-toolkit/ of course, where speed matters a lot and the C version does wonders!