WojciechMula / pyahocorasick

Python module (C extension and plain python) implementing Aho-Corasick algorithm
BSD 3-Clause "New" or "Revised" License

iter with longest_prefix match #108

Closed Huarong closed 2 years ago

Huarong commented 5 years ago

now:

In [10]: import ahocorasick

In [11]: A = ahocorasick.Automaton()

In [12]: for idx, key in enumerate('he her hers she'.split()):
    ...:     A.add_word(key, (idx, key))
    ...:

In [13]: A.make_automaton()

In [14]: list(A.iter('he her'))
Out[14]: [(1, (0, 'he')), (4, (0, 'he')), (5, (1, 'her'))]

expect:

In [14]: list(A.iter('he her', longest_prefix=True))
Out[14]: [(1, (0, 'he')), (5, (1, 'her'))]

snoopyjc commented 4 years ago

Yes, I would love this!

snoopyjc commented 4 years ago

Is this the same as iter_long?

WojciechMula commented 4 years ago

@snoopyjc Yes, that's the same. Personally, I would rather have a new method (now iter_long) than an extra argument. I feel that overloading is not always the best approach.

zhouxinhit commented 4 years ago

> Is this the same as iter_long?

I cannot find the iter_long function in version 1.4.0.

pombredanne commented 2 years ago

This has been implemented in version 1.4.2, so I am closing this now. Thank you all! :heart: See the documentation at https://pyahocorasick.readthedocs.io/en/latest/ for details.
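
For readers on a version without iter_long, the output requested in the original post can be approximated by post-filtering the results of A.iter(). The sketch below is only an illustration: keep_longest_per_start is a hypothetical helper, not part of pyahocorasick, and it assumes every stored value is an (idx, key) tuple as in the example above, so that the key length is available. It drops a match only when a longer match starts at the same position, so its results may differ from iter_long on other inputs.

```python
from typing import Dict, Iterable, List, Tuple

# Shape of the items yielded by A.iter() in the issue: (end_index, (idx, key)).
Match = Tuple[int, Tuple[int, str]]


def keep_longest_per_start(matches: Iterable[Match]) -> List[Match]:
    """Keep only the longest match among those sharing a start position.

    Hypothetical helper for illustration; not part of pyahocorasick.
    """
    best: Dict[int, Match] = {}  # start_index -> longest match seen so far
    for end, (idx, key) in matches:
        start = end - len(key) + 1  # end_index is inclusive
        current = best.get(start)
        if current is None or len(key) > len(current[1][1]):
            best[start] = (end, (idx, key))
    # Sort by end index, the order in which A.iter() reports matches.
    return sorted(best.values(), key=lambda m: m[0])


# Matches copied from the issue's output of list(A.iter('he her')).
raw = [(1, (0, 'he')), (4, (0, 'he')), (5, (1, 'her'))]
print(keep_longest_per_start(raw))
# [(1, (0, 'he')), (5, (1, 'her'))]
```

With a real automaton, the call would look like keep_longest_per_start(A.iter('he her')), since the helper accepts any iterable of matches.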