WojciechMula / pyahocorasick

Python module (C extension and plain python) implementing Aho-Corasick algorithm
BSD 3-Clause "New" or "Revised" License
929 stars 122 forks source link

how to delete a string in the trie? #79

Closed huiyuHarvey closed 5 years ago

huiyuHarvey commented 6 years ago

It seems that there's no delete interface.

pombredanne commented 6 years ago

@huiyuHarvey that's correct. What would be your use case?

The typical usage is that you build a trie then finalize it as an automaton that can no longer be modified. If it needs to be modified, you rebuild the trie from scratch.

WojciechMula commented 6 years ago

@huiyuHarvey It's not possible now. BTW you are the first who requests such feature.

matanox commented 6 years ago

Does the algorithm actually exclude dynamically modifying the trie/automaton? sometimes things are very large-ish and dynamic, and this saves the trouble of rebuilding a new automaton for mimicking the removal of some keys/words. Curious.

pombredanne commented 6 years ago

@matanster there is nothing that prohibits this in the Aho-Corasick construction... but doing it is much much more complex IMHO

WojciechMula commented 6 years ago

@matanster I don't know any paper/article which describes such a modification. I believe it is possible at lower cost than rebuilding the whole automaton.

WojciechMula commented 5 years ago

@matanster I still don't know the way to remove word from a trie without rebuilding Aho-Corasick automaton. But MR I prepared at least allows to remove elements from a trie.