pytries / marisa-trie

Static memory-efficient Trie-like structures for Python based on marisa-trie C++ library.
https://marisa-trie.readthedocs.io/en/latest/
MIT License
1.02k stars, 91 forks

Add Trie iter_prefixes_with_ids method to return (prefix, id) pairs #83

Closed dfuhry closed 1 year ago

dfuhry commented 1 year ago

I would like to get both prefixes and ids (from common_prefix_search) in a single pass through the trie data structure.

Trie has an iteritems() method that returns (prefix, id) pairs from _trie.predictive_search(). However, there is no corresponding method that returns (prefix, id) pairs from _trie.common_prefix_search().

This pull request adds a Trie iter_prefixes_with_ids() method which does that.
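To illustrate the intended semantics, here is a minimal pure-Python sketch of a single-pass walk that yields (prefix, id) pairs. It is not the marisa-trie implementation: the ToyTrie class, its dict-based nodes, and the id scheme (position in sorted key order) are assumptions made for this example only. The ids assigned by the real library will generally differ.

```python
from typing import Dict, Iterable, Iterator, Tuple

# Sentinel dict key marking "a stored key ends at this node".
_END = None


class ToyTrie:
    """Illustrative dict-based trie; NOT how marisa-trie stores its data."""

    def __init__(self, keys: Iterable[str]) -> None:
        self._root: Dict = {}
        # Assumed id scheme for this sketch: index in sorted order of unique keys.
        for key_id, key in enumerate(sorted(set(keys))):
            node = self._root
            for ch in key:
                node = node.setdefault(ch, {})
            node[_END] = key_id

    def iter_prefixes_with_ids(self, key: str) -> Iterator[Tuple[str, int]]:
        """Yield (prefix, id) for every stored key that is a prefix of `key`."""
        node = self._root
        if _END in node:
            # The empty string was stored and is a prefix of everything.
            yield "", node[_END]
        for i, ch in enumerate(key):
            if ch not in node:
                return  # no stored key continues along this path
            node = node[ch]
            if _END in node:
                yield key[: i + 1], node[_END]


trie = ToyTrie(["k", "key1", "key12", "key2"])
found = list(trie.iter_prefixes_with_ids("key12"))
missing = list(trie.iter_prefixes_with_ids("zzz"))
```

The point of the sketch is that each prefix and its id are produced during the same descent, so the caller does not need a second lookup per prefix to recover the id.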

BoboTiG commented 1 year ago

Thanks @dfuhry!

Could you rebase on master, and add at least 2 tests (one when the key is present, and one when it's not present)?

BoboTiG commented 1 year ago

Oh, I just realized: you need to run ./update_cpp.sh and commit the changes.

dfuhry commented 1 year ago

Hi @BoboTiG, thank you for the help. I have added a new test_iter_prefixes_with_keys() method in tests/test_trie.py which tests the cases you requested.

I've also tried to rebase, which I guess I've done right as my branch doesn't show being behind on commits any more.

To rebuild the cpp scripts I had to change the command in update_cpp.sh from "cython" to "cython3" locally. Not sure if that creates any problems. If not, not sure if you want me to add that change to this branch.

BoboTiG commented 1 year ago

Can you rebuild using Cython 0.29.32?

BoboTiG commented 1 year ago

And no need to push the change from update_cpp.sh :)

BoboTiG commented 1 year ago

It seems good now 💪🏻 As soon as the CI is green I'll merge 👍🏻

BoboTiG commented 1 year ago

Thanks a lot for your patience @dfuhry :)

dfuhry commented 1 year ago

Thank you!

BoboTiG commented 1 year ago

I'll try to cut a new release ASAP.