Bergvca / string_grouper

Super Fast String Matching in Python
MIT License

Ngram re-use #92

Open hyshandler opened 1 year ago

hyshandler commented 1 year ago

I'm building an app that runs match_strings on user-entered strings against a static set of strings. The static set is stored in a Feather file, loaded with pandas each time the app is used, and then split into n-grams of size 3. It's a fairly large dataset, so this step takes some time. Since I use the same set of strings and the same n-gram size every time, is there a way to save all of the n-grams to a file and feed them directly into a fuzzy match against the user-entered strings?
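To make the request concrete, here is a minimal sketch of the caching idea in plain Python. It does not use string_grouper's API or internals: the helper names (`ngrams`, `build_index`, `load_or_build_index`, `match`), the pickle cache file, and the cosine similarity over raw n-gram counts are all assumptions made for illustration, standing in for whatever the library does internally.

```python
import pickle
from collections import Counter
from math import sqrt
from pathlib import Path


def ngrams(text, n=3):
    """Return the character n-grams of a lowercased string."""
    text = text.lower()
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def build_index(strings, n=3):
    """Precompute an n-gram count vector and its norm for every static string."""
    index = []
    for s in strings:
        counts = Counter(ngrams(s, n))
        norm = sqrt(sum(c * c for c in counts.values()))
        index.append((s, counts, norm))
    return index


def load_or_build_index(strings, cache_path, n=3):
    """Load the cached index if it exists; otherwise build it once and save it."""
    cache_path = Path(cache_path)
    if cache_path.exists():
        with cache_path.open("rb") as f:
            return pickle.load(f)
    index = build_index(strings, n)
    with cache_path.open("wb") as f:
        pickle.dump(index, f)
    return index


def match(query, index, n=3, min_similarity=0.5):
    """Return (string, cosine similarity) pairs at or above the threshold, best first."""
    q = Counter(ngrams(query, n))
    q_norm = sqrt(sum(c * c for c in q.values()))
    results = []
    for s, counts, norm in index:
        if q_norm == 0 or norm == 0:
            continue  # strings shorter than n have no n-grams to compare
        dot = sum(c * counts.get(g, 0) for g, c in q.items())
        sim = dot / (q_norm * norm)
        if sim >= min_similarity:
            results.append((s, sim))
    return sorted(results, key=lambda r: r[1], reverse=True)
```

With this shape, the app would call `load_or_build_index(static_strings, "ngram_index.pkl")` once at startup (the file name is hypothetical) and then `match(user_string, index)` per request, so only the user-entered string is split into n-grams each time. The cache has to be deleted whenever the static set or the n-gram size changes, since the sketch never checks whether the file is stale.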

Thanks in advance!

hyshandler commented 10 months ago

Wanted to follow up on this. Thanks!