compenguy / ngrammatic

A rust crate providing fuzzy search/string matching using N-grams
MIT License
25 stars 7 forks source link

Let warp parameter be changed in search/matches functions #3

Closed Hugo-C closed 2 years ago

Hugo-C commented 2 years ago

Hi, currently the warp parameter for Ngram::similarity_to is hard coded to 2. And so can't be changed when doing a Corpus::search. I'd like to set it to 1 (to have a Jaccard distance).

Does changing the functions' signatures would be ok for you ? Another option, to prevent compatibility break, is to keep this parameter in Corpus/Ngram struct (with a default value of 2).

Whatever the choice you prefer, I am willing to make a PR for it. thanks in advance.

compenguy commented 2 years ago

I have a PR (#4) up. Take a look and let me know if that provides what you're looking for. I think I have preferred to make the default matches use the Jaccard distance, but besides being a breaking change, would be an invisible, hard to detect change for everyone using the existing search()/matches().

That's not to say breaking changes are off the table. I have an v0.4 planned (#5) which removes arity as a struct member and replaced it with a const generic, but I'm not in a huge hurry to roll it out unless there are other breaking changes people would like to see.

So if you could drop in on PR #4, let me know if that suits, or if you have suggestions for improvement, I can get this issue sorted out for you. If you'd prefer something a little more disruptive, we can bump to 0.4 and I can roll it out at the same time as the changes in #5

Hugo-C commented 2 years ago

This is what I need indeed, thanks. I don't mind breaking changes for this library as I don't have much existing code 😄 Anyway it looks good to me as it is, so I'll use your warp-control branch to continue work. Feel free to group both PR in a single release

compenguy commented 2 years ago

Fixed in v0.3.4, which should be live on crates.io now.

compenguy commented 2 years ago

Feel free to group both PR in a single release

I'm going to hold the const-arity branch work (PR #5) while I figure out where the const generics work is, and how likely my two open concerns are to be resolved in any useful timeframe.