Open philkoch opened 1 year ago
You are using the library correctly, but it seems that min_similarity was not implemented properly for all cosine similarity backends. I will make sure this gets fixed in an upcoming release. For now, if you want to use this feature, you can do so with:
pip install polyfuzz[fast]
I will try that, thanks for the quick response!
Hello Maarten, whichever model I use with PolyFuzz, the model parameters are never applied. Is there any workaround for this?
Thanks, Nitin
@nitindabadghav Could you provide a bit more information? What version do you use? Can you share your code? Have you tried the answer I provided above? Etc.
When using the TFIDF model, the min_similarity parameter does not seem to be applied to the results.
Minimal example that reproduces the problem (polyfuzz 0.4.0):
When running the code, the following output is generated, but rows 4 and 7 should have a Similarity score of 0, if I understand the documentation correctly.
I would expect the rows with a Similarity of < 0.9 to have a Similarity of 0 and a To value of None.
Output:
In case I'm using the library wrong, how would I be able to get only results with a similarity higher than 0.9?
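As a stopgap while the threshold is not applied by the backend, the matches could be post-processed by hand. The following is only a minimal sketch in plain Python, not part of PolyFuzz: the records are made up and merely shaped like the From / To / Similarity columns discussed here, and the helper names apply_min_similarity and keep_confident are hypothetical.

```python
# Hypothetical match records shaped like the From / To / Similarity columns
# discussed in this thread. The values are made up for illustration only.
matches = [
    {"From": "apple", "To": "apple", "Similarity": 1.0},
    {"From": "apples", "To": "apple", "Similarity": 0.93},
    {"From": "banana", "To": "bandana", "Similarity": 0.71},
]


def apply_min_similarity(rows, min_similarity):
    """Blank out matches below the threshold: To becomes None, Similarity 0.0."""
    result = []
    for row in rows:
        if row["Similarity"] >= min_similarity:
            result.append(dict(row))  # copy so the input is left untouched
        else:
            result.append({"From": row["From"], "To": None, "Similarity": 0.0})
    return result


def keep_confident(rows, min_similarity):
    """Drop matches below the threshold entirely."""
    return [dict(row) for row in rows if row["Similarity"] >= min_similarity]


masked = apply_min_similarity(matches, 0.9)
kept = keep_confident(matches, 0.9)

for row in masked:
    print(row)
```

The first helper mimics the behaviour expected above (Similarity of 0 and a To value of None for weak matches), while the second simply keeps the rows at or above the threshold. If the matches are held in a pandas DataFrame, the second case corresponds to a boolean mask such as df[df["Similarity"] >= 0.9].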