Open soliverc opened 5 years ago
I ran the package again today and scores range from -1.4 to +2.5. I can't figure it out!
I agree, not sure how to read the scores.
Just to reiterate, it would be great to get an idea of what the scores mean so that a comparison could be made between various matching algorithms/libraries. I find this library vastly quicker for large data sets, so it's a shame that this is one of the main drawbacks.
How can we change the scorer to return the true probability of a match?
On what scale are the matches scored?
I noticed that with `fuzzymatcher.fuzzy_left_join` my `best_match_score` ranges from -0.7 to +1.15. What is the highest possible score in this case? Can it go higher than 1.15?
Usually for fuzzy matching I would have a cutoff of around 0.8 or 0.9, which is on a scale of 0 to 1.
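
In case it helps anyone else in the meantime, here is a rough workaround sketch, not something the library provides as far as I know. It min-max rescales the raw scores from a single run onto 0 to 1 so a familiar cutoff can be applied. The function name `rescale_scores` and the sample values are made up for illustration, and the result is only a relative ranking within that run, not a true match probability, so it will not be comparable across runs or across libraries.

```python
def rescale_scores(scores):
    """Min-max rescale raw scores from one run onto the range [0, 1].

    Hypothetical helper: this is NOT a probability, only a relative
    position between the lowest and highest score seen in this run.
    """
    if not scores:
        return []
    lo, hi = min(scores), max(scores)
    if hi == lo:
        # All scores identical: there is no spread to rescale.
        return [1.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


# Illustrative values only, loosely based on the range reported above.
raw = [-0.7, 0.0, 0.5, 1.15]
scaled = rescale_scores(raw)

# Keep the raw scores whose rescaled value clears a 0.8 cutoff.
kept = [r for r, s in zip(raw, scaled) if s >= 0.8]
```

With a real result you would pass in the values of the `best_match_score` column instead of the hard-coded list. Note that the lowest score always maps to 0 and the highest to 1, even if every match in the run is poor, so it is still worth eyeballing a few rows around the cutoff.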