Open fabriziofortino opened 5 years ago
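The Jaro similarity of "ed" and "red" is 0, since the number of matching characters (parameter m) is 0. Under the standard definition, two characters only count as matching if they are equal and at most floor(max(|s1|, |s2|) / 2) - 1 positions apart; for lengths 2 and 3 that window is 0, so only characters at the same index can match, and "ed" and "red" share none.
Furthermore, the length of the common prefix of s1 and s2 (parameter l) is 0.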
This results in a Jaro-Winkler Similarity of 0 as
sim_jw = sim_j + l * 0.1 * (1 - sim_j) = 0 + 0 * 0.1 * 1 = 0
Jaro-Winkler gives more favorable ratings to strings that match from the beginning.
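To make the calculation above concrete, here is a minimal Python sketch of the textbook Jaro and Jaro-Winkler definitions. It is not this library's implementation; the function names jaro and jaro_winkler, the scaling factor p = 0.1 and the prefix cap of 4 are assumptions taken from the usual description of the algorithm.

```python
def jaro(s1: str, s2: str) -> float:
    """Jaro similarity, textbook definition (illustrative sketch)."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0

    # Characters match only if equal and at most `window` positions apart.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    m = 0
    for i, ch in enumerate(s1):
        lo = max(0, i - window)
        hi = min(len2, i + window + 1)
        for j in range(lo, hi):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                m += 1
                break
    if m == 0:
        return 0.0

    # Walk the matched characters of both strings in order; t is half the
    # number of positions where they disagree.
    k = 0
    out_of_order = 0
    for i in range(len1):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                out_of_order += 1
            k += 1
    t = out_of_order / 2
    return (m / len1 + m / len2 + (m - t) / m) / 3


def jaro_winkler(s1: str, s2: str, p: float = 0.1, max_prefix: int = 4) -> float:
    """sim_jw = sim_j + l * p * (1 - sim_j), with l capped at max_prefix."""
    sim_j = jaro(s1, s2)
    l = 0
    for a, b in zip(s1, s2):
        if a != b or l == max_prefix:
            break
        l += 1
    return sim_j + l * p * (1 - sim_j)


print(jaro("ed", "red"))          # 0.0: the match window is 0 and no index agrees
print(jaro_winkler("ed", "red"))  # 0.0: no common prefix, so no bonus either
```

Because the window for such short strings is 0, a one-character shift such as "ed" vs "red" is enough to drive the score to 0 under this definition.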
When I compare the two strings "abcdefghij" and "aaaaaaaaa" with jaroWinkler, my output is around 0.4023.....
When I check the same pair on https://asecuritysite.com/forensics/simstring, it gives me 0.46. Kindly help in this regard.
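For reference, the arithmetic for this pair can be worked by hand. The sketch below assumes the textbook formulas with p = 0.1; the variable names are illustrative only. Only the leading "a" matches, so m = 1, t = 0 and l = 1.

```python
from fractions import Fraction

# "abcdefghij" has length 10, "aaaaaaaaa" has length 9.
# Only the first "a" of each string matches: m = 1, t = 0, prefix length l = 1.
len1, len2, m, t, l = 10, 9, 1, 0, 1
p = Fraction(1, 10)  # assumed standard scaling factor

sim_j = (Fraction(m, len1) + Fraction(m, len2) + Fraction(m - t, m)) / 3
sim_jw = sim_j + l * p * (1 - sim_j)

print(float(sim_j))   # plain Jaro, about 0.4037
print(float(sim_jw))  # Jaro-Winkler with the prefix bonus, about 0.4633
```

The website's 0.46 agrees with the textbook Jaro-Winkler value. A result near 0.40 is close to the plain Jaro score, which is what an implementation would return if it applies the prefix bonus only above a boost threshold (0.7 is a common choice). Whether that is what happens in this library is an assumption, not something confirmed in this thread.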
I am trying to use Jaro-Winkler similarity to check color strings coming from a user-submitted form against a palette of fixed colors.
Using Jaro-Winkler similarity, I get these kinds of results for very short strings:
s1 = "ed", s2 = "red" -> similarity = 0
s1 = "nude", s2 = "red" -> similarity = 0.5833333134651184
Is it correct to get similarity = 0 in the first case?