Open Satrofu opened 3 years ago
The documentation is misleading when it claims that having a similarity of 1
indicates identical strings. In fact, the Dice coefficient of your two strings is 1
since they contain exactly the same bigrams, just in a different order. An even shorter example is that the Dice coefficient of the strings aba
and bab
is 1
, since in each case the two bigrams in the string are ab
and ba
.
Hello everyone, I hope all is doing good. I found a case in which there is a difference (search for <_15>), the difference is that the first string has <_15>FLL while the second one has <_15>ORD, yet the function is returning 1 as if it were a perfect match. The version used for this comparison was 4.0.1. Below you can see an example ready to be ran in node.js (system version 14.4.0):
const similarity = require("string-similarity");
const body1 = '<REQ><_0>MSG</_0><_1/><_2>55</_2><_3>ORG</_3><_4>F1</_4><_5>MIA</_5><_6>07560685</_6><_7>AC30</_7><_8>HFD</_8><_9>F1</_9><_10>T</_10><_11>US</_11><_12>USD</_12><_13>ZE</_13><_14>ODI</_14><_15>FLL</_15><_16>ORD</_16><_17>UNT</_17><_18>5</_18><_19>1</_19><_20>UNZ</_20><_21>1</_21><_22>000000</_22><_23/></REQ>';
const body2 = '<REQ><_0>MSG</_0><_1/><_2>55</_2><_3>ORG</_3><_4>F1</_4><_5>MIA</_5><_6>07560685</_6><_7>AC30</_7><_8>HFD</_8><_9>F1</_9><_10>T</_10><_11>US</_11><_12>USD</_12><_13>ZE</_13><_14>ODI</_14><_15>ORD</_15><_16>FLL</_16><_17>UNT</_17><_18>5</_18><_19>1</_19><_20>UNZ</_20><_21>1</_21><_22>000000</_22><_23/></REQ>';
console.log(similarity.compareTwoStrings(body1, body2));
Thanks!