Closed by ferreiro 2 years ago
The blog post announcing Sift4 has a live demo where you can test the distance between two strings: https://siderite.dev/blog/super-fast-and-accurate-string-distance.html/ I could not reproduce there the result you showed in the video.
I would be glad to work with you on this, because I don't see any reason why yaho would be closer to zoho than to yahoo, especially in an implementation of Levenshtein, which is not fuzzy at all :)
Also, even if you use Sift3, please update the URL to the blog in the comment of the source. It is now https://siderite.dev/blog/super-fast-and-accurate-string-distance-sift3.html/
Let me know if I can help in any way. Thanks!
On the demo provided in the blog post, I couldn't get `yaho` closer to `zoho` than to `yahoo` at all.
Algorithm | yahoo | zoho |
---|---|---|
Levenshtein | 1 | 2 |
Sift3 | 0.5 | 2 |
Sift4 | 1 | 2 |
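The Levenshtein row can be checked independently. The snippet below is a minimal sketch of the textbook edit distance (insert, delete, and substitute each cost 1). The function name `levenshtein` is just an illustrative choice, and this is not the code used by mailcheck.js or by the `js-levenshtein` package.

```javascript
// Textbook Levenshtein distance using two rolling rows.
// Illustrative only; not taken from mailcheck.js or js-levenshtein.
function levenshtein(a, b) {
  // prev[j] = distance between the first i-1 chars of a and the first j chars of b
  let prev = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr = [i]; // turning i chars of a into "" takes i deletions
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(
        prev[j] + 1,        // delete from a
        curr[j - 1] + 1,    // insert into a
        prev[j - 1] + cost  // substitute or match
      );
    }
    prev = curr;
  }
  return prev[b.length];
}

console.log(levenshtein("yaho", "yahoo")); // 1
console.log(levenshtein("yaho", "zoho"));  // 2
```

With a plain edit distance, `yaho` is one insertion away from `yahoo` and two substitutions away from `zoho`, which matches the table.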
Which version do you use on ZooTools, @ferreiro?
Ironic, right?
Description of change
With the default domain threshold configured in `mailbox.js`, neither `js-levenshtein` nor `sift4` let the fuzzy-matching algorithm show the right suggestions. This PR solves that by using Sift3, which, out of the box with 2, 2, 2 for the thresholds, is able to detect the intended string. It also removes an external dependency, making this package 0-dependencies :)
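To make the role of the threshold concrete, here is a rough sketch of threshold-based suggestion: pick the known domain with the smallest distance and accept it only if that distance is within the threshold. The helper `suggestDomain`, the stand-in `editDistance`, and the sample domain list are all hypothetical names for illustration; the distance function is passed in, so the same threshold can be tried against different algorithms. This is not mailcheck.js's actual implementation.

```javascript
// Hypothetical helper, not part of mailcheck.js: returns the closest known
// domain whose distance is within `threshold`, or null if none qualifies.
function suggestDomain(input, domains, distanceFn, threshold) {
  let best = null;
  let bestDist = Infinity;
  for (const domain of domains) {
    const dist = distanceFn(input, domain);
    if (dist < bestDist) {
      best = domain;
      bestDist = dist;
    }
  }
  return bestDist <= threshold ? best : null;
}

// Stand-in distance for the demo: plain edit distance (an assumption;
// any function returning a number for two strings can be swapped in).
function editDistance(a, b) {
  let prev = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost);
    }
    prev = curr;
  }
  return prev[b.length];
}

const knownDomains = ["zoho.com", "yahoo.com"]; // sample data for the sketch
console.log(suggestDomain("yaho.com", knownDomains, editDistance, 2)); // "yahoo.com"
```

Because the threshold is compared against raw distances, swapping in an algorithm that scores the same pair differently can change which suggestion, if any, is returned.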
Problem:
Showing an example of why `js-levenshtein` and `sift4` created a regression on mailcheck.js

As explained above, replacing sift3 with sift4 will break the 90k users that are already using mailcheck.js, because those algorithms compute different distances than sift3. In the video I explain how and why.
https://www.loom.com/share/aeb067878f8c43aabbbf904402a53689
Pull-Request Checklist
- `main` branch
- `npm run lint` passes with this change
- `npm run test` passes with this change
- Fixes #0000