EthanRutherford / fast-fuzzy

Fast fuzzy search utility
ISC License
376 stars 8 forks

Score is always 1 with useSellers: true as long as candidate starts with or ends with term #26

Closed giladbarnea closed 1 year ago

giladbarnea commented 1 year ago

Hi!

In other words, the score is dictated entirely by the candidate's prefix or suffix, and the rest of the candidate has no effect on it.

This means that both of the following candidates have the same score -- 1:

ff.search('fast', ['fast__אבג%-@-', 'fastfz'], {returnMatchData:true, useSellers: true})
[
  {
    item: 'fastfz',
    original: 'fastfz',
    key: 'fastfz',
    score: 1,
    match: { index: 0, length: 4 }
  },
  {
    item: 'fast__אבג%-@-',
    original: 'fast__אבג%-@-',
    key: 'fastאבג',
    score: 1,
    match: { index: 0, length: 6 }
  }
]

If this is by design, then my bad, but the intuition goes like:

  1. these 2 candidates are very different from each other
  2. the first one is definitely not as similar to 'fast' as the second one
  3. a score of 1 implies a 100% perfect match, and I'd say neither candidate "qualifies" for that

WDYT?

Thanks!

EthanRutherford commented 1 year ago

Yeah, this is intended behavior for the useSellers option. The Sellers version of the Levenshtein distance finds the best-matching substring, so if the whole search term is found somewhere in the string, it reports a perfect match. It's a bit confusing for sure that useSellers is true by default, but it's a leftover from the original versions of the library. If I were to rewrite the library from scratch, it would definitely be opt-in instead of opt-out. Something for a v2. You can set useSellers to false to get the behavior you're expecting.
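
To illustrate the difference described above, here is a minimal, self-contained sketch of the two distance measures. It is not fast-fuzzy's actual implementation: the function names `levenshtein` and `sellers` are made up for this example, and it only computes raw edit distances, not the library's normalized score. The only change between the two is that the Sellers variant lets the match start and end anywhere in the candidate.

```javascript
// Illustrative sketch only; NOT fast-fuzzy's internal code.
// `levenshtein` and `sellers` are hypothetical helper names.

// Fill one row of the edit-distance table from the previous row.
function nextRow(prev, termChar, candidate, firstCell) {
  const curr = [firstCell];
  for (let j = 1; j <= candidate.length; j++) {
    const cost = termChar === candidate[j - 1] ? 0 : 1;
    curr[j] = Math.min(
      prev[j] + 1,        // term character has no counterpart in the candidate
      curr[j - 1] + 1,    // candidate character has no counterpart in the term
      prev[j - 1] + cost  // match or substitution
    );
  }
  return curr;
}

// Classic Levenshtein: the whole term is compared to the whole candidate.
function levenshtein(term, candidate) {
  // First row: matching an empty term against j candidate characters costs j.
  let row = Array.from({ length: candidate.length + 1 }, (_, j) => j);
  for (let i = 1; i <= term.length; i++) {
    row = nextRow(row, term[i - 1], candidate, i);
  }
  return row[candidate.length];
}

// Sellers variant: the term is compared to the best-matching substring.
function sellers(term, candidate) {
  // First row is all zeros: a match may start at any position for free.
  let row = new Array(candidate.length + 1).fill(0);
  for (let i = 1; i <= term.length; i++) {
    row = nextRow(row, term[i - 1], candidate, i);
  }
  // A match may also end at any position, so take the minimum of the last row.
  return Math.min(...row);
}

console.log(levenshtein('fast', 'fastfz')); // 2
console.log(sellers('fast', 'fastfz'));     // 0
```

With the Sellers variant, any candidate that contains 'fast' verbatim gets distance 0, which is why both candidates in the report come back with a score of 1. With the classic distance, the extra characters count against the candidate.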

giladbarnea commented 1 year ago

Alright, thanks!