krisk / Fuse

Lightweight fuzzy-search, in JavaScript
https://fusejs.io/
Apache License 2.0

fix(logical): fix scoring for logical OR #593

Closed: adlerfaulkner closed this 2 years ago

adlerfaulkner commented 2 years ago

Change match selection in logical OR to select the best-scoring match (lowest score value) instead of the first match. Currently, a logical OR that fuzzy-searches across multiple fields may score a record incorrectly because it returns the score of the first matching field, not the best one.

adlerfaulkner commented 2 years ago

Hello! Loving Fuse's ease of use and maintainability, thanks @krisk!

I noticed this scoring problem when using @0xdevalias's solution from #235 for searching for multiple tokens across multiple keys.

When fuzzy searching for the query wood using a logical OR across title and author.lastName...

{
  $or: [
    { title: 'wood' },
    { 'author.lastName': 'wood' }
  ]
}

...in the data below, one would expect all three results to be scored equally, because they all have Woodhouse in author.lastName. However, the current scoring for logical OR uses only the score of the FIRST fuzzy match found among the statements in the OR query, not the BEST match. As a result, the second item below gets a worse score than the others: its title, The Code of the Wooster, is the first fuzzy match found, because title comes before author.lastName in the OR.

  {
    "title": "Right Ho Jeeves",
    "author": {
      "firstName": "P.D",
      "lastName": "Woodhouse"
    }
  },
  {
    "title": "The Code of the Wooster",
    "author": {
      "firstName": "P.D",
      "lastName": "Woodhouse"
    }
  },
  {
    "title": "Thank You Jeeves",
    "author": {
      "firstName": "P.D",
      "lastName": "Woodhouse"
    }
  }

I have fixed this by choosing the match with the best score from the statements of the logical OR.
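To make the difference concrete, here is a minimal, self-contained sketch of the two selection strategies. It is not the code from this PR: the function names pickFirstMatch and pickBestMatch and the score values are made up for illustration, and the only thing taken from Fuse is the convention that a lower score is a better match.

```javascript
// Hypothetical per-statement matches for "The Code of the Wooster".
// Scores are assumed values; 0 is a perfect match, 1 is a complete mismatch.
const branchMatches = [
  { key: 'title', value: 'The Code of the Wooster', score: 0.5 },
  { key: 'author.lastName', value: 'Woodhouse', score: 0.1 }
]

// Behaviour described above: use whichever statement matched first.
function pickFirstMatch(matches) {
  return matches.length > 0 ? matches[0] : null
}

// Proposed behaviour: use the match with the lowest (best) score.
function pickBestMatch(matches) {
  return matches.reduce(
    (best, match) => (best === null || match.score < best.score ? match : best),
    null
  )
}

console.log(pickFirstMatch(branchMatches).key) // title
console.log(pickBestMatch(branchMatches).key) // author.lastName
```

With best-match selection, the record is scored by its author.lastName match, the same field that scores the other two records.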

Edit: Until @krisk reviews this, my fork with the fix on master can be found here: https://github.com/comake/Fuse

adlerfaulkner commented 2 years ago

@krisk Updated so that all matches in a logical OR are incorporated into the score: https://github.com/krisk/Fuse/pull/593/commits/1e420fd37e19c74e217945b5c74334fa3b87127d
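For readers wondering what "incorporating all matches" could look like, here is a rough sketch. The combination rule (multiplying the per-match scores) is an assumption chosen for illustration, not a description of the linked commit, and combineScores and the score values are hypothetical.

```javascript
// Hypothetical scores for the two OR statements on one record (0 = perfect match).
const orMatches = [
  { key: 'title', score: 0.5 },
  { key: 'author.lastName', score: 0.1 }
]

// Assumed rule: multiply the scores of every statement that matched.
// Because each score lies in [0, 1], every extra match can only keep the
// total the same or lower it, that is, make the result rank better.
function combineScores(matches) {
  return matches.reduce((total, { score }) => total * score, 1)
}

const combined = combineScores(orMatches)
const bestSingle = Math.min(...orMatches.map((m) => m.score))
console.log(combined <= bestSingle) // true
```

Under this rule, a record that matches on several statements never scores worse than it would on its best single match.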