krisk / Fuse

Lightweight fuzzy-search, in JavaScript
https://fusejs.io/
Apache License 2.0
18.16k stars 767 forks source link

False positive matches when pattern.length > 32 #136

Closed mgalgs closed 4 years ago

mgalgs commented 7 years ago

I've noticed that when the length of my search pattern is > 32 fuse matches everything in my dataset. Interestingly, if I "include matches" there aren't actually any indices in the resulting matches array.

Here's a live demo: http://codepen.io/mgalgs/pen/jyROXg

Also pasted here:

document.addEventListener("DOMContentLoaded", function(event) {
  var list = [{
    text: "pizza"
  }, {
    text: "feast"
  }];
  var options = {
    include: ["score", "matches"],
    shouldSort: true,
    threshold: 0.5,
    location: 0,
    distance: 0,
    maxPatternLength: 50,
    minMatchCharLength: 4,
    keys: [
      "text"
    ]
  };
  var body = document.getElementsByTagName('body')[0];
  body.innerHTML = '<h3>Fuse <b>always</b> matches when pattern length > 32</h3>';

  var fuse = new Fuse(list, options); // "list" is the item array
  var patterns = [];
  var i;
  for (i = 0; i < 40; ++i) {
    patterns.push('w'.repeat(i));
  }
  patterns = patterns.concat(["i like pie", "123456789112345678921234567890123", "pizza",
    "feast", "stuff is good", "pizzza", "bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb"
  ]);
  for (i = 0; i < patterns.length; ++i) {
    var pattern = patterns[i];
    var matched = fuse.search(pattern).length > 0;
    body.innerHTML += 'Pattern (len=' + pattern.length + ') ' + pattern + (matched ? ' matched' : " didn't match") + '<br>';
  }
});

Results in:

Fuse always matches when pattern length > 32

Pattern (len=0) didn't match
Pattern (len=1) w didn't match
Pattern (len=2) ww didn't match
Pattern (len=3) www didn't match
Pattern (len=4) wwww didn't match
Pattern (len=5) wwwww didn't match
Pattern (len=6) wwwwww didn't match
Pattern (len=7) wwwwwww didn't match
Pattern (len=8) wwwwwwww didn't match
Pattern (len=9) wwwwwwwww didn't match
Pattern (len=10) wwwwwwwwww didn't match
Pattern (len=11) wwwwwwwwwww didn't match
Pattern (len=12) wwwwwwwwwwww didn't match
Pattern (len=13) wwwwwwwwwwwww didn't match
Pattern (len=14) wwwwwwwwwwwwww didn't match
Pattern (len=15) wwwwwwwwwwwwwww didn't match
Pattern (len=16) wwwwwwwwwwwwwwww didn't match
Pattern (len=17) wwwwwwwwwwwwwwwww didn't match
Pattern (len=18) wwwwwwwwwwwwwwwwww didn't match
Pattern (len=19) wwwwwwwwwwwwwwwwwww didn't match
Pattern (len=20) wwwwwwwwwwwwwwwwwwww didn't match
Pattern (len=21) wwwwwwwwwwwwwwwwwwwww didn't match
Pattern (len=22) wwwwwwwwwwwwwwwwwwwwww didn't match
Pattern (len=23) wwwwwwwwwwwwwwwwwwwwwww didn't match
Pattern (len=24) wwwwwwwwwwwwwwwwwwwwwwww didn't match
Pattern (len=25) wwwwwwwwwwwwwwwwwwwwwwwww didn't match
Pattern (len=26) wwwwwwwwwwwwwwwwwwwwwwwwww didn't match
Pattern (len=27) wwwwwwwwwwwwwwwwwwwwwwwwwww didn't match
Pattern (len=28) wwwwwwwwwwwwwwwwwwwwwwwwwwww didn't match
Pattern (len=29) wwwwwwwwwwwwwwwwwwwwwwwwwwwww didn't match
Pattern (len=30) wwwwwwwwwwwwwwwwwwwwwwwwwwwwww didn't match
Pattern (len=31) wwwwwwwwwwwwwwwwwwwwwwwwwwwwwww didn't match
Pattern (len=32) wwwwwwwwwwwwwwwwwwwwwwwwwwwwwwww didn't match
Pattern (len=33) wwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwww matched
Pattern (len=34) wwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwww matched
Pattern (len=35) wwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwww matched
Pattern (len=36) wwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwww matched
Pattern (len=37) wwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwww matched
Pattern (len=38) wwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwww matched
Pattern (len=39) wwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwww matched
Pattern (len=10) i like pie didn't match
Pattern (len=33) 123456789112345678921234567890123 matched
Pattern (len=5) pizza matched
Pattern (len=5) feast matched
Pattern (len=13) stuff is good didn't match
Pattern (len=6) pizzza matched
Pattern (len=33) bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb matched
cawo4ok commented 6 years ago

@krisk do you have some update regarding that issue? I have the same problem.

acmadden commented 6 years ago

Not sure how you would want to handle this but the issue lies in bitap_search.js in the following line const mask = 1 << (patternLen - 1)

Since JS converts values to int32 for all bitwise operations patternLen > 30 overflows. I'll be looking more into this later on.

var-che commented 6 years ago

I have this same issue here. That much time has passed and the bug is not fixed, or am I missing something?

ErikLarsson82 commented 6 years ago

I'm also waiting on an update from @acmadden

rkurbatov commented 5 years ago

Any update here?

acmadden commented 5 years ago

Unfortunately I never got around to fixing the issue nor do I have the time now. I have laid out what exactly is the problem above, so if you want to submit a PR for a fix I’m sure that would be appreciated.

ErikLarsson82 commented 4 years ago

Hi, i solved this bug so please take a look at the solution. https://github.com/krisk/Fuse/pull/333 Still using Fuse @mgalgs ? ;)

mgalgs commented 4 years ago

Still using Fuse @mgalgs ? ;)

@ErikLarsson82 as a matter of fact I am :smile: Your fix looks promising!

github-actions[bot] commented 4 years ago

This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days

jeffalo commented 4 years ago

This still happens. You can see it happening with 33, 65, 129, 257 letter inputs here http://is.wasteof.money/