krisk / Fuse

Lightweight fuzzy-search, in JavaScript
https://fusejs.io/
Apache License 2.0

Add different builds: Full and Basic #385

Closed krisk closed 4 years ago

krisk commented 4 years ago

To save on size, it would be great to have different builds

krisk commented 4 years ago

Added in fuse.js@5.2.0-alpha.0.

Full explanation of build files.

cc: @cshoredaniel, @sidvishnoi, @ndelangen - let me know your thoughts. I figured this would be a good option to have.

Rationale:

| File | Size | Savings |
| --- | --- | --- |
| fuse.js | 52.15kb | - |
| fuse.basic.js | 31.64kb | 40% |
| fuse.min.js | 18.01kb (gzipped: 5.80kb) | - |
| fuse.basic.min.js | 10.37kb (gzipped: 4.00kb) | 42% (gzipped: 31%) |
| fuse.common.js | 49.05kb | - |
| fuse.basic.common.js | 29.71kb | 39% |
| fuse.esm.js | 33.25kb | - |
| fuse.basic.esm.js | 22.49kb | 32% |
| fuse.esm.min.js | 12.25kb (gzipped: 4.01kb) | - |
| fuse.basic.esm.min.js | 7.86kb (gzipped: 3.09kb) | 36% (gzipped: 23%) |
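
For anyone checking the savings figures, here is a minimal sketch of the arithmetic, assuming each percentage is the reduction of the basic build relative to the matching full build, rounded to the nearest whole number. The helper name `savingsPercent` is hypothetical and not part of Fuse.

```javascript
// Hypothetical helper: percentage saved by the basic build relative to the
// full build, rounded to the nearest integer. Both sizes must use the same unit.
function savingsPercent(fullSize, basicSize) {
  return Math.round((1 - basicSize / fullSize) * 100);
}

// Sizes in kb, taken from the fuse.min.js and fuse.basic.min.js figures above.
console.log(savingsPercent(18.01, 10.37)); // 42 (minified)
console.log(savingsPercent(5.80, 4.00));   // 31 (gzipped)
```

Under this assumption the fuse.js / fuse.basic.js pair works out to roughly 39%, slightly below the 40% quoted, so that figure may have been rounded differently.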

danielfdickinson commented 4 years ago

I like the idea. In the future, I wonder if it'd be possible to make the Byteap hack I pointed you at sufficiently performant (I haven't looked deeply, so it's just a thought for now) to handle all lengths of string that Fuse wishes to support. It might not be possible from an algorithmic point of view, but I'd rather like the ability to have fuzzy short and long strings without extended search included.
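
As background on why string length matters here, the following is a minimal sketch of exact-match bitap (shift-and), not Fuse's actual implementation. It assumes the matcher state is held in a single 32-bit integer, which is the width JavaScript's bitwise operators work on, so the pattern is capped at 32 characters. The function name `bitapExact` is hypothetical.

```javascript
// Hypothetical sketch of exact-match bitap (shift-and).
// Returns the index of the first match of `pattern` in `text`, or -1.
function bitapExact(text, pattern) {
  const m = pattern.length;
  if (m === 0) return 0;
  // JavaScript bitwise operators act on 32-bit integers, so one bit per
  // pattern character limits the pattern to 32 characters in this sketch.
  if (m > 32) throw new RangeError("pattern longer than 32 characters");

  // For each character, set bit i if the character occurs at pattern[i].
  const masks = new Map();
  for (let i = 0; i < m; i++) {
    const ch = pattern[i];
    masks.set(ch, (masks.get(ch) || 0) | (1 << i));
  }

  const accept = 1 << (m - 1); // bit that signals a full-pattern match
  let state = 0;               // bit i set: pattern[0..i] matches, ending here
  for (let j = 0; j < text.length; j++) {
    state = ((state << 1) | 1) & (masks.get(text[j]) || 0);
    if (state & accept) return j - m + 1;
  }
  return -1;
}

console.log(bitapExact("hello world", "world")); // 6
```

Supporting longer patterns means spreading the state across more than one machine word or switching to a different representation, which is where the performance trade-offs discussed in this thread come in.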

krisk commented 4 years ago

@cshoredaniel indeed, I tested the Byteap solution you provided (which, btw - I forgot to thank you for - so, thank you!) and I'm still trying to get adequate performance out of it. The 2D array computation is the bottleneck.

Curious to know your thoughts: For very long patterns, it would seem that there would be a far smaller error-to-pattern-length ratio than for smaller patterns. That is, for longer patterns, there is "more to work with that is spelt accurately". Because of this, I went the ngram route, and it does seem to provide pretty good results.

> ...but I'd rather like the ability to have fuzzy short and long strings without extended search included.

Agreed. Goal is to get byteap to be close to bitap on performance.
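
To illustrate the n-gram route mentioned above, here is a minimal sketch of one common way to score long strings: split each string into character trigrams and compare the two sets with the Jaccard index. This is an assumption about the general technique, not a copy of Fuse's code, and the names `ngrams` and `jaccard` are hypothetical.

```javascript
// Hypothetical sketch: collect the set of character n-grams of a string.
function ngrams(text, n = 3) {
  const grams = new Set();
  for (let i = 0; i + n <= text.length; i++) {
    grams.add(text.slice(i, i + n));
  }
  return grams;
}

// Jaccard index of two sets: |A ∩ B| / |A ∪ B|, in the range [0, 1].
function jaccard(a, b) {
  if (a.size === 0 && b.size === 0) return 1; // two empty sets count as identical
  let shared = 0;
  for (const gram of a) {
    if (b.has(gram)) shared++;
  }
  return shared / (a.size + b.size - shared);
}

// One wrong character in a 5-character string changes one of its 3 trigrams.
console.log(jaccard(ngrams("abcde"), ngrams("abcdf"))); // 0.5
```

The longer the pattern, the smaller the share of n-grams that a single typo disturbs, which fits the observation that long patterns leave "more to work with that is spelt accurately".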

danielfdickinson commented 4 years ago

Hi, sorry for the long delay. For longer patterns against the same text, I think your observation holds. I suspected the 2D array would be an issue, but hopefully not an insurmountable one. I see you've been keeping busy and I have some reading to do.