krisk / Fuse

Lightweight fuzzy-search, in JavaScript
https://fusejs.io/
Apache License 2.0
18.16k stars 767 forks source link

ability to limit the number of results #56

Closed derhuerst closed 8 years ago

derhuerst commented 9 years ago

I'd like to implement an autocompletion service for public transport stations. Since no user is going to read through more than 5 autocompletions, I want to limit them. The benefit would be a faster search.

AdamGaskins commented 9 years ago

Can you just break after iterating through the display loop 5 times?

derhuerst commented 9 years ago

My problem is that many more results get computed. My problem is not about displaying the results.

AdamGaskins commented 9 years ago

My understanding of the Bitap algorithm is that it has to process every item to calculate a score for it. The sorting can only happen after this.

For example if you search for egg on this list:

["rEfriGeratorG", "salad", "EGGplant"]

And then cancel the search after it reaches the second item, it will return refrigeratorG as the best match, when if you let it finish, eggplant obviously would have been better.

You could still add a feature that limits the amount of results it spits out, but that's just the same as doing:

var results = fuse.search("stuff");
for(var i = 0; i < results.length; i++) {
    // do stuff to display result

    if(i > 4) break;
}
derhuerst commented 9 years ago

No, you could make the sorting itself more efficient. What I wrote today (for a different purpose) can be used for this as well: hifo.

jeancroy commented 9 years ago

you can do a top-k sorting but I believe most of the time is spent computing score. As AtmoEntity pointed out there's no way out of scoring everything to guarantee showing the best result.

krisk commented 9 years ago

@derhuerst, I am not following. As it has been mentioned, the nature of Bitap is such that you'd have to pre-compute the score of the items against the search term before any sorting can take place. Ergo, I cannot see how Fuse would yield the top n items out of m without having calculated m number of scores.