javve / list.js

The perfect library for adding search, sort, filters and flexibility to tables, lists and various HTML elements. Built to be invisible and work on existing HTML.
https://listjs.com
MIT License
11.18k stars 897 forks source link

Feature Request: Fuzzy Search (DONE) #57

Closed LuukvE closed 12 years ago

LuukvE commented 12 years ago

Dotdreaming edited the List wiki asking for Fuzzy Search to be added. Fuzzy search is a feature which allows users to make mistakes when typing their search query. The application will still find close matches. You can tweak the threshold by adding fuzzyThreshold to your list options.

I used code provided by the Diff Match Patch project at http://code.google.com/p/google-diff-match-patch/

You can see a live demo of this commit at bijtel.com/Sociale-Kaart. However, that app sorts on Organisation name & keywords. Keywords are only viewable when you click on a record. (don't get confused ^^)

javve commented 12 years ago

Wow, this looks really good! I'll take a deeper look into the code and then get back to you @LuukvE !

LuukvE commented 12 years ago

One problem with fuzzySearch, at this moment, is that it does not sort the list on relevance. Which is quite important, especially when many records are found.

Another option is to only use fuzzysearch when no other records are found. A "notFound" event could also be added to the list object, in order to make developers able to notify the app users.

javve commented 12 years ago

Would it be possible to make the fuzzy search to a plugin?

And one more extra feature that would be awesome is that it could be possible to match a search over many columns at once? (I guess its just to concatenate the values and search). The reason for this request is the case where you have items looking like this

var item = {
   firstName: "Jonny",
   lastName: "Stromberg"
};

And then searches for "Jonny St" eg.

LuukvE commented 12 years ago

I added a commit which (optionally) splits the searchstring into arguments and iterates over all selected collumns. But this approach is I think not performant enough. I added a debounce to the search method to improve performance.

I agree, it is best to move all of this to a plugin, the plugin will probably just overwrite the default search method and perhaps do some pre-processing on all values in the list. A more advanced feature would be to sort on relevance.

joeybaker commented 12 years ago

Thanks @LuukvE – this is great!

I hadn't heard of the google diff-match tool before so I did some quick research: It uses the bitap algorithm which implements Levenshtein distance. I'm not a expert on this field, but from having played with some of these before, seems to me that this should be both speedy and pretty accurate.

Thanks!

LuukvE commented 12 years ago

Yes, my search actually went the other way around, first finding the Levenshtein distance and then the Bitap algorithm, and finally the Diff-match-patch project by Neil Fraser.

I ran some tests and the fuzzySearch method seems to be quicker than a regular expression search (using Chrome). Besides that we can use the Bitap score to sort the list based on relevance, which will make searching tremendously better. At this moment, especially with large (1000+) recordsets and short search strings the algoritm gets false-positives. Those should be lower in the list. Just like going to page 1000 of any Google search will only return nonsense.

joeybaker commented 12 years ago

sweet. I'm using list.js with datasets that can potentially load 1000s of records and hadn't gotten around to fixing the search problem yet. You've saved me a ton of time – thanks!

I think I'd vote with @javve – this might be better as a plugin. I can see a good use case for not wanting search to be sorted by relevancy and/or the extra power isn't needed.