phyllisstein / alp

A Python module for Alfred v2 workflows

fuzzy matching #8

Closed jlegewie closed 11 years ago

jlegewie commented 11 years ago

Hi,

so here is my shot at fuzzy search. I hope I don't embarrass myself... :)

The idea is similar to what you know from the quick panel in ST. Here is an example: The queries nor20, n2013, NordAmid, and norld3frs all match the string Nordland 2013 - Amid Fears of Releases. There are three criteria for a match:

Matches are ranked based on two criteria:

The important function is fuzzy_search, which takes two required and three optional arguments.

fuzzy_search returns a ranked list of the elements that match the query.

key has to be specified if elements is not a list of strings, and key(elements[i]) has to return a string for every element in elements.

I am sure this can be optimized performance-wise, but it's pretty fast in my tests with a list of over 2000 elements and seq=3. There is also a small example at the bottom of the code (commented out). By the way, you can also use this to directly filter a list of feedback dictionaries with key = lambda x: '%s - %s' % (x['title'], x['subtitle']) (in this case the search would be based on a string 'title - subtitle').
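For readers who don't have the submitted code at hand, here is a minimal sketch of the general idea, not the actual fuzzy_search implementation. It assumes the simplest possible rule (the query characters must appear in the string in order, ignoring case) and leaves out ranking and the seq parameter entirely. The names is_subsequence and simple_fuzzy_filter are hypothetical and exist only for this illustration.

```python
def is_subsequence(query, text):
    """Return True if the characters of query occur in text in order (case-insensitive)."""
    chars = iter(text.lower())
    # `ch in chars` advances the iterator, so each character must be found
    # after the position where the previous one was matched.
    return all(ch in chars for ch in query.lower())


def simple_fuzzy_filter(query, elements, key=None):
    """Hypothetical stand-in for fuzzy_search: filters only, does not rank."""
    if key is None:
        key = lambda x: x  # elements are assumed to be plain strings
    return [el for el in elements if is_subsequence(query, key(el))]


# The four example queries from the description all pass this simple test.
title = 'Nordland 2013 - Amid Fears of Releases'
queries = ('nor20', 'n2013', 'NordAmid', 'norld3frs')
matches = [q for q in queries if is_subsequence(q, title)]

# Filtering feedback-style dictionaries on a combined 'title - subtitle' string.
feedback = [
    {'title': 'Nordland 2013', 'subtitle': 'Amid Fears of Releases'},
    {'title': 'Something else', 'subtitle': 'Unrelated entry'},
]
hits = simple_fuzzy_filter(
    'n2013', feedback,
    key=lambda x: '%s - %s' % (x['title'], x['subtitle']))
```

The key callable is what lets the same filter work on strings and on dictionaries: the matcher only ever sees the string that key returns.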

Another thing is that you have to feed back a random uid to preserve the ranking. I think it would be great to add an option random to alp.feedback, which assigns a random uid, and to set this option to False by default. Here are details about this.
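A rough sketch of what assigning random uids could look like, using the standard uuid module. It assumes the results are plain dictionaries with a 'uid' key; it does not use or reproduce the alp.feedback API, and with_random_uids is a hypothetical helper name.

```python
import uuid


def with_random_uids(items):
    """Return copies of the result dictionaries, each with a fresh random 'uid'."""
    # dict(item, uid=...) copies the dictionary, so the inputs are left untouched.
    return [dict(item, uid=uuid.uuid4().hex) for item in items]


ranked = [{'title': 'Best match'}, {'title': 'Second best'}]
tagged = with_random_uids(ranked)
```

Because every call produces new uids, no item keeps an identity between runs, which is the point of the suggestion: the order of the list is the only ordering that is passed along.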

Let me know if you have questions!

phyllisstein commented 11 years ago

As far as I can tell, this looks terrific. Though some of it is Greek to me, I'll take it on your word. Thank you so much; this is a fantastic contribution, and something that, after my failure, I thought we'd be stuck without. I'll update the docs as well to reflect your changes.