Osmose / advanced-open-file

Open files and folders in Atom easily.
https://atom.io/packages/advanced-open-file
Other
118 stars 20 forks source link

Leverage fuzzaldrin like fuzzy-finder #57

Closed artagnon closed 8 years ago

artagnon commented 8 years ago

Often, I'm in the situation where the directory listing looks like:

...
0092-name-lowering.txt
0093-global-value-numbering.txt
0094-loop-fission.txt
...

I have to switch to my terminal, and atom *global-value-numbering*. We can leverage fuzzaldrin quite easily to ease the pain.

jeffgran commented 8 years ago

:+1: would love to see an option for turning on fuzzy matching instead of prefix-only autocomplete

Osmose commented 8 years ago

I like this idea and want to add it (and I have a bare-bones commit for it), but I'm not sure yet what we should do about autocomplete.

Since fuzzy matching can easily lead to several paths that don't match at all, what should happen when autocomplete is attempted with fuzzy matching on?

  1. Remain prefix-only and simply beep if a common prefix cannot be found.
  2. Select the top match and add it to the path.
  3. Something else?
artagnon commented 8 years ago

I think (2) is the way to go.

Edit: I'll start testing your commit tomorrow.

jeffgran commented 8 years ago

I think you'll probably find people who'll want one and some who'll want the other, and then option 3 will be to have tab cycle through the possible completions, then hit return to select that one.

That being said, my preference is also (2).

artagnon commented 8 years ago

Okay, so it's pretty awesome except for two details: the one we're discussing (TAB completion) and the need to sort matches using proper weights (which I'm sure fuzzaldrin provides?), not just alphabetically.

This is probably the biggest enhancement we'll get with so little code.

jeffgran commented 8 years ago

I haven't tried this yet, but when I implemented a feature just like this in emacs, I found that simply sorting by length (shortest first) worked the best. That way you can (almost always) type more letters to get the one you want farther down the list to come up to the top. If the shortest one is farther down, there's nothing you can do to bring it up to the top so you have to manually go down and select it.

Osmose commented 8 years ago

Sorting by weight is absolutely possible, and in fact already happens. I just forgot to disable the alphabetical resorting when fuzzy search is enabled.

jesseleite commented 8 years ago

Keep in mind (if you haven't already), the new fuzzy-finder scoring algorithm. It's available in Atom master stream as experimental option, and has a much better scoring algorithm than default scoring algorithm (feels much more like fuzzy finder in Sublime Text does).

alternate fuzzy finder scoring

Not sure how this affects adding fuzzaldrin to advanced-open-file here. Just wanted to note that this opt-in scoring algorithm might become default in a future release.

Osmose commented 8 years ago

I remember hearing about that but did not actually have it in mind, thanks for the note!

Also I'm totally just waiting for the right moment to work on this, I'm not at all constantly failing or forgetting to spend time on this. NOT AT ALL

artagnon commented 8 years ago

Thanks, I just checked: they're using fuzzaldrin-plus in the case of alternate scoring. A simple matter of replacing import fuzzaldrin from 'fuzzaldrin' with import fuzzaldrinPlus from 'fuzzaldrin-plus'.

jesseleite commented 8 years ago

Cool. If that's the case, then if fuzzaldrin-plus becomes the new fuzzaldrin (if the new algorithm becomes default instead of experimental), then you'd need to update that dependency back to fuzzaldrin at that time. No biggie. As long as everyone is aware.

jeancroy commented 8 years ago
  1. Remain prefix-only and simply beep if a common prefix cannot be found.
  2. Select the top match and add it to the path.

Autocomplete-plus first ensure first char match, then delegate to fuzzaldrin for ordering. Fuzzy-finder on the other hand simply use fuzzaldrin.

Osmose commented 8 years ago

This has landed in master! Feedback is welcome, I'll probably try to push a new release sometime soon with the change.

artagnon commented 8 years ago

I found this rough heuristic:

When number-of-matches/number-of-characters-to-match > 1500, fuzzaldrin-plus breaks down and becomes quite unresponsive. It's unlikely that we'll be looking at directories with > 1500 files, but the solution is worth noting: Until that threshold, I do prefix-matching, and only switch to fuzziness below it; it works quite well in practice.

jeancroy commented 8 years ago

@artagnon do you have an example directory listing that trigger this behavior ? There's nothing in fuzzaldrin-plus that grow >linear with number of file (except say the final sort)

There is however a m_n worst case complexity with the number of character (length of query_length of path)

It is however possible that the total proceeding time is too slow for interactive time. It is recommanded to gate the query with a debounce step.

artagnon commented 8 years ago

@jeancroy So you're saying it should be original-number-of-candidates times number-of-characters-to-match? Wait, every time number-of-characters increases by 1, aren't the matches a subset of the previous step? Or you need to run the fuzzy algorithm on the whole candidate-set again?

As for an example, try running megafinder on llvm. It uses the magic 1500 ratio I proposed.

Also, I'm confused about your previous comment on fuzzy-finder. Am I not reading this right?

jeancroy commented 8 years ago

Yeah max complexity is pretty much ( query length ) * (avg path length ) * (number of file) However non-match exit fast. And exact match (indexOf) also exist much faster.

Yes, when you ask the question yes/no belong to the set , the match are a subset of previous match, if only difference is append to the right. I'd be correct for you to feed back result into candidate list. Delete and insertion from middle will break this.

However that question is answered really fast. About a single read on the path length. What take time is affinity for consecutive, start of word and splitting the user query.

Say understand that acon means application_controller and not application_controller

What kind of delay are you experiencing ? Benchmark clock about 300ms on 60K files and a 5 char query, described by their full path, quite deep (node_module).


Also, I'm confused about your previous comment on fuzzy-finder. Am I not reading this right?

Oh fuzzaldrin.match is not ment to be used on a large number of item. It is for displaying the best matches detail (user interface). Fuzzaldrin.filter is intended for sorting and selecting.