Open corbanbrook opened 10 years ago
@corbanbrook AFAIK this issue should be filed against the fuzzaldrin library which is used by fuzzy-finder to filter and score results.
fuzzaldrin simply does scoring and sorting of arrays of strings or objects. Some of my above recommendations might be outside the scope of the project.
One solution would be for fuzzaldrin to add an option for custom filter/sorting callbacks. Another would be to use fuzzaldrin only to provide an initial score, then apply further sorting schemes within this project, such as dot file priority, ignored file priority, and last modified priority.
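A minimal sketch of the second option, where a base fuzzy score is adjusted by secondary criteria. Everything here is an assumption for illustration: `baseScore` is a crude stand-in for whatever number fuzzaldrin would supply, and the `ignored` / `modifiedAt` fields and the multipliers are hypothetical.

```javascript
// Stand-in for a base fuzzy score (hypothetical; a real setup would take this
// number from fuzzaldrin). Returns 0 when `query` is not a subsequence of `path`,
// otherwise a crude score that favors shorter paths.
function baseScore(path, query) {
  const p = path.toLowerCase();
  let pos = 0;
  for (const ch of query.toLowerCase()) {
    pos = p.indexOf(ch, pos);
    if (pos === -1) return 0;
    pos += 1;
  }
  return query.length / path.length;
}

// Secondary adjustments layered on top of the base score.
// The multipliers are illustrative guesses, not tuned values.
function metaScore(entry, query, now = Date.now()) {
  let score = baseScore(entry.path, query);
  if (score === 0) return 0;
  const basename = entry.path.split('/').pop();
  if (basename.startsWith('.')) score *= 0.5; // dot files rank lower
  if (entry.ignored) score *= 0.25;           // ignored files rank lower still
  const ageDays = (now - entry.modifiedAt) / 86400000;
  score *= 1 + 1 / (1 + ageDays);             // recently modified: up to 2x
  return score;
}

function rank(entries, query, now) {
  return entries
    .map((entry) => ({ entry, score: metaScore(entry, query, now) }))
    .filter((r) => r.score > 0)
    .sort((a, b) => b.score - a.score)
    .map((r) => r.entry.path);
}
```

With this shape, the base scorer stays a pure string matcher and all file metadata lives in the outer layer.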
Would also be nice if the fuzzy finder could have the results filtered in terms of importance for type of project. For instance, a Ruby on Rails project, if I start typing a model's name, have the first result usually be '/app/models/model_name.rb', instead of having the first result be 'spec/models/model_name_spec.rb'.
Most times I want to deal with the model, not the spec.
It would also be nice to give recency more weight in the find logic. It mostly suggests the last file accessed, which allows quick switching between files, although this seems intermittent, especially if you switch to another app and back again. It would be good if it always gave precedence to files based on last access, making it easy to work between several files.
A good example of where this works well is Textmate's implementation of cmd-t find file. The sorting there works well.
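One way to picture the recency request: keep a most-recently-used list and use it to reorder whatever the matcher returns. This is only a sketch; the `RecentFiles` class and its method names are made up for illustration and are not part of fuzzy-finder.

```javascript
// Tracks the order in which files were last opened (most recent first).
class RecentFiles {
  constructor() {
    this.paths = [];
  }

  // Call whenever a file is opened or focused.
  touch(path) {
    this.paths = [path, ...this.paths.filter((p) => p !== path)];
  }

  // Reorders already-filtered matches so recently opened files come first,
  // keeping the matcher's original order for files never opened.
  prioritize(matches) {
    const rankOf = (p) => {
      const i = this.paths.indexOf(p);
      return i === -1 ? this.paths.length : i;
    };
    return matches
      .map((path, index) => ({ path, index }))
      .sort((a, b) => (rankOf(a.path) - rankOf(b.path)) || (a.index - b.index))
      .map((m) => m.path);
  }
}
```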
Here's an example where the order isn't great. The second result is what I want, and it's a much closer match, so I don't know why it's second.
Even worse:
I wanted the last result in this instance.
(I hope these examples are useful, apologies if they're noise)
Another one
Coming from ST3, the fuzzy matcher really drives me crazy that it lists the specs before the actual controllers I want.
Is there any config which changes how the fuzzy finder works, or do we need to improve the underlying fuzzy finding library to improve the searching?
+1 for this
I decided to play with Atom for the first time this weekend; I immediately found myself frustrated with the strange fuzzy ordering in Atom's select list views.
If we're going to improve fuzzy matching in Atom, there are lots of things to consider:
fuzzaldrin should continue to respect the current scoring tests; these represent the only codified community judgment we have so far. We'll probably want to augment these tests with examples from real-world projects, too.
[...] fuzzaldrin's filter method, which takes the strange queryHasSlashes parameter and invokes the specialized scorer.basenameScore depending on it. I'd expect any update to filter (a) will need to be parameterizable by the caller (for example, to indicate separators, weights, etc.) and (b) will need sensible defaults so it can be invoked without more than the needle and haystack. As an example, we might want path separators to have importance when invoking filter with a list of file names, but we might want the colon-space (": ") to have importance when invoking filter from the command palette.
[...] matrix but lacks essentially any useful comments.) Command-T also has a well-liked algorithm. Gary Bernhardt's selecta ranking algorithm was based on some interesting discussion that considered this prior art.
In fuzzy-finder's case, there's a lot of metadata we can and probably should use to improve ranking. The venerable PeepOpen ranking algorithm takes into account file modification times, last opened, git status, etc. Probably this more sophisticated ranking belongs strictly in fuzzy-finder, as a new "meta scoring" layer; fuzzaldrin should continue to just be about ranking a needle in a haystack of strings.
[...] smartscore branch tries to codify some basic intuitions about what makes a match "better". These include: touching the "starts of words" counts for more; some separators are worth more than others (in file contexts, '/' is probably worth more than '-' or ' '); on the whole, we should prefer fewer contiguous runs of longer length; full word matches along the way are always preferable; etc.
[...] filter starts fresh. But it seems to me that (a) it may prove desirable to pre-process each string in the haystack before ever invoking filter, and (b) if the user is simply appending characters to the query string, it might (?) be possible to iteratively re-score the results.
Alright, hopefully this is useful/interesting to someone. I plan to slowly work on improvements to both fuzzaldrin and fuzzy-finder in my personal branches. Suggestions and feedback are most welcome!
(For fun, I started by replacing fuzzaldrin's current score method with a CoffeeScript re-implementation of TextMate 2's ranking algorithm; it works and, after a minor tweak, passes all fuzzaldrin tests.)
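To make the intuitions in this comment concrete (word starts count for more, separators carry different weights, contiguous runs are preferred, and the caller can override sensible defaults), here is a toy scorer. It is not the smartscore branch or fuzzaldrin's implementation; `sketchScore`, `sketchFilter`, and every weight are assumptions chosen only to show the shape of a parameterizable filter.

```javascript
// Default weights are illustrative assumptions, not values taken from fuzzaldrin.
const DEFAULTS = {
  separators: { '/': 3, '_': 2, '-': 2, '.': 2, ' ': 2 },
  consecutiveBonus: 2, // extra credit for extending a contiguous run
};

// Greedy left-to-right scorer. Each matched character earns 1 point, plus the
// weight of the separator it directly follows (the start of the string counts
// as the strongest separator), plus a bonus when it directly follows the
// previous match. Returns 0 if the query does not match at all.
function sketchScore(candidate, query, options = {}) {
  const { separators, consecutiveBonus } = { ...DEFAULTS, ...options };
  const text = candidate.toLowerCase();
  const maxSep = Math.max(0, ...Object.values(separators));
  let total = 0;
  let prev = -2;
  let from = 0;
  for (const ch of query.toLowerCase()) {
    const i = text.indexOf(ch, from);
    if (i === -1) return 0;
    total += 1;
    if (i === 0) total += maxSep;
    else if (separators[text[i - 1]] !== undefined) total += separators[text[i - 1]];
    if (i === prev + 1) total += consecutiveBonus;
    prev = i;
    from = i + 1;
  }
  return total;
}

// Needle and haystack are enough; options only override the defaults.
function sketchFilter(candidates, query, options) {
  return candidates
    .map((c) => ({ c, s: sketchScore(c, query, options) }))
    .filter((r) => r.s > 0)
    .sort((a, b) => b.s - a.s)
    .map((r) => r.c);
}

console.log(sketchFilter(['app/controllers/superuser.rb', 'app/models/user.rb'], 'user'));
// → [ 'app/models/user.rb', 'app/controllers/superuser.rb' ]
```

A command-palette caller could pass something like `{ separators: { ' ': 5 } }` so that word starts after a space dominate, while file pickers keep the path-oriented defaults.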
:+1: as the current solution is rather useless - and can even be faster to find the file manually
:+1: Would be great if we could get some progress on this.
Improved sorting / scoring: https://github.com/atom/fuzzaldrin/pull/22
The above pull request addresses at least some of the issues raised here.
Since I'm working on a lot of Rails projects with ActiveAdmin, I'm often annoyed when I end up in an ActiveAdmin file for a particular resource instead of a model file.
I was thinking about improving this by sorting the fuzzy-finder results by usage. I.e. if files in some folder are worked on more often, they are ranked higher.
I'm happy to implement this experimentally and make a pull request if other people approve of this idea also.
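A rough sketch of what usage-based sorting could look like, assuming open events are counted per folder and the counts are used to reorder matcher output. The `FolderUsage` class is hypothetical, and a real package would also need to persist the counts per project.

```javascript
// Counts how often files in each directory have been opened.
class FolderUsage {
  constructor() {
    this.counts = new Map();
  }

  static dirOf(path) {
    const i = path.lastIndexOf('/');
    return i === -1 ? '' : path.slice(0, i);
  }

  recordOpen(path) {
    const dir = FolderUsage.dirOf(path);
    this.counts.set(dir, (this.counts.get(dir) || 0) + 1);
  }

  // Stable re-sort of matcher output: files in more frequently used folders first.
  prioritize(matches) {
    const usage = (p) => this.counts.get(FolderUsage.dirOf(p)) || 0;
    return matches
      .map((path, index) => ({ path, index }))
      .sort((a, b) => (usage(b.path) - usage(a.path)) || (a.index - b.index))
      .map((m) => m.path);
  }
}
```

In the ActiveAdmin case, frequent work in `app/models` would then lift a model file above the admin file of the same name, even for a model that has not been opened before.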
:+1: for some improvements that make finding commonly used files easier. sublime seemed to have done a better job putting the file i actually want to open at the top (using rails here as well)
Just in case further examples are helpful:
@jeancroy Did #22 solve this issue?
There's now a "Use Alternate Scoring" option in fuzzy finder that uses it. It addresses many issues with searching by file name / path.
But it does not cover any knowledge about the files themselves, such as preference for recent / frequent / certain files.
I just tried the Atom Beta with "Use Alternate Scoring" enabled and it's a huge improvement, though still not as good as Sublime Text. I have a project with a huge number of files, including Doxygen generated html files that I rarely want to look at. I tried to find a file named "matchOptimisticB.h". In SublimeText I can type "mob.h" and get the right file as the first suggestion. In Atom Beta it is the ninth choice, preceded by eight html files I have no interest in.
One thing that might help Atom: if the user provides a file type suffix, prefer names that match that suffix exactly over names that use that suffix as a prefix.
Another thing that might help (though I really hope it won't come to this, and it's not needed by Sublime) is to allow the user to disable directory patterns. In my case I might eliminate searches of Doxygen-generated html files and would definitely eliminate .os files (why in the world is it showing binary libraries?).
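The file-suffix suggestion above can be sketched as a re-ranking step: when the query ends in an extension, candidates with exactly that extension come before candidates whose extension merely starts with it. The function names and the example paths in the test data are hypothetical.

```javascript
// Extension of the last path segment, lowercased; '' when there is none.
function extensionOf(name) {
  const base = name.slice(name.lastIndexOf('/') + 1);
  const dot = base.lastIndexOf('.');
  return dot <= 0 ? '' : base.slice(dot + 1).toLowerCase();
}

// 2 = exact extension match, 1 = candidate extension only starts with the
// queried one (e.g. ".h" vs ".html"), 0 = neither, or the query has no extension.
function extensionRank(candidate, query) {
  const wanted = extensionOf(query);
  if (wanted === '') return 0;
  const actual = extensionOf(candidate);
  if (actual === wanted) return 2;
  return actual.startsWith(wanted) ? 1 : 0;
}

// Stable re-sort: exact-extension matches first, matcher order otherwise.
function preferExactExtension(matches, query) {
  return matches
    .map((path, index) => ({ path, index, rank: extensionRank(path, query) }))
    .sort((a, b) => (b.rank - a.rank) || (a.index - b.index))
    .map((m) => m.path);
}
```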
OK, please open an issue on fuzzaldrin-plus and I can take a look. I'd need the results that come before it to understand why they are preferred. The full path is also useful; if it's private, a mock-up with the same length and directory depth will do.
I just submitted this issue. I hope it helps.
https://github.com/jeancroy/fuzzaldrin-plus/issues/12
Thank you very much for trying to improve Atom’s fuzzy search.
— Russell
Does anyone know if there is an equivalent issue open discussing the Cmd-Shift-P search algorithm?
Are you speaking of the command palette? It should already be integrated. If you have a problem, you can try opening an issue on the fuzzaldrin-plus repo.
This is a pretty ancient issue, so I have little hope of improvement arriving any time soon, but here's my two cents:
Two things are wrong with the way the fuzzy finder currently works, both illustrated by the above example.
1) Sort order (as already pointed out in this issue): based on my input, I would really expect the file member/edit/edit.html to show up on top. It does for some reason when I remove the l from html:
But with the full html it suddenly drops to second place, which gets rather infuriating after having opened the wrong file several times.
So scoring should somehow take into account how close the search terms are to each other in the filename, and prioritize edit.html over edit-payment.html unless I include payment in my search query.
2) It should really prioritize full word matches rather than scattered letters. If you look at the above example, it actually matches member because it's in the app/components/admin/member folder, instead of simply matching to the member part of the path, because that's a whole matching word.
These two tweaks would make the search algorithm a lot stronger.
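As a rough illustration of both tweaks, the sketch below splits the path and the query into words, counts how many query terms are whole words in the path, and breaks ties by how many path words the query did not ask for. This is not how fuzzaldrin-plus works; the function names and the query "member edit html" are assumptions made up for the example.

```javascript
// Split on path separators, dashes, underscores, dots and whitespace.
const tokenize = (s) => s.toLowerCase().split(/[\/\-_.\s]+/).filter(Boolean);

function wordMatchKey(path, query) {
  const tokens = tokenize(path);
  const terms = tokenize(query);
  const tokenSet = new Set(tokens);
  const termSet = new Set(terms);
  // Query terms that appear as a whole word somewhere in the path.
  const wholeWords = terms.filter((t) => tokenSet.has(t)).length;
  // Path words the query never mentioned (e.g. "payment").
  const extras = tokens.filter((t) => !termSet.has(t)).length;
  return { wholeWords, extras };
}

// More whole-word hits first; among equals, fewer unrequested words first.
function rankByWords(paths, query) {
  return paths
    .map((path) => ({ path, ...wordMatchKey(path, query) }))
    .sort((a, b) => (b.wholeWords - a.wholeWords) || (a.extras - b.extras))
    .map((r) => r.path);
}

const paths = [
  'app/components/admin/member/edit/edit-payment.html',
  'app/components/admin/member/edit/edit.html',
];
console.log(rankByWords(paths, 'member edit html')[0]);
// → app/components/admin/member/edit/edit.html
console.log(rankByWords(paths, 'member edit payment html')[0]);
// → app/components/admin/member/edit/edit-payment.html
```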
Hi @adamreisnz, out of curiosity, is this happening with alternate scoring turned on? There was a strong preference for word "togetherness" in that version.
From screenshot I'm guessing it's not, but if it is I'll add a few of those to test benchmark.
The previous algorithm would take the first occurrence of m, then the first e, then the first m, instead of waiting and trying for member.
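The difference can be seen in a small sketch. `greedyPositions` mimics the first-occurrence behavior described here, and `contiguousFirstPositions` is one simple way to "wait" for the whole word; both are illustrations written for this example, not code from fuzzaldrin or fuzzaldrin-plus.

```javascript
// Greedy matcher: for each query character, take its first occurrence
// after the previous match. Returns null when the query does not match.
function greedyPositions(text, query) {
  const positions = [];
  let from = 0;
  for (const ch of query) {
    const i = text.indexOf(ch, from);
    if (i === -1) return null;
    positions.push(i);
    from = i + 1;
  }
  return positions;
}

// Alternative: look for the whole query as one contiguous run first,
// and only fall back to the greedy match when there is none.
function contiguousFirstPositions(text, query) {
  const start = text.indexOf(query);
  if (start === -1) return greedyPositions(text, query);
  return Array.from({ length: query.length }, (_, k) => start + k);
}

const path = 'app/components/admin/member/edit/edit.html';
console.log(greedyPositions(path, 'member'));
// → [ 6, 10, 17, 24, 25, 26 ]  (m and e from "components", m from "admin")
console.log(contiguousFirstPositions(path, 'member'));
// → [ 21, 22, 23, 24, 25, 26 ]  (the "member" folder itself)
```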
@jeancroy thanks for looking into it, but yes, it's in fact enabled:
The version I'm using is 1.18.0-dev-f4a83b238
Another possibility is that the alternate score is used for ranking while the classic one is used for highlighting. The whole component below fuzzy finder has been rewritten recently. If that's the case, the whole scattered-letter issue is a false trail.
One feature of the new one is a bias toward the file name (vs. the whole path) when the file extension matches exactly. I think you are battling against that when you use keywords from the path but end with an extension.
To sum up your request, you want the htm behavior to happen in the html case too? I'm not sure what the algorithm does because of how scrambled the highlight is.
Well, my use case as you might deduce from my example is that in a large project, there will be many components. Each component might have an edit sub component as in the example, and each of those components will have an edit.html template, an edit.js module, and perhaps an edit.ctrl.js controller.
So the way I tend to quickly open the file I want is by specifying the parent component member, then the sub component edit, and then the extension if I know there's going to be more than one file.
This usually works fine, but in the above case it was messing it up due to the existence of another similar file in the same path (edit-payment.html).
I think my use case is fairly common, so I wouldn't expect to be "battling" against the fuzzy finder's system with it.
edit.html should still be preferred over edit-payment.html if you search for "edit html" imo, on account of it being the shorter and closer match.
You're right on all accounts; in this case it seems the algorithm just likes the m of payment.
I guess the m of html manages to count twice; I'll see how to fix that.
The good news is that the issue is more constrained than, say, a lack of prioritizing "full word matches". (Here - counts as a word boundary.)
I'll open a different issue for the highlight regression; it should group member appropriately.
Yeah that looks better in your screenshot, highlighting member properly. And interesting that it likes the m in payment and paykent is put at the bottom properly. Looks like it's just a few tweaks needed to fix those issues then 👍
Looks like in the latest version (just built Atom from master yesterday) there are still some scoring issues. For example this result:
It should not prioritise cards/club-details.js over cards/details.js for the same reason as above, where it shouldn't prioritise the edit-payment.html file. cards/details.js is a closer match, because it has fewer non-matching characters between the matches.
I did not type a c character and it already matched card, so it's a bit baffling why it tries to mark the c of club and give that result a higher score than the more sensible result below that.
Note that when I type cards it does prioritise correctly (but still marks the c in the second result):
I think once a search term has been used/matched in the path, it should not try to match it again for another part of the path. In addition, results with the least amount of non-matching characters between the matches should probably score highest.
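The "fewest non-matching characters between the matches" idea can be sketched as a gap measure: over every way the query can match as a subsequence, find the smallest number of skipped characters between the first and last matched character, then rank by it. This is an illustration only, not the fuzzaldrin-plus algorithm, and the query `carddetails` is a made-up stand-in for what was typed.

```javascript
// Smallest number of unmatched characters between the first and last matched
// character, over all subsequence matches of `query` in `text`.
// Returns Infinity when there is no match.
function minGap(text, query) {
  const t = text.toLowerCase();
  const q = query.toLowerCase();
  if (q.length === 0) return 0;
  let best = Infinity;
  // Try every possible position for the first query character.
  for (let start = t.indexOf(q[0]); start !== -1; start = t.indexOf(q[0], start + 1)) {
    let pos = start;
    let ok = true;
    // From a fixed start, taking each next character as early as possible
    // yields the shortest span for that start.
    for (let k = 1; k < q.length; k++) {
      pos = t.indexOf(q[k], pos + 1);
      if (pos === -1) { ok = false; break; }
    }
    if (!ok) break; // later starts have even less text to match against
    best = Math.min(best, pos - start + 1 - q.length);
  }
  return best;
}

function rankByGap(paths, query) {
  return paths
    .map((path) => ({ path, gap: minGap(path, query) }))
    .filter((r) => r.gap !== Infinity)
    .sort((a, b) => a.gap - b.gap)
    .map((r) => r.path);
}

console.log(minGap('cards/details.js', 'carddetails'));      // → 2
console.log(minGap('cards/club-details.js', 'carddetails')); // → 7
```

Under this measure `cards/details.js` ranks ahead of `cards/club-details.js`, because matching the second path forces the match to skip over `s/club-`.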
Another example in Atom 1.20 dev where prioritisation is not what one would expect;
Guys, any activity on this issue please? It's infuriating to keep opening the wrong files because the fuzzy finder sorting logic is off.
VSCode manages to do it correctly, why not Atom? Perhaps it would be worthwhile looking at their algorithm.
@adamreisnz looks like this was fixed a month ago by @jeancroy but we're running an outdated version of fuzzaldrin-plus. Will create a PR.
Multiple schemes can be employed to achieve results which are most relevant to the user. Predicting which file a user wants out of a long list of possible matches and presenting it first can help speed up development time/maintain flow and train of thought. Here are some ideas to discuss: