lewang / flx

Fuzzy matching for Emacs ... a la Sublime Text.
GNU General Public License v3.0
518 stars 37 forks source link

Recency not given any weightage #70

Open artagnon opened 9 years ago

artagnon commented 9 years ago

So, suppose I have the list of candidates:

.emacs.d
.gdbinit
...

The second I type . with the intent of selecting .gdbinit, the entire match list is repopulated with lots of irrelevant entries.

This makes flx unusable for me, and I think fixing it is just a matter of giving recency a weight?

lewang commented 9 years ago

Can you give a more complete list and demo steps?

Why not type "g"? The idea of flx is to smartly sort based on letters as input, but this may be a real problem with the scoring, although I would like to avoid storing recency state.

On Sun, Mar 1, 2015 at 6:05 PM, Ramkumar Ramachandra < notifications@github.com> wrote:

So, suppose I have the list of candidates:

.emacs.d .gdbinit ...

The second I type . with the intent of selecting .gdbinit, the entire match list is repopulated with lots of irrelevant entries.

This makes flx unusable for me, and I think fixing it is just a matter of giving recency a weight?

— Reply to this email directly or view it on GitHub https://github.com/lewang/flx/issues/70.

Le

artagnon commented 9 years ago

Sorry for the terrible bug report; hopefully, this is a better one. So, suppose I have a directory with these files:

.abin
.aboo
.azoo
abin
aboo
azoo
bcin
foo
zin

If they have corresponding open buffers, and I ido-switch-buffer, it works as expected using recency correctly. If I do ido-find-file, I get the candidates bcin abin aboo azoo foo zin .abin .aboo .azoo (what is this ordering based on?). Now, if I type "a", with the intent of selecting abin which comes earlier in the list, it filters to azoo aboo abin ... -- why? There should be no reason to score "azoo" and "aboo" higher than "abin" given that they all start with "a", and I've typed "a" to filter.

Compare this experience with plain ido: I get the candidates abin aboo azoo ... as expected. helm does something similar, and doesn't suffer from the problem.

To summarize, my main problem with flx is not being able to quickly select what I can see: when I type in something to filter, the whole candidate list changes, and I don't know what I was looking for anymore.

scottjad commented 9 years ago

Let me know what you think of this idea:

The index of each item in the initial choices list could be used to set an initial score for each item that would reflect recency for commands that support that and other sorting for commands that have that.

At first I thought of just assigning an integer (- flx-ido-threshold index) but one problem with that is that for long lists that would overwelm all other scoring. So I though instead we could use a decimal, so that the sorting would only matter if the candidates had otherwise equal scores, and a multiple could be used if one wanted to increase the influence of initial sorting. I think candidates are most likely to have equal scores on common prefix matches or with only a few characters of fuzzy matching.

It would look something like this:

flx-ido-index-multiple = 1
initial choices index and initial score
1 = 0.6000
2 = 0.5999
3 = 0.5998

Perhaps one would want sorting to be more relevant, and have it take priority over minor scoring differences. One could increase the multiple like this:

flx-ido-index-multiple = 10
initial choices index and initial score
1 = (* 10 .6000) 6.0
2 = (* 10 0.5999) 5.999
3 = (* 10 0.5998) 5.998

I don't know what the performance or memory impact switching from an integer to float for the score might have.

scottjad commented 9 years ago

On second thought maybe flx-ido-threshold wouldn't be used. I don't know when the initial score should be set, whether it would happen on the initial list or only after the first character was inputted. I also don't know how available and efficient the index and choices list length are when the scores are being first calculated.

PythonNut commented 8 years ago

I definitely feel like a recency weight would be helpful. Recency could be thrown into helm-adaptive, company-statistics, minibuffer histories, etc.

Right, now I'm thinking of constant/log(n) as the weight, but that's just a random theoretical idea. I'll look into frecency, which seems like a better idea in practice.

PythonNut commented 8 years ago

I'm starting to lean on summing constant/sqrt(n) for each candidate. I think it yields a good balance between recency and favoring often-used candidates, even if they're less recent.

martinxyz commented 8 years ago

I think there is a bug here unrelated to the score. To reproduce, create the files aaa1, aaa2, aaa3, aaa4, aaa5 and then use ido-find-file. In the unfiltered list they are sorted one way, but when you press "a" they are suddenly sorted in reverse. I checked the scores, they are all identical.

martinxyz commented 7 years ago

I have found a fix for the problem from my previous comment (patch: https://github.com/martinxyz/flx/commit/35b90358). But I was groping in the dark, changing sorting orders in until it worked, without understanding the original intent, so it may not be the best fix.

It should at least partially fix the original report. For actually sorting files by "recency" I use my own file list hooks: https://github.com/martinxyz/config/blob/574ffb/elisp/emacs.el#L560 (read the emacs wiki link too before you use it, it has different variants of that code).

martinxyz commented 7 years ago

After a bit more testing, while my patch seems to fix ido-find-file it breaks ido-switch-buffer... that's what you get for hacking blindly, I guess.