Closed xeruf closed 3 years ago
Mine gives dragon
as the top match. You have something in your $FZF_DEFAULT_OPTS
?
Sorry, you are right:
printf 'fire-dragon 1\ndragon 2' | fzf --tiebreak=end --height=2
even though I can now understand why the algorithm selects it, it is not intuitive. The tiebreak end is there mainly for filenames, but it shouldn't prefer a longer entry because of that. Unfortunately there is no option to weigh two tiebreakers (end and length), they can only be ordered.
--tiebreak
is unintuitive and hard to understand especially with the new algorithm so I don't recommend relying on it. I even considered removing it but I kept it mostly for backward compatibility.--tiebreak
option. The default option (length,index
) is actually quite good for file paths..
at the end of your query. `abc.
# Compare "the-end" vs. "the-end."
fzf --height 10 --query 'the-end.' << EOF
the-end-is-near/foo-bar.sh
foo-bar/the-end-is-near.sh
EOF
My point here is that I use fzf for paths in the overwhelming majority, and in that case it makes sense to use end
rather than length
as the tiebreaker (short paths such as under /etc often aren't what I'm looking for when searching for example for an rc file), that's why I set that as default option.
My fix here was to reverse the tiebreaks in that particular script.
That tip does indeed sound nice, though when searching through rc files is not particularly useful, as they rarely have an extension by convention.
The question remains whether the end
-tiebreaker may be tweaked in a way that it yields the more expected output. I guess currently it goes by percentage or position from beginning, when it might just be better to count from behind, where in this example both are equal and thus the length
-tiebreaker would select the correct option ;)
when it might just be better to count from behind
We did that, and it didn't turn out great.
https://github.com/junegunn/fzf/commit/0541c0dbcf96ff40bd80cb8359191dc0fa01d83d
Sadly, there is no one-size-fits-all option. It's better to understand how the default ranking algorithm works and tweak the query string to assist this dumb filter to do a better job. fzf implements all major Readline key bindings so it's easy to move the cursor around and tweak the search terms once you get used to it.
> foo bar┃
# ALT-B
> foo ┃bar
# CTRL-U
> ┃bar
# Prepend /.
> /.┃bar
# CTRL-E
> /.bar┃
# CTRL-Y
> /.barfoo ┃
printf 'fire-dragon\ndragon' | fzf --tiebreak=end --height=2
Just realized that even in this case entering "dragon" will select "fire-dragon", so I can't enter anything more to get a better search result.
The behavior is the same with a space:
printf 'fire dragon\ndragon' | fzf --tiebreak=end --height=2
But removing the delimiter sorts "dragon" first as soon as I enter a single 'd':
printf 'firedragon\ndragon' | fzf --tiebreak=end --height=2
So I'm assuming you are splitting on non-word characters and the matching against the terms. Let me suggest some tweaks to the algorithm, even before the tiebreakers come into play:
We did that, and it didn't turn out great.
What do you mean? Did you revert it?
I can't enter anything more to get a better search result.
dragon ^d
dragon !f
Let me suggest some tweaks to the algorithm
Thanks for the input, but I'm not interested in fine-tuning the scoring algorithm because like I said above, fzf is a text filter and the criteria may not make sense with different types of input.
What do you mean? Did you revert it?
Yes, please read the commit message.
Let me suggest some tweaks to the algorithm
Thanks for the input, but I'm not interested in fine-tuning the scoring algorithm because like I said above, fzf is a text filter and the criteria may not make sense with different types of input.
I thought these changes would be pretty general.
If you don't want to do more tweaking, I would suggest to at least pick the issue title back up and always prefer a perfect match.
The percentage-match could also be an additional tiebreaker.
What do you mean? Did you revert it?
Yes, please read the commit message.
oh woops, I read it the other way around, sorry ^^
I am still interested in looking into improving this - look at this:
printf 'fire-dragonflame\ndragon' | fzf --tiebreak=index --height=2
I am wondering why the algorithm has to resort to the tiebreaker here at all? Very similar to my my real-world issue in #2537.
Because, unfortunately, the current scoring algorithm doesn't have a mechanism to prefer the latter. The scores calculated by the algorithm are just the same for the two cases.
Not saying there isn't room for improvement. We could try fine-tuning the algorithm.
How about taking inspiration from https://github.com/jhawthorn/fzy/blob/master/ALGORITHM.md#fzys-scoring ? I love the scoring in fzy but the features of fzf.
Very similar to my my real-world issue in #2537
So your real problem here is that the default length
tiebreak is not taking --nth
into account (see). If you use the default length
tiebreak, fzf will prefer dragon
over fire-dragonflame
because it's shorter. In the context of fuzzy matching, "the perfect match" is always the shortest one, so with length
tiebreak, we can say that fzf already prefers the perfect match.
It looks like the algorithm of fzy isn't very different from that of fzf. The document you linked above is severely outdated and it describes the algorithm of fzf 0.14.0 and below. fzy doesn't have --nth
or --tiebreak
, so the situation isn't too different if you use the default options of fzf.
But we could consider adding a bonus point to a character at the end of a word so for a query dragon
, dragon light
is preferred over dragonflight
(before --tiebreak
)
If I'm not mistaken, fzy isn't doing that either.
man fzf
)Info
Problem / Steps to reproduce
I would expect number 2 to come out first here. My hunch is that space and minus are treated equally - whereas I think a space (or tab for that matter) is a much stronger word delimiter than a minus.
I encountered this during package search, where there is some extra info before and after the package name delimited with tabs, and thus a perfect match doesn't seem so perfect to fzf.