cantino / mcfly

Fly through your shell history. Great Scott!
MIT License

Option to prioritize exact matches over fuzzy ones #183

Open yangm97 opened 2 years ago

cantino commented 2 years ago

I assume you mean when MCFLY_FUZZY is enabled?

yangm97 commented 2 years ago

yes

cantino commented 2 years ago

Seems reasonable.

nedbat commented 2 years ago

I like fuzzy matching so that I don't have to remember the exact punctuation in a file name. But I'm finding that using it completely removes the claimed benefits of McFly's intelligence:

image

The command I want is the one I ran 10 minutes ago, at position 6. The top choice isn't even the best fuzzy match for the words I've typed. Am I doing something wrong? I have these settings:

$ env | grep MCFLY
MCFLY_FUZZY=true
MCFLY_RESULTS=30
MCFLY_HISTORY_LIMIT=10000
MCFLY_SESSION_ID=UzHARz6EfjOqhxy8VIvT9Bcg
MCFLY_HISTORY=/var/folders/10/4sn2sk3j2mg5m116f08_367m0000gq/T/mcfly.XXXXXXXX.EbZr6lxT

cantino commented 2 years ago

I don't personally use fuzzy matching because I agree that it's of lower quality.

nedbat commented 2 years ago

Is there some way to improve it? It seems a shame to offer a setting which seems to negate the primary claim of the tool (intelligent history).

cantino commented 2 years ago

I'd be open to contributions that improve it. It was contributed by a user and isn't a feature I use myself. I prefer the non-fuzzy matching for how I tend to use mcfly.

dmfay commented 2 years ago

@nedbat matches are weighted by length, per https://github.com/cantino/mcfly/pull/103#issuecomment-720139246

Having been using it for a while myself I agree the balance could stand to shift further towards shorter matches. Easiest tweak is to add a FUZZY_FACTOR to that weighting algorithm -- even better if it's configurable so many people can try out different factors and speed up the process of converging on a generally useful default.

dmfay commented 2 years ago

With 0.5.10 just out the fuzzy experience should be dramatically improved. If you have MCFLY_FUZZY=true you'll start with a "fuzzy factor" of 2, or you can set the environment variable to another integer value. Higher values of MCFLY_FUZZY favor shorter and earlier matches; 0 turns it off.

In my testing a fuzzy factor of 1 didn't do quite enough to prioritize what I was searching for, and 10+ weighed brevity and start position too heavily over the built-in rank. As I mentioned in the readme I expect the best results to be in the 2-5 range, but I also only have my own history to test with. If you have the time to try a few different settings please report how it works out for you and what MCFLY_FUZZY value you settle on!
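To make the trade-off concrete, here is a minimal sketch of how a fuzzy factor could shift results toward shorter and earlier matches. It is not McFly's actual ranking code: the `Candidate` fields, the `base_rank` values, and the penalty formula are all hypothetical, chosen only to show how a larger factor lets match brevity and start position outweigh the built-in rank.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    command: str
    base_rank: float   # hypothetical stand-in for the built-in rank (higher is better)
    match_start: int   # index where the fuzzy match begins
    match_width: int   # number of characters the fuzzy match spans


def fuzzy_score(c: Candidate, fuzzy_factor: int) -> float:
    # Hypothetical formula: wide or late matches pay a penalty,
    # and the fuzzy factor scales how much that penalty matters.
    penalty = (c.match_width + c.match_start) / 100
    return c.base_rank - fuzzy_factor * penalty


def rank(candidates, fuzzy_factor):
    # Best score first.
    return sorted(candidates, key=lambda c: fuzzy_score(c, fuzzy_factor), reverse=True)


tight = Candidate("git push", base_rank=1.0, match_start=0, match_width=4)
loose = Candidate("grep -r pattern ./some/long/path | sort", base_rank=1.5,
                  match_start=10, match_width=30)

for factor in (1, 2, 5):
    best = rank([tight, loose], factor)[0]
    print(factor, best.command)
```

With these made-up numbers, a factor of 1 still lets the higher built-in rank win, while 2 and above promote the tighter, earlier match.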

alfonz19 commented 1 year ago

Maybe I'm not getting it, but for me fuzzy search does not work, or at least not as I'd expect. With export MCFLY_FUZZY=2 set (same for 1), I press ctrl-r in bash and type "" and it finds something, but the first command in the list does not contain it at all. Shouldn't it contain it?

I used the fzf command to search my history file before, and I like its syntax; might it work here as well? Space is the delimiter and each entered word has to be present in the line, with the possibility of negating a word using !word; searches can be made case-sensitive or case-insensitive, etc.
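The matching rules described in this comment can be sketched roughly as follows. This models only the behavior as described here (space-separated terms, every term required, `!word` to exclude, optional case sensitivity); it is neither fzf's implementation nor anything McFly provides, and `matches_terms` is a hypothetical name.

```python
def matches_terms(line: str, query: str, case_sensitive: bool = False) -> bool:
    """Return True if every plain term occurs in the line and no !term does."""
    if not case_sensitive:
        line, query = line.lower(), query.lower()
    for term in query.split():
        if term.startswith("!") and len(term) > 1:
            # Negated term: reject lines that contain it.
            if term[1:] in line:
                return False
        elif term not in line:
            # Plain term: must appear somewhere, in any position.
            return False
    return True


history_line = "docker run --rm -it ubuntu bash"
print(matches_terms(history_line, "ubuntu docker"))    # word order does not matter
print(matches_terms(history_line, "docker !ubuntu"))   # excluded by the negated term
```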

dmfay commented 1 year ago

I thought it might have something to do with quotes, but no, that seems to work:

image

Can you double-check env | grep -i mcfly?

alfonz19 commented 1 year ago

double checked: image

It was just my bad expectation.

Explanation: if I have the line "Pretty horse finished last", then the following search strings will match: "pest" or "p e st". So it probably means when fuzzy search is on, we can think of entered search string as regular expression, where there is ".*" automatically added after each char. That's fine, yet I'd say this blind fishing is less beneficial than interpreting the input as: there is some command with the word 'pest' in it. Sure, for trivial cases it does not matter.

If I'm searching for 2 words, I can use the search string word1%word2, but in that case I have to know the correct order of the words in the command. I'd say that for long commands with lots of options, the longer the command is, the less useful a single arbitrary character match becomes, and the more problematic it is to remember the correct order of the words to search for. In this specific case it would be more beneficial to interpret the entered search pattern "word1 word2 word3" as: "search for lines containing all these 3 words in any order." Maybe this is already covered; sorry about this comment in that case. I just thought it was done via this option.
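The regular-expression reading proposed here can be sketched as below. This is only an illustration of that mental model, not McFly's matcher; `fuzzy_regex_match` is a hypothetical helper, and case-insensitive matching is an assumption made so that "pest" can match the capital P in the example line.

```python
import re


def fuzzy_regex_match(line: str, query: str) -> bool:
    """Treat the query as its characters joined by '.*?', as the comment suggests."""
    # re.escape keeps characters such as '.' or '|' literal.
    pattern = ".*?".join(re.escape(ch) for ch in query)
    return re.search(pattern, line, flags=re.IGNORECASE) is not None


line = "Pretty horse finished last"
print(fuzzy_regex_match(line, "pest"))    # characters appear in order
print(fuzzy_regex_match(line, "p e st"))  # spaces are matched like any other character
print(fuzzy_regex_match(line, "pzst"))    # no 'z' in the line
```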

dmfay commented 1 year ago

So it probably means when fuzzy search is on, we can think of entered search string as regular expression, where there is ".*" automatically added after each char.

Not quite: it also prioritizes shorter and earlier matches (higher MCFLY_FUZZY numbers make this more important relative to the other ranking criteria). Your example search pest runs the length of the entire string because the only t after an s is in the word "last", so it doesn't do very well on match length.

Borrowing your example further, pretty will match very well, and prt very slightly better as it's shorter (but it's more ambiguous, and you might see other results you aren't looking for); hrse will also do alright. finished/fnshd isn't as good a search because it's both later and "wider" -- preto will be much more effective than that, having the same width of 8 and a start position of 0 instead of 14.
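For anyone who wants to check start positions and widths for their own searches, here is a rough sketch. It uses a simple greedy left-to-right scan, which is an assumption and may not be how McFly finds its matches; `match_span` is a hypothetical helper. Indices are 0-based and the width counts both end characters, so the values can differ by one from the figures quoted above.

```python
def match_span(line: str, query: str):
    """Greedy scan: return (start, width) of the matched span, or None if no match."""
    line, query = line.lower(), query.lower()
    pos = -1
    start = None
    for ch in query:
        # Find the next occurrence of this character after the previous one.
        pos = line.find(ch, pos + 1)
        if pos == -1:
            return None
        if start is None:
            start = pos
    if start is None:  # empty query
        return None
    return start, pos - start + 1


line = "Pretty horse finished last"
for query in ("pest", "pretty", "prt", "hrse", "finished", "fnshd", "preto"):
    print(query, match_span(line, query))
```

Under these conventions "pest" spans all 26 characters of the line, while "preto" starts at 0 and "finished" starts at 13.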