wdullaer / scotty

Transports you to any directory you have visited before
Mozilla Public License 2.0

Disambiguation on near equal values #6

Open jbiason opened 4 years ago

jbiason commented 4 years ago

(I understand this is not a simple issue, but maybe discussing it would provide some ideas on how to solve it.)

I have a couple of directories that are almost the same, but with some differences, for example:

Using Scotty with s v3 gets me to the expected directory, but s work always jumps to whichever of those I visited last, with no chance of reaching the other (e.g., if the last one I visited was worker-v3, s work will jump back to worker-v3, no matter how many times I call Scotty).

One thing I can think of to solve this is adding a parameter called --exclude to the Scotty command and passing the current directory in it (e.g., if I'm in worker, calling s work would actually call scotty search work --exclude "worker").

While this could work for two directories, if I also had a worker-v2 in there, Scotty would keep jumping between two of them and never get to the third. To handle this, --exclude could move the directory to the bottom of the list while search picks the first entry; this would make the directories "rotate" through all the possibilities.
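The rotation idea can be sketched in a few lines. This is only an illustration of the proposal, not Scotty's code: the function name jump and the persistent list of matches are assumptions made for the example.

```python
def jump(matches, current_dir):
    """Demote current_dir to the end of `matches` (in place), then return the first entry.

    `matches` is a hypothetical persistent list of directories matching the query.
    """
    if current_dir in matches:
        matches.remove(current_dir)
        matches.append(current_dir)
    return matches[0] if matches else None


# Simulate calling "s work" three times, starting in worker.
matches = ["worker", "worker-v2", "worker-v3"]
here = "worker"
visited = []
for _ in range(3):
    here = jump(matches, here)
    visited.append(here)
# visited is now ["worker-v2", "worker-v3", "worker"]
```

Because the demotion is stored in the list rather than recomputed from timestamps, repeated calls cycle through every match instead of bouncing between the two most recent ones.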

Honestly, I'm not sure whether either of those is actually implementable, as I haven't read Scotty's code yet (long hours in the office, a non-healed burnout).

If this isn't implementable, feel free to close the issue.

wdullaer commented 4 years ago

First off: get well soon! Burnout is not to be underestimated; I hope you can give yourself some rest and find your energy again. Feedback like this is as good a contribution as any piece of code :)

The problem you're describing cannot be perfectly fixed. It's a fuzzy matcher, so the results will never match your expectations 100% of the time, but we can try to get close.

Scotty searches as follows:

  1. Use fst to find a list of matches in the index
  2. Score each result for relevance and return the item with the highest score
  3. Use the last visited timestamp as a tiebreaker
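The three steps above can be sketched roughly as follows. This is a simplified model, not Scotty's implementation: a plain subsequence test stands in for the fst lookup, the score function is a placeholder, and the Entry type and the example paths are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    path: str
    last_visited: float  # Unix timestamp of the most recent visit


def is_subsequence(pattern, text):
    """Stand-in for the fst lookup: do the pattern's characters appear in order?"""
    chars = iter(text)
    return all(ch in chars for ch in pattern)


def score(pattern, path):
    """Placeholder relevance score: 1 if the pattern appears contiguously, else 0."""
    return 1 if pattern in path else 0


def search(index, pattern):
    # Step 1: collect the candidate matches.
    matches = [e for e in index if is_subsequence(pattern, e.path)]
    if not matches:
        return None
    # Steps 2 and 3: highest score wins; the most recent visit breaks ties.
    return max(matches, key=lambda e: (score(pattern, e.path), e.last_visited))


index = [
    Entry("/home/me/worker", last_visited=100.0),
    Entry("/home/me/worker-v3", last_visited=200.0),
]
```

With this toy index, search(index, "work") scores both entries equally, so the timestamp tiebreaker picks the more recently visited worker-v3, which is the behaviour described in the report.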

The fst part of this seems to be working fine: whenever I get a result I didn't expect, the result I was expecting is part of the set returned by fst.

The scoring is the part that will need a bit of tuning: there are a large number of algorithms for scoring search results, and each one typically has a large number of tunable parameters. For example, we want to score results with more consecutive matches higher. We also want to score results that match near the end of the string higher.

Scoring in Scotty is a pure function over the index entry and the search pattern. How often or when you visited the folder does not factor into it. Most comparable tools keep a score that increases each time you visit a folder and use that to sort for relevance. I didn't like this approach because they typically do not offer a way to decrease this score automatically, so a popular folder from a few years ago will still dominate newer results unless you manually decrease its score.
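To make the two scoring preferences concrete, here is a minimal sketch of a pure scoring function that rewards consecutive matches and penalizes matches far from the end of the string. The weights and the greedy right-to-left matching are assumptions chosen for illustration; Scotty's actual scoring may differ.

```python
CONSECUTIVE_BONUS = 2  # assumed weight for each pair of adjacent matched characters
END_GAP_PENALTY = 1    # assumed penalty per character between the last match and the end


def score(pattern, path):
    """Return a relevance score, or None if pattern is not a subsequence of path.

    Characters are matched greedily from right to left, so matches land as
    close to the end of the path as possible.
    """
    total = 0
    limit = len(path)   # only look strictly to the left of this index
    rightmost = None    # index of the match closest to the end
    prev = None         # index of the previously matched character
    for ch in reversed(pattern):
        pos = path.rfind(ch, 0, limit)
        if pos == -1:
            return None
        total += 1
        if prev is not None and prev - pos == 1:
            total += CONSECUTIVE_BONUS
        if rightmost is None:
            rightmost = pos
        prev = pos
        limit = pos
    if rightmost is None:  # empty pattern matches everything with a neutral score
        return 0
    return total - END_GAP_PENALTY * (len(path) - 1 - rightmost)
```

Under these assumed weights, score("work", "worker") is 8 and score("work", "worker-v2") is 5, so the shorter name wins because its match sits closer to the end. The function depends only on its two arguments, so visit counts and timestamps cannot influence it.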

Your example is slightly pathological, in the sense that the results are equal except for a -v2 and -v3 suffix. If you run s work I would expect Scotty to take you to worker since that's the match closest to the end. worker-v2 should be reachable by using workv2 or even w2 (depending on what else you have in your index). I want to experiment with different scoring algorithms in the future. I might even make them configurable, since the "right" one probably depends on taste and what kind of queries you tend to run.

I like your idea of excluding the current directory from the search results. This makes a lot of sense: if you're invoking Scotty, your intent is to move to a different directory. That should not be too hard to implement.
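As a rough sketch of what such an exclusion could look like (the function name and example paths are hypothetical, and this is not taken from Scotty's source), the current directory can be filtered out of the candidates before scoring:

```python
import os


def exclude_current(paths, cwd):
    """Drop the candidate that refers to cwd, comparing normalized paths."""
    here = os.path.normpath(cwd)
    return [p for p in paths if os.path.normpath(p) != here]


candidates = ["/home/me/worker", "/home/me/worker-v3"]
remaining = exclude_current(candidates, "/home/me/worker-v3/")
```

Normalizing both sides means a trailing slash on the current directory does not defeat the comparison.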

Another idea is to have a function that only searches subfolders of the current directory.
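A subfolder-only search could be approximated by filtering the index before matching. The sketch below is an assumption about how that might work, with a hypothetical function name and made-up POSIX-style paths:

```python
from pathlib import PurePosixPath


def only_below(paths, cwd):
    """Keep only the paths that lie strictly below cwd."""
    base = PurePosixPath(cwd)
    return [p for p in paths if base in PurePosixPath(p).parents]


index = [
    "/home/me/proj/src",
    "/home/me/proj",
    "/home/me/project2/src",
    "/home/me/other",
]
below = only_below(index, "/home/me/proj")
```

Comparing path components rather than raw string prefixes keeps /home/me/project2/src out of the results for /home/me/proj.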

wdullaer commented 4 years ago

In v0.3.0 the current directory will be excluded from the search results, which should improve your experience.

I'm planning on experimenting with different scoring algorithms to see if I can get this to behave better still.