lewang / flx

Fuzzy matching for Emacs ... a la Sublime Text.
GNU General Public License v3.0

Corner cases for Heuristics #81

Closed: EphramPerdition closed this issue 8 years ago

EphramPerdition commented 8 years ago

Corner cases seen while tweaking helm-fuzzier, for your consideration when tuning the heuristics in flx. Probably more to come (there was a second example, but I've misplaced the note).

lewang commented 8 years ago

This is expected. It's my chosen balance between abbreviation and substring matches. def-filter is scored higher than deft. However, if you continue typing "t", it should re-order.

lewang commented 8 years ago

The idea is that if the current sort isn't exactly what you expect, you keep typing along the lines you're thinking, and your intended target should keep moving up.

EphramPerdition commented 8 years ago

I know flx weights word beginnings heavily; I'm saying that breaks down for single-word candidates. You don't think deft is a "better" match for def than, for example, describe-font?

PythonNut commented 8 years ago

@EphramPerdition in this case, it's not clear to me that def is a better match for deft than it is for describe-font. At least in my own usage, if I typed that, it could be either.

Granted, I have a pretty good understanding of the heuristics, so I generally craft queries to get what I want.

However, there's at least one theoretical matter that's interesting. Le is correct in that fuzzy queries can always be extended if the matches are unsatisfactory. However, describe-font has more room to do so than deft does. Does that make deft a better candidate? Maybe. I don't know.

lewang commented 8 years ago

> You don't think deft is a "better" match for 'def' than for example describe-font?

I don't. Otherwise I would give contiguity a higher score :) .

In this case, if you press "t" after "def", it immediately scores higher due to being an exact match.
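A claim like this is easy to check locally by ranking a handful of candidates for each query. The helper below is only a sketch: `my/rank-candidates` is a hypothetical name, not part of flx, and it assumes the scoring function returns a list whose car is a numeric score (or nil for no match), which is how `flx-score` is assumed to behave here.

```elisp
;; Hypothetical helper, not part of flx.
;; Assumption: SCORE-FN returns (SCORE . REST) for a match and nil otherwise.
(defun my/rank-candidates (query candidates score-fn)
  "Return the CANDIDATES that match QUERY, highest score first.
SCORE-FN is called as (SCORE-FN CANDIDATE QUERY)."
  (let (scored)
    ;; Collect (SCORE . CANDIDATE) pairs for every candidate that matches.
    (dolist (cand candidates)
      (let ((result (funcall score-fn cand query)))
        (when result
          (push (cons (car result) cand) scored))))
    ;; Sort descending by score, then strip the scores again.
    (mapcar #'cdr
            (sort scored (lambda (a b) (> (car a) (car b)))))))
```

With flx loaded, evaluating `(my/rank-candidates "def" '("deft" "describe-font") #'flx-score)` and then the same form with "deft" as the query should show whether the order changes as described.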

EphramPerdition commented 8 years ago

> I don't. Otherwise I would give contiguity a higher score :) .

:) Heuristics are just another word for "personal taste". NB: it's not precisely boosting contiguity that I'm advocating, but special-casing single-word candidates.

Personally, I read "more room" as "less specific" and consider decipher, deft, and debug far better matches for de than diary-insert-entry, diary-insert-cyclic-entry, diary-yearly-entry, diary-insert-block-entry, and the other 70 or so entries which flx scores higher.

Of course that's not a "universal truth", just a matter of taste.

Here's how I modified flx-score to reflect my preference:

(setq flx-word-separators '(?- ?\s ?_ ?: ?. ?/ ?\\)) ; - needs to be first, it ends up inside a regex []
(setq flx-word-separators-string (concat flx-word-separators))

(defun flx-score (str query &optional cache)
<...>
            ;; This is the computed score, adjusted to boost the scores
            ;; of exact matches.
            (if (or
                 (string-match
                  (format "\\(^%s[^%s]*$\\)" (regexp-quote query) flx-word-separators-string)
                  str)
                 (and full-match-boost
                      (=  (length (caar optimal-match))
                          (length str))))
                (+ (cl-cadar optimal-match) 10000)
              (cl-cadar optimal-match))
<...>

Now when I enter de, the matches decipher, debug, and deft appear as the first 3 hits instead of on the 3rd page of results (this is with the latest helm-fuzzier, of course).
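To see which candidates the regexp test in the modified flx-score would boost, the test can be pulled out into a standalone predicate. This is a minimal sketch, not code from flx or helm-fuzzier: `my/word-separators` and `my/single-word-prefix-p` are hypothetical names, the query is passed through `regexp-quote` so that characters such as `.` stay literal, and `case-fold-search` is bound to nil as an assumption to keep the result deterministic.

```elisp
;; Hypothetical names; this only isolates the regexp test shown above.
(defvar my/word-separators '(?- ?\s ?_ ?: ?. ?/ ?\\)
  "Separator characters.  `-' comes first so it stays literal inside [^...].")

(defun my/single-word-prefix-p (query candidate)
  "Return non-nil if CANDIDATE starts with QUERY and contains no separator."
  (let ((case-fold-search nil)) ; assumption: compare case-sensitively
    (string-match-p
     (format "\\`%s[^%s]*\\'"
             (regexp-quote query)          ; treat QUERY as literal text
             (concat my/word-separators))  ; list of chars -> string
     candidate)))

;; Keep only the candidates that would receive the boost for "de".
(delq nil
      (mapcar (lambda (c) (and (my/single-word-prefix-p "de" c) c))
              '("decipher" "debug" "deft" "describe-font" "diary-insert-entry")))
;; → ("decipher" "debug" "deft")
```

Here describe-font is excluded because it contains a separator, and diary-insert-entry because it does not start with de, so only the single-word candidates would get the extra 10000 points.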

oscarfv commented 8 years ago

> in this case, it's not clear to me that def is a better match for deft than it is for describe-font. At least in my own usage, if I typed that, it could be either.

+1. It could be either, but I prefer describe-font as the first candidate.

The heuristics require some time to get accustomed to, but then you know what to type to arrive at the desired result quicker. There is no "always perfect" solution, though.