Closed Quenty closed 1 year ago
Wow, that's an ugly bug. Thanks for the PR! I will merge it and add a few tests in the next day or two.
@Quenty, do you have a test case that demonstrates the bug?
Below is my current thinking; what am I missing?
For two strings a
and b
, has_match(a, b, true)
must be true before you call score(a, b, true)
. And in that case, a
must be a case-sensitive subsequence of b
, so if they have the same length, they match.
Ah, it occurs if you score directly without calling filter()
. I guess I've been using this library wrong. :P Since all members of the library appear to be public, especially given testing, it's perfectly reasonable to use the score method without calling hasMatch().
Performance implications here are pretty small, so it's worth including this for completeness.
You're right that the API is a bit fishy. There is an invariant,
has_match(a, b, s) == true
which is required before calling score(a, b, s)
, but that invariant is not enforced by the API, and that has always made me a little queasy.
I disagree about the performance implications. I made this for use in text editor fuzzy finders, where it may be used to find across all file names in repository. That can be a very large haystack, and if the UI using the library updates on every keystroke, the latency matters. Adding an extra case conversion and string compare to score
has notable effects on performance.
This is also the reason for the division of has_match
and score
. The has_match
function is faster and will tend to remove most results before the slower score
function is applied. The filter
function is the safe version which does both, but the others can be useful on their own, so they remain in the public API.
I hadn't considered calling score
without has_match
. It looks like it returns fzy.get_score_min()
for non-matching needles, but I'd have to do a bit of work to be convinced that's true in all cases.
Any thoughts? Thanks for the feedback so far, and if you have any thoughts on how I should modify the documentation to make the oddness of score
stand out more, I'd appreciate it.
In the Lua implementation the perfect match scenario had a bug where it would consider any string of the same length as a perfect match. This fixes it.