apache / lucenenet

Apache Lucene.NET
https://lucenenet.apache.org/
Apache License 2.0
FuzzyQuery produces a wrong result when prefix is equal to the term length #941

Open tohidemyname opened 1 month ago

tohidemyname commented 1 month ago

Is there an existing issue for this?

Describe the bug

When using FuzzyQuery, the search string bba does not match the document value bbab, even with a maximum edit distance of 1 and a prefix length of 3.

In FuzzyQuery, an automaton is created for the "suffix" part of the search string (the part after the prefix), which in this case is an empty string.
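
To make the expected result concrete, here is a minimal Python sketch of the matching rule described above. It is a model of the expected semantics, not Lucene.NET code: the function names levenshtein and expected_fuzzy_match are hypothetical, and the sketch assumes the prefix must match exactly while the edit distance is measured on the remaining suffixes.

```python
def levenshtein(a: str, b: str) -> int:
    """Plain Levenshtein distance (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def expected_fuzzy_match(query: str, term: str,
                         max_edits: int, prefix_length: int) -> bool:
    """Hypothetical model: exact prefix match, then bounded edits on the suffix."""
    prefix = query[:prefix_length]
    if not term.startswith(prefix):
        return False
    # With prefix_length == len(query), the query suffix is the empty string.
    return levenshtein(query[prefix_length:], term[prefix_length:]) <= max_edits


print(expected_fuzzy_match("bba", "bbab", max_edits=1, prefix_length=3))
```

Under this model the query suffix is empty and the term suffix is b, which is one insertion away, so bbab is expected to match.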

Expected Behavior

In this scenario, maybe the FuzzyQuery should be rewritten to a WildcardQuery of the following form:

searchString + "?" 

where there's an appropriate number of ? characters according to the edit distance.
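
The suggested rewrite can be sketched in Python using fnmatch, whose ? also matches exactly one character. This is only an illustration of the idea, not the Lucene.NET WildcardQuery API. The helper wildcard_patterns is hypothetical, and it assumes one pattern per count of trailing ? from 0 up to the edit distance, since a single pattern with a fixed number of ? would not match shorter terms such as the search string itself. It also assumes the search string contains no wildcard characters of its own.

```python
from fnmatch import fnmatchcase


def wildcard_patterns(search_string: str, max_edits: int) -> list[str]:
    """Hypothetical helper: search_string followed by 0..max_edits '?' characters."""
    return [search_string + "?" * n for n in range(max_edits + 1)]


def matches_any(term: str, patterns: list[str]) -> bool:
    """True if the term matches at least one wildcard pattern."""
    return any(fnmatchcase(term, p) for p in patterns)


patterns = wildcard_patterns("bba", 1)   # ["bba", "bba?"]
print(matches_any("bbab", patterns))     # term from the bug report
```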

tohidemyname commented 1 month ago

I just submitted my patch: https://github.com/apache/lucenenet/pull/942/commits

tohidemyname commented 1 month ago

I fixed two bugs. To submit them as separate pull requests, I reverted my changes after fixing a bug. I am not sure whether I closed my earlier pull request by mistake. I have submitted another pull request:

https://github.com/apache/lucenenet/pull/945