Closed simone-b closed 3 months ago
Hi @simone-b,
I guess you have a good point here. The initial implementation works this way by design: a lot of users wanted an exact match to be returned when you use fuzziness and type something that matches exactly, with fuzziness kicking in only if nothing is found. If the exact match isn't returned right away, then the fuzzy results might not list it first.
I guess we have to rethink this a little to see what the best solution would be, maybe adding an option or an additional parameter.
Also, fuzziness works best if no stemmer is set. With a stemmer, the words julien and julie get different stems: they become julien and juli. Without the stemmer they would be left untouched.
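As a rough illustration of why stemming gets in the way, here is a small sketch using PHP's built-in levenshtein(). The stems are the ones quoted above; the snippet does not call TNTSearch or its stemmer, it only compares the strings.

```php
<?php
// Edit distance between the raw words: a single deletion ("n").
$rawDistance = levenshtein('julien', 'julie');

// Edit distance between the stems quoted above ("julien" and "juli").
$stemDistance = levenshtein('julien', 'juli');

echo "raw: $rawDistance, stemmed: $stemDistance", PHP_EOL;
```

The stemmed forms end up further apart than the original words, so a given distance setting matches fewer of them.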
Best fuzzy settings for you would be:
'prefix_length' => 2,
'max_expansions' => 500,
'distance' => 3
Hi @nticaric,
I think that if the results were sorted by Levenshtein distance, the exact match would be returned first every time. The other results would just be there to complete the search, in case the user was searching for something else. I think that would be more "logical", but that's just my opinion.
But I agree that adding an option would leave the choice to the developer.
Thanks for the settings, my previous settings were just a test :)
Well, thanks for your time and your answer. If I can help in any way, it would be a pleasure!
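The distance sort suggested above could look roughly like this. It is a minimal sketch: sortByDistance is a hypothetical helper, not part of TNTSearch, and it works on a plain array of candidate terms.

```php
<?php
/**
 * Order candidate terms by edit distance to the keyword, closest first.
 * Hypothetical helper for illustration; not part of TNTSearch.
 */
function sortByDistance(string $keyword, array $terms): array
{
    usort($terms, function (string $a, string $b) use ($keyword): int {
        return levenshtein($keyword, $a) <=> levenshtein($keyword, $b);
    });

    return $terms;
}

// The exact match has distance 0, so it always sorts to the front.
$sorted = sortByDistance('julien', ['julie', 'julia', 'julien', 'julian']);
```

Terms at the same distance (here julie and julian) may come out in either order, depending on the PHP version.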
I just updated the TNTSearch core. Try updating it and let me know how it works:
https://github.com/teamtnt/tntsearch/commit/40c4395cf04e089b6f2fd4dd98d4b1d09de1ce6b
Hi,
I updated the code on my project.
The sort works, but it seems that SQLite "cancels" the sort.
In the method TNTSearch@getAllDocumentsForFuzzyKeyword you get the docs from the results of the search (if I understand it correctly). The problem is that SQLite returns them ordered by primary_key, so the order is not kept.
I did a test, and this code works:
private function getAllDocumentsForFuzzyKeyword($words, $noLimit)
{
    // One "?" placeholder per fuzzy-matched word.
    $binding_params = implode(',', array_fill(0, count($words), '?'));
    $query = "SELECT * FROM doclist WHERE term_id in ($binding_params) ORDER BY hit_count DESC";
    if (!$noLimit) {
        $query .= " LIMIT {$this->maxDocs}";
    }
    $stmtDoc = $this->index->prepare($query);

    $ids = [];
    foreach ($words as $word) {
        $ids[] = $word['id'];
    }
    $stmtDoc->execute($ids);
    $docs = $stmtDoc->fetchAll(PDO::FETCH_ASSOC);

    // Re-apply the order of $words, which the SQL result does not keep.
    $final_array = [];
    foreach ($words as $word) {
        foreach ($docs as $doc) {
            if ($doc['term_id'] == $word['id']) {
                $final_array[] = $doc;
            }
        }
    }

    return new Collection($final_array);
}
I think it's possible to optimize it.
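One possible optimization, as a sketch only: group the rows by term_id in a single pass, then emit the groups in word order, instead of rescanning every row for every word. orderDocsByWords is a hypothetical standalone helper, and the array shapes are assumed to mirror the method above.

```php
<?php
/**
 * Reorder doclist rows so they follow the order of $words.
 * Hypothetical standalone helper; array shapes mirror the method above.
 */
function orderDocsByWords(array $words, array $docs): array
{
    // Group rows by term_id in a single pass.
    $byTerm = [];
    foreach ($docs as $doc) {
        $byTerm[$doc['term_id']][] = $doc;
    }

    // Emit the groups in the order of the (already sorted) words.
    $ordered = [];
    foreach ($words as $word) {
        foreach ($byTerm[$word['id']] ?? [] as $doc) {
            $ordered[] = $doc;
        }
    }

    return $ordered;
}

// Made-up sample data: word 7 is the closest match, word 3 comes second.
$words = [['id' => 7], ['id' => 3]];
$docs = [
    ['term_id' => 3, 'doc_id' => 10],
    ['term_id' => 7, 'doc_id' => 11],
    ['term_id' => 3, 'doc_id' => 12],
];
$ordered = orderDocsByWords($words, $docs);
```

Within each word, rows keep the order in which the query returned them, so the hit_count ordering is preserved per term.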
Hi @nticaric!
I'm checking in for news on this. What do you think about my previous comment?
I don't think my solution is the most optimized one, but I can rework it and send a merge request to the tntsearch core if you want.
@simone-b Thanks for submitting a solution. I've been really busy these days; I'll take a look at it tonight.
This issue is solved. Here is my PR.
Hi,
I have found out that even with FuzzySearch set to on, there is a case in which TNTSearch for Laravel won't perform a FuzzySearch. For example:
Let's say that i have these records in my database :
With these FuzzySearch settings :
If my search query is julien s, the engine will find both results. First, with the first keyword (julien), it will find only the first user (id:1). Then, with the second one, it will find both. My problem is that the results found with the first keyword should contain more than one item.
When I do my FuzzySearch with the julien keyword only, I expect to get more than one result. I expect to get all the results that have distance <= tntsearch.fuzzy.distance, ordered by distance ASC. But in fact I only get the results that match my keyword exactly, so the FuzzySearch is not even launched.
I took a deeper look at the code and found that in TNTSearch@getWordlistByKeyword there is a condition at line 339, if ($this->fuzziness && !isset($res[0])), which is the cause of it. When I activate the FuzzySearch, I expect it to be the main way to search. But in fact, the FuzzySearch is only launched if no results are found by the SQLite comparison engine (I mean, inside the indexes).
To make it work, I think the conditions in the method TNTSearch@getWordlistByKeyword should be changed to use the FuzzySearch as the main way to search if the FuzzySearch is set to true. Then, inside TNTSearch@fuzzySearch, an ASC sort by distance should be added on the match results.
I might be using the package wrong, but after a lot of research I didn't find anything regarding this issue. Did I miss something in the way to use the package, or is it an unknown error?
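To make the difference concrete, here is a hypothetical sketch of the two strategies over an in-memory word list. It is not the TNTSearch code: lookup and its parameters are made up for illustration, it ignores prefix_length and max_expansions, and it needs PHP 7.4+ for the arrow function.

```php
<?php
/**
 * Hypothetical sketch of the two lookup strategies over an in-memory word list.
 * $fuzzyFirst = false mimics the current check (fuzzy only when no exact hit);
 * $fuzzyFirst = true mimics the proposal (always fuzzy, sorted by distance ASC).
 */
function lookup(string $keyword, array $wordlist, int $maxDistance, bool $fuzzyFirst): array
{
    $exact = array_values(array_filter($wordlist, fn (string $w): bool => $w === $keyword));
    if (!$fuzzyFirst && isset($exact[0])) {
        return $exact; // an exact hit short-circuits the fuzzy search
    }

    $matches = [];
    foreach ($wordlist as $word) {
        $distance = levenshtein($keyword, $word);
        if ($distance <= $maxDistance) {
            $matches[$word] = $distance;
        }
    }
    asort($matches); // ascending by distance, keys (the words) preserved

    return array_keys($matches);
}

$wordlist = ['julie', 'julien', 'simone'];
$current = lookup('julien', $wordlist, 2, false);  // exact hit only
$proposed = lookup('julien', $wordlist, 2, true);  // exact hit first, then near matches
```

With the proposed behavior the exact match still comes first, because its distance is 0, while the near matches are no longer dropped.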
Please excuse my English, and thanks for reading!
Have a nice day,
Simone-b