Open halukkaramete opened 2 years ago
You can use the sorter option and combine the length and the score: https://github.com/jeancroy/FuzzySearch/blob/master/src/init.js#L52
I'll be honest, it looks like you want to recommend a short paragraph for a given theme. This library is more about finding a needle in a haystack.
Right now, machine learning as a service is mature enough that it may interest you. See, for example, https://docs.microsoft.com/en-us/azure/cognitive-services/language-service/question-answering/overview
For those who do not know how to sort by size using the "sorter" option:
Add this to your options when setting up your FuzzySearch object:
sorter: myFunction,
and then provide this function somewhere on your page:
function myFunction(a, b) {
    // when two items have equal scores, the shorter one rises above the longer one
    // without this function, ties are sorted alphabetically (the default)
    var d = b.score - a.score;
    if (d !== 0) return d;
    // var d = a.item.length - b.item.length;
    var ak = a.item.length, bk = b.item.length;
    return ak > bk ? 1 : (ak < bk ? -1 : 0);
}
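Note that a.item.length only works when each item is a plain string. The sketch below shows the same tie-break on mock data, assuming results have the { score, item } shape used above. The textLength helper and the text field are hypothetical names for illustration, not part of FuzzySearch.

```javascript
// Hypothetical helper (not part of FuzzySearch): length of an item's text.
// Assumes an item is a plain string or an object with a `text` field.
function textLength(item) {
  if (typeof item === "string") return item.length;
  if (item && typeof item.text === "string") return item.text.length;
  return Infinity; // unknown shape: loses every tie-break
}

function shorterFirstOnTies(a, b) {
  var d = b.score - a.score; // higher score first
  if (d !== 0) return d;
  var ak = textLength(a.item), bk = textLength(b.item);
  return ak > bk ? 1 : (ak < bk ? -1 : 0); // shorter text first on equal scores
}

// Mock results in the assumed { score, item } shape.
var results = [
  { score: 3, item: { text: "Hajj and its rulings, explained at length" } },
  { score: 5, item: "Umrah" },
  { score: 3, item: "Hajj" }
];
results.sort(shorterFirstOnTies);

var order = results.map(function (r) {
  return typeof r.item === "string" ? r.item : r.item.text;
});
console.log(order);
```

The highest score still wins outright; length only decides between the two entries that share a score of 3.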
> I'll be honest, it looks like you want to recommend a short paragraph for a given theme. This library is more about finding a needle in a haystack.
That's an entirely different take. I'm OK with using your library. I will work out the JSON so that searches are done only on signal words (excluding English stop words), which are stemmed (using Porter2), along with synonyms. What I'm working on is one of a kind when it comes to this subject, and I'd like to use your library. Once I launch this, it will be used by millions of people.
> if (d !== 0) return d;
Instead of an exact comparison, I'd use something like Math.abs(d) < 0.1
or d*d < 0.01 to decide that two scores are tied.
The score is a float, so exact ties are rare, but you may find that two results are similar enough to start giving importance to overall size. I have not tested 0.1; you may find a value better suited to your taste.
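A minimal sketch of that near-tie idea, assuming results are { score, item } objects whose item is a plain string. The makeSorter factory and the sample data are hypothetical, and the 0.1 tolerance is the untested starting value mentioned above.

```javascript
// Hypothetical factory: builds a sorter that treats scores closer than
// `tolerance` as tied and then prefers the shorter item.
function makeSorter(tolerance) {
  return function (a, b) {
    var d = b.score - a.score;               // higher score first
    if (Math.abs(d) >= tolerance) return d;  // clearly different scores
    var ak = a.item.length, bk = b.item.length;
    if (ak !== bk) return ak - bk;           // near-tie: shorter item first
    return d;                                // same length: fall back to score
  };
}

// Mock results; the two Hajj entries differ by only 0.05 in score.
var results = [
  { score: 10.0, item: "The rites of Hajj described step by step" },
  { score: 9.95, item: "Hajj" },
  { score: 5.0, item: "Fasting" }
];
results.sort(makeSorter(0.1));

var items = results.map(function (r) { return r.item; });
console.log(items);
```

One caveat: a tolerance-based tie is not transitive (A can be close to B and B close to C while A and C are not), so long chains of close scores may sort inconsistently. Rounding scores into buckets, for example Math.round(score / tolerance), is one alternative that keeps the ordering consistent.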
Keep up the good work then; I see the subject matter is important.
Is there a built-in option so that the density of matching characters (I mean the matches highlighted in red) counts for more, letting dense short entries outrank lengthier ones? I think that would increase relevancy automatically.
Example screenshot:
Here, I searched for Hajj ... (in a 1.8 MB file with 13,000 items), and the best entries (shown below) ended up at around 100th place in the suggested items.
They are pretty short, bingo-like matches, yet plenty of long ones preceded them.
How can I easily raise them to the top, or at least near the top?
Here are the winners...
Here are the poor little ones being crushed by the winners:
Clearly, the red density on those short ones is noticeably higher.
Especially this guy:
What do you think, Jean?
I have a feeling this has to do with your compare method.
That may be where the solution lies, but if so, how do I create that comparison?
I use
I'd have to see it in practice, but what I'm asking for could make a tremendous difference in quality, especially when main topics and sub-topics are searched, as in my case.
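One way to approximate the "density" idea with a custom comparator is to divide the score by a power of the item length. This is only a sketch under stated assumptions: it assumes the score is non-negative and grows with the number of matched characters, that items are plain strings, and that results have a { score, item } shape. The makeDensitySorter name and the alpha parameter are hypothetical and have not been tested against the library.

```javascript
// Hypothetical factory: alpha = 0 ranks by raw score only,
// alpha = 1 ranks by score per character (a rough "density").
function makeDensitySorter(alpha) {
  function adjusted(r) {
    var len = Math.max(1, r.item.length);   // avoid dividing by zero
    return r.score / Math.pow(len, alpha);  // length-penalized score
  }
  return function (a, b) {
    return adjusted(b) - adjusted(a);       // higher adjusted score first
  };
}

// Mock results: a long entry with a slightly higher raw score,
// and a short entry that is almost entirely a match.
var longEntry = { score: 12, item: "Hajj ".repeat(40) }; // 200 characters
var shortEntry = { score: 10, item: "Hajj" };

var byScore = [shortEntry, longEntry].sort(makeDensitySorter(0));
var byDensity = [longEntry, shortEntry].sort(makeDensitySorter(0.5));

console.log(byScore[0].item.length, byDensity[0].item.length);
```

Values of alpha between 0 and 1 trade raw score against length; the right value would need to be found by experiment on real data.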