olivernn / lunr.js

A bit like Solr, but much smaller and not as bright
http://lunrjs.com
MIT License
8.94k stars 548 forks

Lunr should preserve the original sort order of equal elements #399

Closed sosharma1403 closed 5 years ago

sosharma1403 commented 5 years ago

Fiddle link: https://jsfiddle.net/zkqy91vh/5/

Query: 20. Expected result: 2020, 2019, 2021, 2022

Elements with equal scores should be ordered as they are in the original documents.

olivernn commented 5 years ago

The results I see are:

0.9073499876012563 -- 2020
0.6048999917341709 -- 2022
0.6048999917341709 -- 2021
0.6048999917341709 -- 2019

"2020" makes sense to come first since it has two instances of "20".

The other results are being returned in the reverse of the order in which they were added, in my browser anyway.

The document refs are stored in an object inside Lunr, which as far as I know is an unordered collection of properties.

Perhaps you could elaborate on your use case that requires this specific sorting of results? As it stands I don't think this is something that Lunr should support, but I'd be interested to hear why it's useful in your case.

PandaWood commented 5 years ago

I have a similar case - I hope it's OK to give it here and back the original author.

I'm expecting some order in my results below, based on the search term "cmem" - either alphabetical or even the order in which they were added (so I can perhaps control it to be alphabetical - and that is what the original author seems to be asking for as well).

1.0477175191815777 -- CMEM
0.5238587595907889 -- CMEMOFFSET
0.5238587595907889 -- CMEMINDEX
0.5238587595907889 -- CMEMCHILD

https://jsfiddle.net/xst7La23/

In the case where I have 20 or 30 such results that start with "CMEM", the lack of any obvious sorting means the results are frustrating at best.

I guess the question for Lunr.js is: should it use the order of the documents supplied and give some basic predictability, or should that be totally up to the user of the library?
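For reference, if this is left to the user of the library, a minimal sketch of breaking ties by the order in which documents were added might look like the following. It assumes the caller still has the original array, that results have the `{ ref, score }` shape returned by a Lunr search, and that refs come back as strings. The names `cmemDocs`, `cmemResults`, `positionByRef` and `sortByScoreThenInsertion` are hypothetical, and the result array is written out by hand from the scores quoted above rather than produced by a real index.

```javascript
// Hypothetical source array, in the order the documents were added to the index.
const cmemDocs = [
  { id: "CMEM" },
  { id: "CMEMCHILD" },
  { id: "CMEMINDEX" },
  { id: "CMEMOFFSET" },
];

// Hand-written stand-in for search results, using the scores quoted above.
const cmemResults = [
  { ref: "CMEM", score: 1.0477175191815777 },
  { ref: "CMEMOFFSET", score: 0.5238587595907889 },
  { ref: "CMEMINDEX", score: 0.5238587595907889 },
  { ref: "CMEMCHILD", score: 0.5238587595907889 },
];

// Remember each document's position, keyed by its ref as a string.
const positionByRef = new Map(cmemDocs.map((doc, i) => [String(doc.id), i]));

// Sort a copy: higher score first, then earlier-added document first on ties.
function sortByScoreThenInsertion(results, positions) {
  return [...results].sort(
    (a, b) => b.score - a.score || positions.get(a.ref) - positions.get(b.ref)
  );
}

const cmemOrdered = sortByScoreThenInsertion(cmemResults, positionByRef);
console.log(cmemOrdered.map((r) => r.ref));
```

Because the position map lives outside the index, this approach does not depend on how Lunr stores refs internally, but the map has to be rebuilt (or serialised separately) wherever the index is loaded.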

olivernn commented 5 years ago

As mentioned previously, returning documents in the order they were added seems impractical due to the nature of the data structures used in Lunr. Also, ensuring this consistently before and after serialisation and potentially across vastly different devices means that I think it is a non-starter.

You mention alphabetical order; this might be possible, but only on the document reference. Of course this doesn't need to be done by Lunr, though it would be more efficient if it was.

This would be reasonably easy to implement, just augmenting the current sort functionality to use the ref as a tie-breaker. It would need to have minimal impact on the performance for all other use cases though.
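As a rough illustration of the idea, here is a sketch of a comparator that sorts by score and falls back to the ref when scores are equal. It is not Lunr's actual sort code. The name `compareByScoreThenRef` is hypothetical, and the sample data assumes the refs in the first fiddle are the year strings, which the thread does not confirm.

```javascript
// Hypothetical comparator: descending score, then ascending ref on ties.
// Refs are compared as plain strings (code unit order).
function compareByScoreThenRef(a, b) {
  if (a.score !== b.score) return b.score - a.score;
  if (a.ref < b.ref) return -1;
  if (a.ref > b.ref) return 1;
  return 0;
}

// Hand-written stand-in for search results, using the scores quoted earlier.
// Assumption: the document refs are the year strings.
const yearResults = [
  { ref: "2020", score: 0.9073499876012563 },
  { ref: "2022", score: 0.6048999917341709 },
  { ref: "2021", score: 0.6048999917341709 },
  { ref: "2019", score: 0.6048999917341709 },
];

// Sort a copy so the original result array is left untouched.
const yearSorted = yearResults.slice().sort(compareByScoreThenRef);
console.log(yearSorted.map((r) => r.ref));
```

The same comparator can be applied by the caller to the array a search returns, which gives a deterministic order without any change to the library. The extra ref comparison only runs when two scores are exactly equal, so the cost for results with distinct scores is one additional inequality check.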

olivernn commented 5 years ago

Closing this now. If someone wants to provide a PR with the changes mentioned to get sorting on the document reference, I'll gladly take a look.