olivernn / lunr.js

A bit like Solr, but much smaller and not as bright
http://lunrjs.com
MIT License

Exact matches should have a (slightly) higher score #10

Closed ssured closed 11 years ago

ssured commented 11 years ago

Thanks for this great project!

Currently, searching for 'hand' in an index containing 'hand' and 'handsome' returns both with the same score. The exact match 'hand' should score higher than 'handsome'.

To test:

var testIndex = lunr(function () {
  this.field('name');
  this.ref('id');
});
testIndex.add({id: 'hs', name: 'handsome'});
testIndex.add({id: 'hd', name: 'hand'});
testIndex.search('hand');

results in

[
  { ref: "hd", score: 0.7071067811865476 },
  { ref: "hs", score: 0.7071067811865476 }
]

ssured commented 11 years ago

A possible solution could be to break ties between equal scores by text length: shortest first, longest last. That way short texts get a slight preference over longer ones.
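
A rough sketch of that tie-breaking idea, applied to the results after the search rather than inside lunr. The helper `sortWithLengthTieBreak` and the `textByRef` lookup are hypothetical names made up for illustration; neither is part of lunr.

```javascript
// Hypothetical helper, not part of lunr: re-sorts search results so that
// equal scores are ordered by the length of the indexed text, shortest first.
function sortWithLengthTieBreak(results, textByRef) {
  // slice() copies the array so the original results are left untouched.
  return results.slice().sort(function (a, b) {
    if (a.score !== b.score) {
      return b.score - a.score; // higher score first
    }
    return textByRef[a.ref].length - textByRef[b.ref].length; // shorter text first
  });
}

// Results shaped like the ones reported above, deliberately in the "wrong" order.
var results = [
  { ref: 'hs', score: 0.7071067811865476 },
  { ref: 'hd', score: 0.7071067811865476 }
];

// Assumed lookup from document ref to the text that was indexed.
var textByRef = { hs: 'handsome', hd: 'hand' };

var sorted = sortWithLengthTieBreak(results, textByRef);
```

With the two tied scores, `sorted[0].ref` is `'hd'`, so the shorter 'hand' is listed ahead of 'handsome'. This only reorders ties; it does not change the scores themselves.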

olivernn commented 11 years ago

Yes, by default lunr does a prefix search for your search term. Currently, the terms returned from expanding a query are given exactly the same weighting as the terms in the original query.

What I think should happen is that any term in the original query is given a boost, so that it contributes more to the resulting similarity score than the terms added by query expansion.
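
To make the idea concrete, here is a minimal standalone sketch of boosting the original query term over its prefix expansions. It is not lunr's implementation: the function names, the `EXACT_BOOST` value, and the simplified scoring (each document's name is treated as a single term) are all assumptions for illustration.

```javascript
// Illustrative only, not lunr internals. EXACT_BOOST is an assumed value.
var EXACT_BOOST = 10;

// Expand a query term to every vocabulary term that starts with it,
// weighting the original term more heavily than the prefix expansions.
function expandWithBoost(queryTerm, vocabulary) {
  return vocabulary
    .filter(function (term) { return term.indexOf(queryTerm) === 0; })
    .map(function (term) {
      return { term: term, weight: term === queryTerm ? EXACT_BOOST : 1 };
    });
}

// Score each document by the heaviest weighted term that equals its name,
// drop documents that match nothing, and sort by descending score.
function rank(docs, weightedTerms) {
  return docs
    .map(function (doc) {
      var best = 0;
      weightedTerms.forEach(function (wt) {
        if (doc.name === wt.term && wt.weight > best) {
          best = wt.weight;
        }
      });
      return { ref: doc.id, score: best };
    })
    .filter(function (result) { return result.score > 0; })
    .sort(function (a, b) { return b.score - a.score; });
}

var docs = [
  { id: 'hs', name: 'handsome' },
  { id: 'hd', name: 'hand' }
];

var weighted = expandWithBoost('hand', ['hand', 'handsome']);
var ranked = rank(docs, weighted);
```

Here both documents still match the query, but 'hd' gets the boosted weight and is ranked ahead of 'hs'. The real similarity score would combine this weighting with the usual term statistics; the sketch only shows where the boost enters.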

I'll have a look at the best way to implement this and let you know when I have something worth trying out.

ssured commented 11 years ago

Thanks for looking into this. I think it will noticeably improve the perceived quality of the results!

olivernn commented 11 years ago

I've just pushed a new version, 0.2.2, which includes a fix for this.

ssured commented 11 years ago

Works like a charm! Great!