Canadensys / vascan

The Database of Vascular Plants of Canada
MIT License

Search scoring issue when mixing taxon and vernacular names #1

Closed by cgendreau 11 years ago

cgendreau commented 11 years ago

A search for "carex" returns the vernacular name "carex de Richardson" with a higher score than the taxon "Carex capitata". They should get the same score. According to the explain output, they differ in inverse document frequency: idf(docFreq=1334, maxDocs=57595) for the taxon and idf(docFreq=366, maxDocs=57595) for the vernacular. The term matches fewer documents in the vernacular field, so it gets a higher idf there. For the needs of Vascan, we should ignore document frequency. We had no luck using "omit_norms" : true and "index_options" : "docs" on the related fields.
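To illustrate the size of the gap, here is a minimal sketch that recomputes the two idf values from the explain output. It assumes the classic Lucene TF-IDF similarity, idf = 1 + ln(maxDocs / (docFreq + 1)). The function name `classic_idf` is hypothetical and not part of Vascan or Elasticsearch.

```python
import math


def classic_idf(doc_freq: int, max_docs: int) -> float:
    """Classic Lucene TF-IDF idf (assumed here): 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


# docFreq and maxDocs values are taken from the explain output quoted above.
taxon_idf = classic_idf(doc_freq=1334, max_docs=57595)
vernacular_idf = classic_idf(doc_freq=366, max_docs=57595)

print(f"taxon idf:      {taxon_idf:.2f}")
print(f"vernacular idf: {vernacular_idf:.2f}")
```

Under that assumption the values come out at roughly 4.76 for the taxon and 6.06 for the vernacular, which would explain why the vernacular hit ranks first even though both names match "carex" equally well.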

cgendreau commented 11 years ago

See https://groups.google.com/forum/?fromgroups=#!topic/elasticsearch/VmhBrDVmAzQ

cgendreau commented 11 years ago

Fixed by using constant score queries, which give every matching document the same score regardless of idf.
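A constant score query of the kind mentioned here could look like the following sketch. It is not the actual Vascan fix: the field names `taxonname` and `vernacularname` and the helper `constant_score_name_query` are hypothetical, and the body assumes a recent Elasticsearch version in which `constant_score` wraps a `filter` clause.

```python
import json


def constant_score_name_query(text, fields=("taxonname", "vernacularname")):
    """Build a search body in which a match on any field scores the same.

    Field names are hypothetical placeholders, not the real Vascan mapping.
    """
    clauses = [
        {
            "constant_score": {
                # The wrapped clause only decides whether a document matches;
                # its tf/idf-based score is discarded.
                "filter": {"match": {field: text}},
                # Every matching document receives this fixed score.
                "boost": 1.0,
            }
        }
        for field in fields
    ]
    return {"query": {"bool": {"should": clauses, "minimum_should_match": 1}}}


body = constant_score_name_query("carex")
print(json.dumps(body, indent=2))
```

Because each clause carries the same boost, a match on a taxon name and a match on a vernacular name contribute the same score, which is the behaviour requested in this issue.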