warelab / gramene-solr

Apache License 2.0

Ordering of search results #3

Closed mycrobe closed 8 years ago

ajo2995 commented 8 years ago

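Since we use filter queries for everything, and filter queries do not contribute to scoring, every matching document has the same score. The documents are therefore returned in index order by default, that is, in the order they were loaded into Solr. It would be better to return search results in a meaningful order.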

We currently have two sets of gene annotations for Zea mays and Oryza sativa. The secondary set comes from the Ensembl otherfeatures db. Gene docs from that set should appear after gene docs from the Ensembl core db. The db_type field is either "core" or "otherfeatures".

I suggest we sort by species_idx, gene_idx, and db_type. (species_idx is not defined yet. Maybe arabidopsis, rice, maize, sorghum, then the others sorted by name?)
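A minimal sketch of the proposed ordering, in case it helps the discussion. The `species` field name, the species identifiers, and the sample docs are assumptions made up for illustration; only `gene_idx` and `db_type` are named in this issue, and species_idx is not defined yet.

```python
# Hypothetical ranking: the four named species first, in this order.
# These identifiers are illustrative, not taken from the real index.
PRIORITY_SPECIES = [
    "arabidopsis_thaliana",
    "oryza_sativa",
    "zea_mays",
    "sorghum_bicolor",
]


def species_rank(name):
    """Priority species sort first in list order; all others follow by name."""
    if name in PRIORITY_SPECIES:
        return (0, PRIORITY_SPECIES.index(name), "")
    return (1, 0, name)


def sort_key(doc):
    # "core" < "otherfeatures" alphabetically, so a plain string
    # comparison on db_type already puts core docs first on ties.
    return (species_rank(doc["species"]), doc["gene_idx"], doc["db_type"])


# Made-up sample docs, deliberately out of order.
docs = [
    {"id": "g4", "species": "vitis_vinifera", "gene_idx": 0, "db_type": "core"},
    {"id": "g2", "species": "zea_mays", "gene_idx": 0, "db_type": "otherfeatures"},
    {"id": "g1", "species": "zea_mays", "gene_idx": 0, "db_type": "core"},
    {"id": "g3", "species": "arabidopsis_thaliana", "gene_idx": 5, "db_type": "core"},
]

ordered_ids = [d["id"] for d in sorted(docs, key=sort_key)]
print(ordered_ids)
```

Note that with gene_idx ahead of db_type in the key, core docs precede otherfeatures docs only when gene_idx ties. Putting db_type ahead of gene_idx would group all core docs for a species before any otherfeatures docs.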

We can either modify the Solr query to sort the response, or we can sort all the docs before loading them into Solr. Since we don't do partial updates, the index order should never change within a release.
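For the query-side option, a rough sketch of what the request parameters might look like, using Solr's standard `sort` parameter. The helper name `build_select_params` and the example filter query are hypothetical, and species_idx is assumed to exist as an indexed field, which it does not yet.

```python
from urllib.parse import urlencode


def build_select_params(filter_queries, sort_fields):
    """Build a query string for a Solr select request with an explicit sort.

    Hypothetical helper: every field in sort_fields is sorted ascending.
    """
    params = [("q", "*:*")]
    # One fq parameter per filter query; these do not affect scoring.
    params += [("fq", fq) for fq in filter_queries]
    # Solr accepts a comma-separated list of "<field> <direction>" clauses.
    params.append(("sort", ", ".join(f"{field} asc" for field in sort_fields)))
    return urlencode(params)


query_string = build_select_params(
    ["gene_idx:[0 TO 100]"],  # placeholder filter, for illustration only
    ["species_idx", "gene_idx", "db_type"],
)
print(query_string)
```

Sorting at query time costs something on every request, whereas loading the docs in sorted order pays that cost once per release.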

ajo2995 commented 8 years ago

This was fixed by adding a MongoDB index on species_id, db_type, and gene_idx. genes/mongo2solr.js uses this index to stream gene docs in sorted order, so they are loaded into Solr already ordered.