Open DCGenomics opened 8 years ago
@lepons seems to have this mostly working as expected. Some background begins at https://github.com/NCBI-Hackathons/Metadata_categorization/issues/5#issuecomment-184471628.
Lena, one slight anomaly I see is that, in the annotation Solr core, HEK293 is all in queue 4, whereas HEK293T is in queue 190 and queue 236. Any idea why HEK293 and HEK293T are so far apart, and why HEK293T is in non-adjacent queues?
http://localhost:8983/solr/annotation/select?q=sourceCellLine:HEK293&wt=json&indent=true

http://localhost:8983/solr/annotation/select?q=sourceCellLine:HEK293T&wt=json&indent=true
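In case it helps with reproducing this, here is a rough Python sketch that runs the same kind of query and tallies how many docs land in each queue. It assumes each doc carries a `queueId` field alongside `sourceCellLine`, and that the core is reachable at the local URL above. The helper names (`build_query_url`, `queue_counts`, `fetch_queue_counts`) are made up for illustration and are not part of the project. `rows` is raised because Solr returns only 10 docs by default, and the value is quoted as a phrase; whether that matches exactly depends on how the field is analyzed in the schema.

```python
import json
from collections import Counter
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: the annotation core lives at the same local URL used above.
SOLR_SELECT = "http://localhost:8983/solr/annotation/select"


def build_query_url(cell_line, rows=1000):
    """Build a select URL for one sourceCellLine value (hypothetical helper)."""
    params = {
        "q": f'sourceCellLine:"{cell_line}"',  # phrase query on the cell line
        "fl": "queueId,sourceCellLine",        # assumed field names
        "rows": rows,                          # Solr's default is only 10
        "wt": "json",
    }
    return f"{SOLR_SELECT}?{urlencode(params)}"


def queue_counts(solr_response):
    """Count docs per queueId in a parsed Solr JSON response."""
    docs = solr_response["response"]["docs"]
    return Counter(doc.get("queueId") for doc in docs)


def fetch_queue_counts(cell_line):
    """Query the local core and return a queueId -> doc count mapping."""
    with urlopen(build_query_url(cell_line)) as resp:
        return queue_counts(json.load(resp))
```

With the core running, `fetch_queue_counts("HEK293T")` should show at a glance whether the docs really split across queues 190 and 236, and in what proportion.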
Clustering is much better than before regardless. For example, docs with sourceCellLine: HEK293 are now all in the same queue. Most of the docs (individual records) with the same queueId and sourceCellLine are also on the same line in the UI, i.e. in the same summary record. I'm looking into why only most, and not all, such records end up in the same summary record. I suspect the cause is in the Django backend.
For example, all HEK293 variants need to be on the same line.