deepakunni3 closed 5 years ago
Looks great! For genes we'll also want the count of phenotypes linked to all orthologs, and diseases associated with orthologs. If you're testing with our dev index, these won't show up, since a bad version of Panther made it through.
@kshefchek I guess that will have to be a separate Solr query, right? Just wondering if it can be done in a more efficient way.
For stats via orthology, I think we could get the breakdown by type (gene, phenotype, disease, function) and, for each type, get the stats per taxon. Alternatively, we could get the breakdown of taxon per relation. The latter would allow us to disambiguate gene-gene data (interaction vs. orthology) but is harder to process.
The latter is more informative, but we would need to disambiguate things like has_phenotype, which is sometimes overloaded and used for both gene-disease and disease-phenotype associations.
This call would include everything (type per taxon per relation) but takes longer to finish: https://solr.monarchinitiative.org/solr/golr/select/?defType=edismax&qt=standard&indent=on&wt=json&rows=0&start=0&fl=*,score&facet=true&facet.mincount=1&json.nl=arrarr&facet.limit=20&facet.method=enum&fq=subject_ortholog_closure:%22MGI:98297%22&q=*:*&stats=true&stats.field={!tag=piv1%20calcdistinct=true%20distinctValues=false}object&facet.pivot={!stats=piv1}relation_label,subject_taxon,object_category
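For anyone who wants to experiment with the pivot call above, here is a minimal Python sketch that rebuilds the same URL from its parameters and flattens a pivot tree into (relation, taxon, category) rows. The function names and SAMPLE_PIVOT are hypothetical and not part of this PR. The flattening assumes the usual Solr facet.pivot shape (nested entries with "value", "count", and an optional "pivot" list), and it reads plain document counts, ignoring the distinct-object stats requested via stats.field.

```python
from urllib.parse import urlencode

SOLR_SELECT = "https://solr.monarchinitiative.org/solr/golr/select/"


def build_ortholog_pivot_url(gene_id, facet_limit=20):
    """Rebuild the pivot query from the comment above for a given gene ID."""
    params = [
        ("defType", "edismax"),
        ("qt", "standard"),
        ("indent", "on"),
        ("wt", "json"),
        ("rows", 0),
        ("start", 0),
        ("fl", "*,score"),
        ("facet", "true"),
        ("facet.mincount", 1),
        ("json.nl", "arrarr"),
        ("facet.limit", facet_limit),
        ("facet.method", "enum"),
        ("fq", f'subject_ortholog_closure:"{gene_id}"'),
        ("q", "*:*"),
        ("stats", "true"),
        ("stats.field", "{!tag=piv1 calcdistinct=true distinctValues=false}object"),
        ("facet.pivot", "{!stats=piv1}relation_label,subject_taxon,object_category"),
    ]
    # urlencode percent-escapes quotes, braces, and spaces for us.
    return SOLR_SELECT + "?" + urlencode(params)


def flatten_pivot(entries, path=()):
    """Yield (tuple_of_values, count) for every leaf of a pivot tree."""
    for entry in entries:
        here = path + (entry["value"],)
        children = entry.get("pivot")
        if children:
            yield from flatten_pivot(children, here)
        else:
            yield here, entry["count"]


# Hypothetical pivot fragment, made up for illustration only.
SAMPLE_PIVOT = [
    {"field": "relation_label", "value": "has phenotype", "count": 3, "pivot": [
        {"field": "subject_taxon", "value": "NCBITaxon:10090", "count": 3, "pivot": [
            {"field": "object_category", "value": "phenotype", "count": 3},
        ]},
    ]},
    {"field": "relation_label", "value": "interacts with", "count": 2},
]

if __name__ == "__main__":
    print(build_ortholog_pivot_url("MGI:98297"))
    for row, count in flatten_pivot(SAMPLE_PIVOT):
        print(row, count)
```

Grouping the flattened rows by their first element (the relation label) would be one way to separate interaction from orthology data, at the cost of the extra processing mentioned above.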
I think for now let's go with a simple call (e.g. category per relation, as you have done previously) and have a separate call for the more complex query.
Added counts for ortholog associations using suggestions made by @kshefchek.
New response looks like so:
{
  "taxon": {
    "id": "NCBITaxon:9606",
    "label": "Homo sapiens"
  },
  "association_counts": {
    "interactions": 58,
    "homologs": 51,
    "phenotypes": 84,
    "anatomy": 20,
    "functions": 14,
    "pathways": 6,
    "diseases": 4,
    "publications": 32,
    "variants": 13,
    "ortholog-interactions": 104,
    "ortholog-anatomy": 24,
    "ortholog-functions": 17,
    "ortholog-phenotypes": 18,
    "ortholog-pathways": 6
  },
  "xrefs": null,
  "description": null,
  "categories": [
    "gene",
    "sequence feature"
  ],
  "types": null,
  "synonyms": null,
  "deprecated": null,
  "replaced_by": null,
  "consider": null,
  "id": "HGNC:18603",
  "label": "COL25A1"
}
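To make the counts easier to eyeball, here is a small Python sketch of how a client might split association_counts into direct counts and counts reached via orthologs. It assumes, based only on the response above, that ortholog-derived keys carry an "ortholog-" prefix; split_counts is a hypothetical helper, not part of this PR or of ontobio.

```python
def split_counts(association_counts, prefix="ortholog-"):
    """Separate direct counts from counts reached via orthologs.

    Assumes ortholog-derived keys start with `prefix`, as in the
    response shown above.
    """
    direct, via_ortholog = {}, {}
    for key, count in association_counts.items():
        if key.startswith(prefix):
            via_ortholog[key[len(prefix):]] = count
        else:
            direct[key] = count
    return direct, via_ortholog


# Counts copied from the HGNC:18603 (COL25A1) response above.
COUNTS = {
    "interactions": 58, "homologs": 51, "phenotypes": 84, "anatomy": 20,
    "functions": 14, "pathways": 6, "diseases": 4, "publications": 32,
    "variants": 13, "ortholog-interactions": 104, "ortholog-anatomy": 24,
    "ortholog-functions": 17, "ortholog-phenotypes": 18,
    "ortholog-pathways": 6,
}

if __name__ == "__main__":
    direct, via_ortholog = split_counts(COUNTS)
    for category in sorted(via_ortholog):
        print(category, direct.get(category), via_ortholog[category])
```

Note that this response has no "ortholog-diseases" key, so a client should not assume that every direct category has an ortholog counterpart.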
@kshefchek Could you take a look at this? I think all the necessary counts are being returned. Wanted to see if I am interpreting the counts properly.
Note: To get this to work you would have to use ontobio@master
+1, thanks for adding this!
Awesome! 👍
Add ability to get association counts for:
/bioentity/<id>
/bioentity/<type>/<id>
The response looks like so:
Note: This PR is experimental, and I would like feedback from @cmungall @kshefchek @putmantime.