Closed by cmungall 6 years ago
@cmungall this is related to the default "species" parameter setting in the mygene.info API.
The MyGene.info APIs support a "species" parameter that filters the returned genes by species. For the query endpoint (/v3/query), it defaults to "human,mouse,rat", which is why your queries above return nothing. Adding "species=pig" (or simply "species=all") should give you what you want:
http://mygene.info/v3/query?q=A0A075B7H6&species=pig
http://mygene.info/v3/query?q=ENSSSCG00000030825&species=pig
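For anyone scripting these lookups, here is a minimal sketch of building the same URLs in Python. The helper name `build_query_url` is hypothetical (it is not part of any MyGene.info client); only the endpoint and the "q" and "species" parameters come from this thread.

```python
from urllib.parse import urlencode

# Endpoint as quoted in this thread.
BASE_URL = "http://mygene.info/v3/query"


def build_query_url(q, species=None):
    """Return a /v3/query URL (hypothetical helper, for illustration only).

    If `species` is None the parameter is omitted, so the server-side
    default applies.
    """
    params = {"q": q}
    if species is not None:
        params["species"] = species
    # safe="," keeps comma-separated species lists such as
    # "human,mouse,rat" readable instead of percent-encoding the commas.
    return BASE_URL + "?" + urlencode(params, safe=",")


if __name__ == "__main__":
    for term in ("A0A075B7H6", "ENSSSCG00000030825"):
        print(build_query_url(term, species="pig"))
```

Passing `species="all"` instead of `"pig"` searches every species, and leaving `species` out falls back to whatever default the server currently uses.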
There is a debate about whether we should just set the default "species" to "all", so that your queries above would work without passing "species". The initial reason for the default of "human,mouse,rat" is just to avoid returning too many matched genes from all species (e.g. ?q=cdk2&species=all). We thought (at least at the time we made that decision) that this is not what most of our users want, and "human,mouse,rat" are still the species our users query most often. But we would like to hear from our users, and we can change the default behavior if they prefer it the other way.
Ref: http://docs.mygene.info/en/latest/doc/query_service.html#id8
got it, didn't RTFM closely enough.
> The initial reason for the default of "human,mouse,rat" is just to avoid returning too many matched genes from all species (e.g. ?q=cdk2&species=all)
One approach would be to page results, but boost the favored species to the top of the list (there is nothing worse than people having to go to the Nth page of results to get to the first human gene - I know because we've accidentally implemented things that way before!)
As I mentioned to @newgene a few minutes ago, I vote in favor of changing that default behavior to search all species (while boosting human and common model organisms, which I think we might already do...) I think it made sense at one point, but no longer...
@cmungall yes, we are actually doing that already: returned hits are ordered human > mouse > rat > other species, and for a query like q=cdk2, a symbol match ranks above a name match, etc. This is probably another reason we should switch to "species=all" by default now. I'm also in favor of this change.
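To make that ordering concrete, here is a rough client-side sketch. It is not the server's actual scoring, just an illustration of the idea. Assumptions: hits carry "taxid", "symbol" and "name" fields, species priority is applied before match type (the thread does not say which takes precedence), and the sample hits are toy data.

```python
# NCBI taxonomy IDs for the boosted species: human, mouse, rat.
SPECIES_RANK = {9606: 0, 10090: 1, 10116: 2}
OTHER_SPECIES_RANK = len(SPECIES_RANK)  # every other species sorts after these


def match_rank(hit, term):
    """0 for an exact symbol match, 1 for a name match, 2 otherwise."""
    term = term.lower()
    if hit.get("symbol", "").lower() == term:
        return 0
    if term in hit.get("name", "").lower():
        return 1
    return 2


def rank_hits(hits, term):
    """Sort hits by species priority, then by match type (assumed order)."""
    return sorted(
        hits,
        key=lambda h: (
            SPECIES_RANK.get(h.get("taxid"), OTHER_SPECIES_RANK),
            match_rank(h, term),
        ),
    )


if __name__ == "__main__":
    # Toy hits for illustration only, not real API output.
    toy_hits = [
        {"taxid": 9823, "symbol": "CDK2", "name": "cyclin dependent kinase 2"},
        {"taxid": 10090, "symbol": "Cdk2", "name": "cyclin dependent kinase 2"},
        {"taxid": 9606, "symbol": "CDK2AP1", "name": "CDK2 associated protein 1"},
        {"taxid": 9606, "symbol": "CDK2", "name": "cyclin dependent kinase 2"},
    ]
    for hit in rank_hits(toy_hits, "cdk2"):
        print(hit["taxid"], hit["symbol"])
```

With a ranking like this, a pig hit still appears under "species=all", but only after the human, mouse and rat hits, so nobody has to page through results to reach the first human gene.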
We have now switched to "species=all" as the default in our recent release:
http://biothings.io/new-default-behavior-for-species-parameter/
E.g.
mygene.info/v3/query?q=A0A075B7H6
mygene.info/v3/query?q=ENSSSCG00000030825
Don't return results