monarch-initiative / biolink-api

API for linked biological knowledge
https://api.monarchinitiative.org/api/
BSD 3-Clause "New" or "Revised" License

GET /bioentity/gene/{id}/function/ has trouble with synonyms #118

Open cbizon opened 6 years ago

cbizon commented 6 years ago

If I query with id=UniProtKB:P10721 I get back a bunch of results.

But querying with id=NCBIGene:3815 (which should be equivalent) I get back


```json
{
  "compact_associations": null,
  "associations": [],
  "objects": [],
  "facet_pivot": null,
  "start": null,
  "facet_counts": {
    "taxon_label": {},
    "isa_partof_closure": {}
  },
  "numFound": null
}
```
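
For anyone reproducing this, here is a minimal sketch of the two requests and a check for the empty response shown above. The base URL and route come from the repository header and the issue title. The helper names `function_url` and `is_empty_result` are hypothetical and not part of biolink-api, and the actual HTTP call is left out so the snippet runs offline.

```python
from urllib.parse import quote

# Base URL as listed for this repository; adjust if the service has moved.
BASE_URL = "https://api.monarchinitiative.org/api"


def function_url(gene_id: str, base_url: str = BASE_URL) -> str:
    """Build the URL for GET /bioentity/gene/{id}/function/ (hypothetical helper)."""
    # Keep the CURIE colon readable; percent-encode anything else unsafe.
    return f"{base_url}/bioentity/gene/{quote(gene_id, safe=':')}/function/"


def is_empty_result(payload: dict) -> bool:
    """Return True when a response carries no associations (hypothetical helper)."""
    return not payload.get("associations") and not payload.get("numFound")


# The payload reported in this issue for id=NCBIGene:3815.
empty_payload = {
    "compact_associations": None,
    "associations": [],
    "objects": [],
    "facet_pivot": None,
    "start": None,
    "facet_counts": {"taxon_label": {}, "isa_partof_closure": {}},
    "numFound": None,
}

for gene_id in ("UniProtKB:P10721", "NCBIGene:3815"):
    print(function_url(gene_id))

print(is_empty_result(empty_payload))
```

Fetching each URL with any HTTP client and passing the decoded JSON to `is_empty_result` would show whether the two identifiers still behave differently.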
jmcmurry commented 6 years ago

First, some background. In Monarch, proteins and genes are related but not considered precisely equivalent.

Just to disambiguate the issues involved here:

We support equivalence of genes (in this case Orphanet:122862, KEGG-hsa:3815, HGNC:6342, ENSEMBL:ENSG00000157404, OMIM:164920), but protein products are another

cbizon commented 6 years ago

Yes, agreed on the distinction between gene and protein.

The only reason I tried UniProtKB at all was because of this comment in the api documentation:

"Additionally, for some species such as Human, GO has the annotation attached to the UniProt ID. Again, this should be transparently handled; e.g. you can use NCBIGene:6469, and this will be mapped behind the scenes for querying."

jmcmurry commented 6 years ago

I see; I didn't know that :) It sounds like we do not have any axioms relating this specific gene to this specific protein. Will ask @TomConlin to investigate.

TomConlin commented 6 years ago

guessing it will be related to https://github.com/monarch-initiative/dipper/pull/538/files#diff-b1901a5cdc0e5f9c2557ec04b699a958L1

previously NCBIGene would have been the clique leader

but this isn't a good reason for the query not to have returned with HGNC:6342

kshefchek commented 6 years ago

This query hits the Gene Ontology Solr server, so I don't think the usual Monarch considerations apply, cc @kltm

kltm commented 6 years ago

All checks we have for services are positive. This would be for @selewis or @cmungall.

selewis commented 6 years ago

Just to note that the clique leader ID for a (human) gene is HGNC (not NCBIGene), and for the (human) protein it is UniProtKB.
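
To illustrate the clique-leader idea, here is a rough sketch of picking a leader from a set of equivalent gene identifiers by prefix priority. Only "HGNC first for human genes" comes from this thread. The rest of the priority order, the `clique_leader` function, and the assumption that NCBIGene:3815 sits in the same clique as the identifiers listed earlier are illustrative, and this is not how Monarch actually resolves cliques.

```python
# Hypothetical prefix priority for human genes; only "HGNC first" is from the thread.
LEADER_PRIORITY = ["HGNC", "NCBIGene", "ENSEMBL", "OMIM"]


def clique_leader(clique, priority=LEADER_PRIORITY):
    """Pick the identifier whose prefix ranks highest in the priority list."""

    def rank(curie):
        prefix = curie.split(":", 1)[0]
        # Prefixes missing from the list sort after every listed prefix.
        return priority.index(prefix) if prefix in priority else len(priority)

    return min(clique, key=rank)


# Equivalent gene identifiers mentioned earlier in this thread.
gene_clique = [
    "NCBIGene:3815",
    "Orphanet:122862",
    "KEGG-hsa:3815",
    "HGNC:6342",
    "ENSEMBL:ENSG00000157404",
    "OMIM:164920",
]

print(clique_leader(gene_clique))  # HGNC:6342
```

Under this scheme a query by any member of the clique would first be rewritten to the leader, which is why a lookup by NCBIGene:3815 would be expected to behave like one by HGNC:6342.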

On Mon, Nov 6, 2017 at 1:15 PM, kltm wrote:

> All checks we have for services are positive. This would be for @selewis or @cmungall.
