monarch-initiative / biolink-api

API for linked biological knowledge
https://api.monarchinitiative.org/api/
BSD 3-Clause "New" or "Revised" License

ribbon / slimmer: gene lookup for UniProtKB IDs should grab human genes #207

Closed nathandunn closed 6 years ago

nathandunn commented 6 years ago

https://github.com/geneontology/ribbon/issues/47

deepakunni3 commented 6 years ago

Can you add more information about this issue? Is the idea to add a new call to the BioLink API?

selewis commented 6 years ago

No, it's a matter of what kind of entity is being referred to. A "gene" is an entity made of nucleic acid; from it is derived (translated) a "protein", an entity made of amino acids. Colloquially the two terms are often used interchangeably. For protein-coding genes there should be at least one corresponding protein.

In GO, some groups (MODs mostly) annotate to genes, but UniProt annotates to proteins (for human, but also for lots of other species). We might query with an HGNC gene ID, but what GO has annotated for that gene is the corresponding protein (UniProt ID). So we'd locate the protein for that HGNC gene to gather the annotations, and then return them as the annotations to the gene.

selewis commented 6 years ago

Of course they return protein IDs for other species, because they annotated all sorts of creatures and plants. As I said, we just need to map from the HGNC gene ID (which the user enters) to the corresponding protein ID (for human) to get the annotations, and then restore the HGNC ID used in the query.
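The gene-to-protein round trip described above could be sketched roughly as follows. This is a minimal illustration, not the BioLink API implementation: `GENE_TO_PROTEIN`, `ANNOTATIONS`, and `annotations_for_gene` are hypothetical names, and the in-memory tables are toy data standing in for whatever mapping source and annotation store the service actually uses.

```python
from typing import Dict, List

# Hypothetical HGNC gene -> human UniProtKB protein lookup (toy data).
GENE_TO_PROTEIN: Dict[str, str] = {"HGNC:11998": "UniProtKB:P04637"}

# Toy annotation store keyed by the entity that was annotated (the protein).
ANNOTATIONS: Dict[str, List[dict]] = {
    "UniProtKB:P04637": [
        {"subject": {"id": "UniProtKB:P04637"}, "object": {"id": "GO:0006915"}},
    ],
}


def annotations_for_gene(gene_id: str) -> List[dict]:
    """Gather annotations made to the gene's protein, reported under the gene ID."""
    protein_id = GENE_TO_PROTEIN.get(gene_id)
    if protein_id is None:
        return []  # no known protein for this gene in the toy mapping

    results = []
    for row in ANNOTATIONS.get(protein_id, []):
        restored = dict(row)  # shallow copy so the stored row is not modified
        # Restore the ID the user queried with as the subject of the annotation.
        restored["subject"] = {**row["subject"], "id": gene_id}
        results.append(restored)
    return results
```

Calling `annotations_for_gene("HGNC:11998")` returns the protein's annotation rows with the subject ID set back to the HGNC ID from the query; an unmapped gene ID returns an empty list.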

For the "based on" field (aka "with", aka evidence_with), we don't know ahead of time whether the protein ID is human or from some other taxon. That's why it will be helpful to have fuller entries returned in the JSON field "evidence_with", constructed like the "subject" entries.
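As a rough sketch of what "fuller entries" might look like, the snippet below expands bare evidence_with IDs into subject-like objects that carry a taxon. The field layout (`id`, `label`, `taxon`), the `ENTITY_INFO` table, and the `expand_evidence_with` helper are assumptions made for illustration, and the records are illustrative rather than taken from the API's actual output.

```python
from typing import Dict, List

# Hypothetical lookup of entity details, shaped like a "subject" entry.
ENTITY_INFO: Dict[str, dict] = {
    "UniProtKB:P04637": {
        "id": "UniProtKB:P04637",
        "label": "TP53",
        "taxon": {"id": "NCBITaxon:9606", "label": "Homo sapiens"},
    },
    "UniProtKB:P02340": {
        "id": "UniProtKB:P02340",
        "label": "Trp53",
        "taxon": {"id": "NCBITaxon:10090", "label": "Mus musculus"},
    },
}


def expand_evidence_with(ids: List[str]) -> List[dict]:
    """Turn bare IDs into subject-like objects; unknown IDs keep only their ID."""
    return [ENTITY_INFO.get(entity_id, {"id": entity_id}) for entity_id in ids]


def is_human(entry: dict) -> bool:
    """Check the taxon on an expanded entry (False when no taxon is present)."""
    return entry.get("taxon", {}).get("id") == "NCBITaxon:9606"
```

With entries shaped this way, a client can tell from each evidence_with entry whether it refers to a human protein, instead of guessing from the bare ID.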

nathandunn commented 6 years ago

Deleted my comment. I couldn't recollect the exact bug. Thanks for the clarification, @selewis.