NCATS-Gamma / robokop

Master UI for ROBOKOP
MIT License

Synonymize PR with chemicals? #359

Open cbizon opened 5 years ago

cbizon commented 5 years ago

@cbizon GO 'insulin binding' should have a 'has input' link to 'PR:000009054' (insulin): http://yasgui.org/short/huHKwSkLe

The trick will be to make ROBOKOP know what to do with PRO terms. There is a subclass of this term in PRO (not loaded in ubergraph) which is human insulin protein, and has a link to the HGNC gene ID. But since GO references the superclass, if you want to assume human it would take some additional work to know that you can go down to the subclass.

Originally posted by @balhoff in https://github.com/NCATS-Gamma/robokop/issues/353#issuecomment-497741523
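The "additional work" in the quoted comment, going from the superclass that GO references down to a species-specific subclass and then to its gene, could be sketched roughly as follows. This is only an illustration under stated assumptions: the lookup tables are hand-written toy data, not how ROBOKOP or ubergraph store anything, and every identifier except `PR:000009054` and the NCBITaxon IDs is a placeholder.

```python
# Toy lookup tables. Only PR:000009054 (from the comment above) and the
# NCBITaxon IDs are real; the other identifiers are hypothetical placeholders.
SUBCLASSES = {
    "PR:000009054": ["PR:EXAMPLE_HUMAN_INSULIN", "PR:EXAMPLE_ORTHOLOG_INSULIN"],
}
TAXON = {
    "PR:EXAMPLE_HUMAN_INSULIN": "NCBITaxon:9606",       # human
    "PR:EXAMPLE_ORTHOLOG_INSULIN": "NCBITaxon:10090",   # mouse
}
GENE = {
    "PR:EXAMPLE_HUMAN_INSULIN": "HGNC:EXAMPLE_INS",
}


def genes_for_taxon(term, taxon="NCBITaxon:9606"):
    """Walk down the subclass tree from `term` and collect the gene IDs
    linked to subclasses that belong to `taxon`."""
    found = []
    stack = [term]
    seen = set()
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        # Only species-specific subclasses carry a gene link in this toy data.
        if TAXON.get(current) == taxon and current in GENE:
            found.append(GENE[current])
        stack.extend(SUBCLASSES.get(current, []))
    return found


human_genes = genes_for_taxon("PR:000009054")
```

Defaulting `taxon` to human makes the "assume human" decision explicit and easy to override, rather than hiding it inside the traversal.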

cbizon commented 5 years ago

This is going to get uncomfortably tied up with the fact that we've identified genes with gene products, i.e. we treat them as the same entity.

cbizon commented 4 years ago

Even though the name of the ontology is Protein, the entries are mostly genes. Many are identified using UniProtKB IDs, which, as we discussed above, are really at the gene level. There are also a bunch of other entries that are higher-level nodes. For instance, there is a human insulin node, which is equivalent to a UniProtKB ID (and hence to an HGNC ID, etc.). It is a child of "insulin", whose other children are orthologs of human insulin. That insulin node is in turn a child of an insulin family (also a PR node).

This is analogous to the way that, e.g., chemical families are handled in ChEBI, but it's somewhat different from the way that gene families are handled in PANTHER or HGNC, where the families are considered separate entities.
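To make the difference concrete, here is a minimal sketch of the two modeling styles. All names are illustrative placeholders, and neither structure is meant to reflect the actual schema of PRO, ChEBI, PANTHER, or HGNC.

```python
# Style 1 (PRO / ChEBI-like): the family is just an ancestor in the same
# subclass hierarchy as the specific entries.
SUBCLASS_OF = {
    "human insulin": "insulin",
    "ortholog insulin": "insulin",
    "insulin": "insulin family",
}


def ancestors(node):
    """Follow subclass-of edges upward; the family shows up as an ancestor."""
    out = []
    while node in SUBCLASS_OF:
        node = SUBCLASS_OF[node]
        out.append(node)
    return out


# Style 2 (PANTHER / HGNC-like): the family is a separate entity, and genes
# are attached to it by an explicit membership relation.
FAMILY_MEMBERS = {
    "example gene family": {"GENE_A", "GENE_B"},  # hypothetical members
}


def families_of(gene):
    """Look up families through membership, not through the class hierarchy."""
    return sorted(f for f, members in FAMILY_MEMBERS.items() if gene in members)
```

In the first style a query for "everything under the family" is a plain subclass traversal; in the second the family never appears among a gene's ancestors, so it has to be reached through the membership relation instead.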