NCATS-Tangerine / tkbio

NCATS Translator Release 4.0 of Knowledge.Bio as a distributed knowledge map management platform
http://tkbio.medgeninformatics.net

Management of gene orthologs (Taxonomic context of concepts in general) #41

Open RichardBruskiewich opened 6 years ago

RichardBruskiewich commented 6 years ago

Gene symbols (e.g. SSH) are somewhat degenerate with respect to orthologous loci: the same symbol can name loci in different taxa. This presents two challenges for TKBIO:

1) The nature of the difference is not explicit in the UI: two identical symbol names may correspond to loci from distinct taxa (species), but this is not easy to ascertain from the UI.

2) Strict concept equivalence does not merge subgraphs anchored on orthologous loci. Although intellectually honest, this looks odd on the concept map, because disconnected subgraphs appear with duplicated gene symbol labels.

Although it is not strictly an aggregation of "equivalent concepts", merging gene concepts based on orthology is nonetheless a comparable task, in the spirit of comparative functional genomics à la Eisen.

At the same time, it would be helpful to have some visible mechanism for showing the taxon (species) of the (gene) concepts being displayed: colour, tooltip labels, direct text labels, (?)
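
To make the orthology-based merge concrete, here is a minimal sketch in Python (not TKBIO code). It assumes we already have a list of concept ids and a list of ortholog pairs from some external source; the function name `merge_by_orthology` and the ids used below are hypothetical and purely illustrative. It uses a plain union-find to group concepts that are connected by orthology, which is one possible way to decide which subgraph anchors to merge.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple


def merge_by_orthology(
    concept_ids: Sequence[str],
    ortholog_pairs: Sequence[Tuple[str, str]],
) -> List[List[str]]:
    """Group concept ids into clusters connected by ortholog pairs.

    Both inputs are assumptions for illustration: the ids and the source
    of the ortholog pairs are not defined anywhere in this issue.
    """
    parent: Dict[str, str] = {cid: cid for cid in concept_ids}

    def find(x: str) -> str:
        # Walk up to the cluster root, shortening the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in ortholog_pairs:
        # Ignore pairs that mention concepts we do not know about.
        if a in parent and b in parent:
            parent[find(a)] = find(b)

    clusters: Dict[str, List[str]] = defaultdict(list)
    for cid in concept_ids:
        clusters[find(cid)].append(cid)

    # Sort for a stable, readable result.
    return sorted(sorted(members) for members in clusters.values())


# Hypothetical ids: two loci sharing a symbol across taxa, plus one unrelated gene.
ids = ["taxonA:SSH1", "taxonB:SSH1", "taxonA:XYZ9"]
pairs = [("taxonA:SSH1", "taxonB:SSH1")]
clusters = merge_by_orthology(ids, pairs)
```

Each resulting cluster could be drawn as a single merged node, while the individual ids (and their taxa) stay available for display, so the merge does not hide the fact that the loci are distinct.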

lhannest commented 6 years ago

At the moment you can view this (if the information is available) by clicking on the concept in the graph and bringing up its details. But I think adding it to the label would be clearest, so that it would read "SSH1 [taxon: Saccharomyces cerevisiae]", "SSH1 [taxon: Homo sapiens]". Or maybe even just "SSH1 (Homo sapiens)".

This could appear both in the data table and the graph. What I think I will do is this: when concepts are being loaded (for both concept and statement searches), if the semantic type is GENE, it will also fetch the concept details and try to get the taxon. That being said, there is no standard way in which the taxon is represented in that details map. I will have to grab the first key that contains the string "taxon" and use its value.

Edit: We don't get a concept's semantic type in the statements call, so we would have to do this for every concept. This sort of issue is starting to make GraphQL look much more attractive.
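
The lookup described above can be sketched as follows. This is an illustration in Python, not the TKBIO implementation: the function names `find_taxon` and `label_with_taxon` and the example key `in_taxon` are hypothetical, and matching the key case-insensitively and skipping empty values are assumptions that go beyond "first key that contains the string taxon".

```python
from typing import Mapping, Optional


def find_taxon(details: Mapping[str, str]) -> Optional[str]:
    """Return the value of the first key containing "taxon", if any.

    Assumptions: the match is case-insensitive and empty values are
    skipped; the details map has no agreed key for the taxon.
    """
    for key, value in details.items():
        if "taxon" in key.lower() and value:
            return value
    return None


def label_with_taxon(name: str, details: Mapping[str, str]) -> str:
    """Append the taxon to a concept label when one can be found."""
    taxon = find_taxon(details)
    return f"{name} ({taxon})" if taxon else name


# Hypothetical details maps, as a beacon might return them.
human = label_with_taxon("SSH1", {"symbol": "SSH1", "in_taxon": "Homo sapiens"})
unknown = label_with_taxon("SSH1", {"symbol": "SSH1"})
```

Because the label falls back to the bare name when no taxon key is found, the same function could be applied to every concept, which fits the case where the semantic type is not known in advance.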

cmungall commented 6 years ago

I think that beacons should make best efforts to disambiguate shared labels. Gene beacons should append taxon info by default.

(Sent by email in reply to Lance Hannestad's comment above, 14 Aug 2017, 11:01.)