monarch-initiative / biolink-api

API for linked biological knowledge
https://api.monarchinitiative.org/api/
BSD 3-Clause "New" or "Revised" License

Need functions for determining term-to-term relatedness #206

Open selewis opened 6 years ago

selewis commented 6 years ago

Annotations vary considerably in precision, but for conciseness and to determine coverage we need to be able to answer basic graph traversal questions. For example: given two terms, is one of them a subclass of the other? Or, what is the closest common parent term of two terms? Right now this functionality is missing, so we are either relying on workarounds or completely blocked.
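To make the two questions concrete, here is a minimal sketch over a hypothetical toy graph. The term names, the `PARENTS` mapping, and all three function names are illustrative assumptions and are not part of BioLink or ontobio. The sketch also assumes the graph is acyclic and treats a term as its own candidate, so if A is a subclass of B the closest common parent of A and B is B.

```python
from collections import deque

# Hypothetical toy ontology (not real GO data): term -> direct parents.
PARENTS = {
    "A": {"B", "C"},
    "B": {"D"},
    "C": {"D"},
    "D": {"ROOT"},
    "E": {"C"},
    "F": {"B", "C"},
    "ROOT": set(),
}


def ancestors(term, parents=PARENTS):
    """Return every proper ancestor of `term` (breadth-first walk upward)."""
    seen = set()
    queue = deque(parents.get(term, ()))
    while queue:
        node = queue.popleft()
        if node not in seen:
            seen.add(node)
            queue.extend(parents.get(node, ()))
    return seen


def is_subclass(a, b, parents=PARENTS):
    """True if `b` is reachable from `a` by following parent links."""
    return b in ancestors(a, parents)


def closest_common_parents(a, b, parents=PARENTS):
    """Return the shared ancestors that have no other shared ancestor below them.

    Each term counts as its own candidate, so the result for a term and one
    of its ancestors is that ancestor. More than one term can be returned.
    """
    common = (ancestors(a, parents) | {a}) & (ancestors(b, parents) | {b})
    return {
        term
        for term in common
        if not any(term in ancestors(other, parents) for other in common if other != term)
    }
```

With this toy data, `closest_common_parents("A", "F")` yields two terms, which is why a server-side answer would need to return a list rather than a single parent.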

lpalbou commented 6 years ago

Hi Suzy,

A partial solution to your problem is https://api.geneontology.cloud/go/GO_0060070/hierarchy, which returns both the parents and the children of a single GO term (not two). I also plan to add a more general /relationship route to explore all other kinds of term-to-term relatedness, but I could also create a path where you could ask the same question with two GO terms instead of one.

And I am finishing the transfer of this API into BioLink too.

Laurent-Philippe

selewis commented 6 years ago

Quite nice, but it doesn't quite fit the bill yet. You would still have to traverse the graph in this JSON structure to answer the simple true/false question 'is A a subclass of B?' or, conversely, 'is B a subclass of A?'. It would also be useful to have 'what is the closest parental term shared by A and B?'. That would bury all of the repetitive traversal work inside the server code.

It would be great to have this in BioLink.

lpalbou commented 6 years ago

Correct, this query is general-purpose, but I should be able to create the two specific queries you mentioned by next week.

selewis commented 6 years ago

Is this just is_a relations?

We also need to know whether a term is flagged as 'do_not_manually_annotate' or 'do_not_annotate'.

selewis commented 6 years ago

In Biolink?

deepakunni3 commented 6 years ago

@selewis Yes, @lpalbou and I had a quick chat.

We can add a couple of routes to biolink-api that give a more direct answer, as opposed to the JSON graph normally returned.

lpalbou commented 6 years ago

@selewis sorry, I am a bit late on this but I have deployed a route this morning to answer your first question:

http://api.geneontology.cloud/association/subclass/{goid1}/{goid2} => returns true if and only if goid1 is_a or part_of goid2 (the question is directional, so the order of the two IDs matters)

I have also deployed a sharedclass route: http://api.geneontology.cloud/association/sharedclass/{goid1}/{goid2} => returns the terms (derived from is_a and part_of) that the two terms share
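A minimal client sketch for the two routes above, using only the Python standard library. The URL templates come from this comment. The helper names, the choice to URL-quote each ID, the example IDs (borrowed from other URLs in this thread), and the assumption that the routes return JSON are all illustrative and unconfirmed, and the service may not be reachable at this address.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Base path taken from the routes quoted in the thread.
BASE_URL = "http://api.geneontology.cloud/association"


def route_url(route, goid1, goid2, base_url=BASE_URL):
    """Build the URL for the 'subclass' or 'sharedclass' route.

    Argument order matters for 'subclass': the question asked is whether
    goid1 is a subclass of goid2, not the reverse.
    """
    if route not in ("subclass", "sharedclass"):
        raise ValueError(f"unknown route: {route!r}")
    # Quoting the IDs is an assumption; the expected ID format is not
    # stated in the thread.
    return f"{base_url}/{route}/{quote(goid1, safe='')}/{quote(goid2, safe='')}"


def fetch_json(url, timeout=10):
    """Fetch a URL and decode the body as JSON (response format is assumed)."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    # Example IDs reused from URLs elsewhere in this thread.
    url = route_url("subclass", "GO_0060070", "GO_0036288")
    print(url)
    # Uncomment to query the live service, if it is available:
    # print(fetch_json(url))
```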

To answer the closest-common-parent question for two terms, do you want parents from both is_a and part_of relations? Note that this query could return several parents (example).

I am waiting for a PR on ontobio (https://github.com/biolink/ontobio/pull/217), but if this looks good to you, I'll do a second PR to deploy these routes on BioLink. Following BioLink syntax, they will be mapped respectively to (@cmungall, your opinion?):

Notes:

selewis commented 6 years ago

It would be nice if the first one provided a way to indicate which relationships to follow, as Deepak (I think) did for the slimmer code.

For the second, yes return all of them. If possible it would be useful to know the route taken to get there for each of the two children.
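Both requests here (choosing which relationships to follow, and reporting the route taken) can be sketched on a hypothetical toy graph. The `EDGES` triples, the term names, and `paths_to` are illustrative assumptions, not the ontobio or BioLink implementation, and the recursion assumes the graph has no cycles.

```python
# Hypothetical toy graph (not real GO data): (child, relation, parent).
EDGES = [
    ("A", "is_a", "B"),
    ("A", "part_of", "C"),
    ("B", "is_a", "D"),
    ("C", "is_a", "D"),
    ("E", "is_a", "C"),
]


def paths_to(term, target, relations, edges=EDGES):
    """Return every upward path from `term` to `target`.

    Only edges whose relation is in `relations` are followed. Each path is
    a flat list alternating terms and relations, for example
    ["A", "is_a", "B", "is_a", "D"]. An empty list means `target` is not
    reachable under the chosen relations.
    """
    if term == target:
        return [[term]]
    found = []
    for child, relation, parent in edges:
        if child == term and relation in relations:
            for tail in paths_to(parent, target, relations, edges):
                found.append([term, relation] + tail)
    return found
```

Calling `paths_to` once per child against a shared parent gives the route taken from each of the two terms. A non-empty result also doubles as a subclass test restricted to the chosen relations.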

lpalbou commented 6 years ago

@selewis I also saw your question about the 'do_not_manually_annotate' and 'do_not_annotate' tags. There is no dedicated route for this question, but you can check whether those tags are present in the subsets section of this general GO-term query: https://api.geneontology.cloud/go/GO_0036288

(This will be available on BioLink when the PRs are merged.)
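A small sketch of how a client might check those tags once it has the term record. The exact response shape is not shown in this thread, so this assumes the record is a dict whose `subsets` entry is a list of strings, and that each tag appears at the end of a subset name or URI. The function name and the sample subset name are hypothetical.

```python
# Tags named in the thread.
FLAGS = ("do_not_manually_annotate", "do_not_annotate")


def annotation_flags(term_record):
    """Report which of the two tags appear in a term record's subsets.

    Assumes `term_record["subsets"]` is a list of strings (an unconfirmed
    guess at the response shape). Matching on the suffix tolerates both a
    bare subset name and a full URI ending in that name.
    """
    subsets = term_record.get("subsets") or []
    return {
        flag: any(str(subset).endswith(flag) for subset in subsets)
        for flag in FLAGS
    }


if __name__ == "__main__":
    # Hypothetical record; the subset name is an assumed example.
    sample = {"subsets": ["gocheck_do_not_manually_annotate"]}
    print(annotation_flags(sample))
```

Suffix matching is safe for these two tags because a name ending in 'do_not_manually_annotate' does not also end in 'do_not_annotate'.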

lpalbou commented 6 years ago

@selewis I have updated the API to be more consistent with BioLink syntax and to determine if two terms are related for any of is_a, part_of or regulates relationships: