NCATSTranslator / Feedback

A repo for tracking gaps in Translator data and finding ways to fill them.

Remove has_part / part_of edges - "charcoal, activated", aka carbon atom #863

Closed gglusman closed 4 weeks ago

gglusman commented 2 months ago

When revisiting what drugs may treat AHC, 471 results include a path that looks like this: image

The intermediate node labeled as "Charcoal, Activated" is, according to ChEBI, "carbon atom". Thus, the logic of this path is that since flunarizine is being studied to treat AHC, and flunarizine has a carbon atom, any molecule that also happens to have a carbon atom may treat AHC. This even includes charcoal itself: image There is even this path, inferring that you can somehow treat AHC with carbon atoms: image

Things that can be done here:
1) Correct the preferred name for the carbon atom. It is very misleading to use "activated charcoal" in its place. Flunarizine doesn't have activated charcoal in it.
2) Blocklist that entity entirely.
3) Stop using this ludicrous reasoning pattern - see https://github.com/NCATSTranslator/Feedback/issues/468#issuecomment-1674978797, https://github.com/NCATSTranslator/Feedback/issues/468#issuecomment-1901240785 and many more instances in many issues. I know, the score contribution of these paths has been reduced, but they still lead to hundreds of pointless "results" that are just noise.
4) Use that reasoning pattern only when really warranted. If the desire to keep it is because "sometimes it is correct", it would make sense to identify those instances in which it is correct, and encode those into valuable reasoning rules.
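To make options 2 and 3 concrete, here is a minimal sketch of how paths that route through a blocklisted node via has_part / part_of edges could be dropped. It does not reflect how Aragorn or any other Translator component actually implements this. The `Edge` type, the `filter_paths` helper, the `EX:` identifiers and the `EX:studied_to_treat` predicate are all hypothetical placeholders; only `MONDO:0016241` (AHC) comes from the query in this issue.

```python
from typing import NamedTuple


class Edge(NamedTuple):
    """One hop in a result path (hypothetical, simplified structure)."""
    subject: str
    predicate: str
    object: str


# Predicates behind the "shares a part" reasoning pattern.
PART_PREDICATES = {"biolink:has_part", "biolink:part_of"}

# Hypothetical blocklist; "EX:carbon_atom" is a placeholder, not a real CURIE.
BLOCKLIST = {"EX:carbon_atom"}


def uses_blocklisted_part(path, blocklist=BLOCKLIST):
    """True if any has_part/part_of edge in the path touches a blocklisted node."""
    return any(
        edge.predicate in PART_PREDICATES
        and (edge.subject in blocklist or edge.object in blocklist)
        for edge in path
    )


def filter_paths(paths, blocklist=BLOCKLIST):
    """Keep only the paths that do not rely on a blocklisted part."""
    return [p for p in paths if not uses_blocklisted_part(p, blocklist)]


paths = [
    # The noisy pattern: some drug has a carbon atom, which is part of flunarizine.
    [
        Edge("EX:some_drug", "biolink:has_part", "EX:carbon_atom"),
        Edge("EX:carbon_atom", "biolink:part_of", "EX:flunarizine"),
        Edge("EX:flunarizine", "EX:studied_to_treat", "MONDO:0016241"),
    ],
    # A direct path that does not depend on a shared part.
    [Edge("EX:flunarizine", "EX:studied_to_treat", "MONDO:0016241")],
]

kept = filter_paths(paths)
print(len(kept))  # only the direct path remains
```

A blocklist like this only covers option 2. Option 4 would need the opposite: an allowlist of part nodes for which the shared-part inference is known to be meaningful.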

gaurav commented 2 months ago

> Correct the preferred name for the carbon atom. It is very misleading to use "activated charcoal" in its place. Flunarizine doesn't have activated charcoal in it.

Thanks for reporting this! We're tracking this at https://github.com/TranslatorSRI/Babel/issues/304 and I'm hoping to have that fixed in Guppy.

sstemann commented 1 month ago

I think this ticket is less about the carbon atom being called Charcoal and more about reasoning that results in 471 results, all with the same node in the path. While this started with Charcoal, you can see the same thing if you run this same query and free-text search "flunarizine".

471 results that say "result has part charcoal, which is part of flunarizine" are flooding the results, and the value of that reasoning should be analyzed.
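One rough way to spot this kind of flooding is to count how many results share the same intermediate node. The sketch below assumes each result is reduced to an ordered list of node identifiers from candidate to disease. That representation, the `shared_intermediates` helper and the `EX:` identifiers are hypothetical, not the actual Translator result format.

```python
from collections import Counter


def shared_intermediates(results, min_fraction=0.5):
    """Return intermediate nodes that appear in at least min_fraction of results.

    Each result is an ordered list of node IDs: [candidate, ..., disease].
    """
    counts = Counter()
    for nodes in results:
        # Skip the first (candidate) and last (disease) node;
        # set() makes a node count once per result.
        counts.update(set(nodes[1:-1]))
    threshold = min_fraction * len(results)
    return {node: n for node, n in counts.items() if n >= threshold}


results = [
    ["EX:drug_a", "EX:carbon_atom", "EX:flunarizine", "MONDO:0016241"],
    ["EX:drug_b", "EX:carbon_atom", "EX:flunarizine", "MONDO:0016241"],
    ["EX:drug_c", "EX:gene_x", "MONDO:0016241"],
]

flagged = shared_intermediates(results)
print(flagged)
```

Nodes flagged this way are candidates for review rather than automatic removal, since a legitimately central gene or pathway would also show up in many results.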

https://ui.test.transltr.io/main/results?l=Alternating%20Hemiplegia%20Of%20Childhood&i=MONDO:0016241&t=0&r=0&q=64aee208-61a1-4084-94fa-f65fe797aa17

image

image

This reasoning issue is not resolved in Fugu/Test

sierra-moxon commented 1 month ago

From TAQA: will follow up in Aragorn. The cache was flushed yesterday, and we will test; we need to flush the cache in PROD too. The "fix" will be for the "has part"/"part of" reasoning path.

A related use case was also discussed. Chris B will take a look to see whether some curation of the predicted reasoning paths can be done. We can also tweak the scoring (the related case is returning a score of "5") by removing these paths from the scoring mechanism. We also see the arrow direction in the UI implying causality (because the predicate is inverted).

cbizon commented 1 month ago

@sierra-moxon do you know what the related use case was? I think it was "decreases DHODH". When I looked, though, I think that the offending paths were coming from Improving rather than Ranking (I just assumed they were ours). But maybe I was looking at the wrong thing...

cbizon commented 1 month ago

After clearing the cache, I reran this one on TEST, PK: f29b892f-33cb-46f1-8129-98e2c9207936. The has_part / part_of edges are removed.

sierra-moxon commented 4 weeks ago

closing as complete!