NCATSTranslator / Feedback

A repo for tracking gaps in Translator data and finding ways to fill them.

causes, does not treat #362

Open gglusman opened 1 year ago

gglusman commented 1 year ago

https://ui.test.transltr.io/results?l=Malignant%20Hyperthermia%20Of%20Anesthesia&t=0&q=c4d77595-4f01-4f92-9c77-9e7081bbf410

(Disclosure: long ago, a cousin of mine died from malignant hyperthermia on the operating bed.)

What drugs may treat malignant hyperthermia? At the top of the list is Halothane, with score 100. One path directly claims 'Halothane treats Malignant Hyperthermia of Anesthesia'. Another path states that 'Halothane is similar to Isoflurane, which treats Malignant Hyperthermia of Anesthesia'.

The problem is, again, that both halothane and isoflurane cause malignant hyperthermia in individuals with relevant RYR1 mutations. The literature is clear on this. The abstract of one of the papers supposedly 'supporting' the assertion starts with this statement: "Malignant hyperthermia (MH) is a pharmacogenetic disorder of skeletal muscle that presents as a hypermetabolic response to potent volatile anesthetic gases such as halothane, [...]"

The good news: Dantrolene scores highly at 99.8.

gglusman commented 1 year ago

Similarly, in https://ui.test.transltr.io/results?l=Scotoma&t=0&q=b764aae8-ab32-4e3f-b4a5-7c8539b39052 the top result for 'what drugs may treat scotoma' is Prednisone. Most of the paths link prednisone to glaucoma, and then 'glaucoma causes scotoma'. The problem is that prednisone causes both the glaucoma and the scotoma. See https://www.ncbi.nlm.nih.gov/books/NBK430903

The next result for the same query is vigabatrin, an anticonvulsant... which also has negative effects on vision.

cbizon commented 1 year ago

The Halothane / Malignant Hyperthermia edge is text-mined, and appears to be coming from both TMKP and Semmed.

The sentences don't look very good, but there are a lot of them. So I doubt that a count filter or a domain/range constraint is going to pick this up.

I'm starting to wonder if we need some kind of adverse event / contraindication filter.

cbizon commented 1 year ago

  1. TMKP and Semmed are going to see whether they can use counter-edges (treats vs causes) to filter, improve, or retrain these.
  2. Do we have structured data already that we can use as a filter? If not, what could we use, and who can bring it in?

cbizon commented 1 year ago

  1. Also, the new treats edge in biolink will help, but that is for the future.
  2. Handling conflicting evidence is part of what we need to do.

cbizon commented 1 year ago

I will investigate what structured data relating to causes is already available to us.

gglusman commented 1 year ago

Some points from the discussion:

Sui pointed out that for this case and similar, there's ground truth we should be introducing from appropriate sources.

Translator indeed should prioritize those for retrieval mode, but ground truth could be used to train/verify results of creative mode too.

Bill: We use that information for future training of models, but we also exclude flagged sentences from our KG export, so the next time we do an export those flagged sentences won't be included as evidence.

We need to find a way to represent 'there is conflicting evidence'.

Perhaps 'X causes Y' could be used as strong 'X does NOT treat Y' - bringing back the negation issue we've been punting.

In cases of conflict, a human needs to be in the loop to curate. We could use a mechanism for adding manual curation that overrides auto-computed content.

Andrew: I actually don't think that "opposite direction" is a huge issue. I think domain experts. I think we need to educate people that incorrect directionality is a common failure mode for many computational processes (for text mining, for reasoning chains, etc.). And personally, I think that Translator hitting a relevant bit of biology in the wrong direction is better than hitting a completely unrelated concept... We are building a research tool, not an automated physician...

andrewsu commented 1 year ago

Regarding point 1 above: the question is whether we should have a filter in text-mined resources to remove X - TREATS - Y when there are many more PMIDs associated with X - CAUSES - Y. This intuition seems to hold for the Halothane / malignant hyperthermia example. SemMedDB currently has 3 PMIDs for TREATS and 32 PMIDs for CAUSES. (link)
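To make the proposed filter concrete, here is a minimal sketch in Python. It is not how SemMedDB or TMKP actually process edges; the function name `drop_outvoted_treats`, the dictionary layout, and the `min_ratio` threshold of 3 are all assumptions made for illustration. Only the halothane counts (3 vs 32) come from the numbers above.

```python
def drop_outvoted_treats(pmid_counts, min_ratio=3.0):
    """Return a copy of pmid_counts without 'treats' edges that are
    heavily outnumbered by the opposing 'causes' edge.

    pmid_counts maps (subject, predicate, object) -> number of PMIDs.
    min_ratio is a hypothetical threshold: a 'treats' edge is dropped
    when the matching 'causes' edge has at least min_ratio times as
    many supporting PMIDs.
    """
    kept = {}
    for (subj, pred, obj), n in pmid_counts.items():
        if pred == "treats":
            n_causes = pmid_counts.get((subj, "causes", obj), 0)
            if n_causes > 0 and n_causes >= min_ratio * n:
                continue  # outvoted by the counter-edge, so skip it
        kept[(subj, pred, obj)] = n
    return kept


# PMID counts quoted in this thread for halothane / malignant hyperthermia.
counts = {
    ("halothane", "treats", "malignant hyperthermia"): 3,
    ("halothane", "causes", "malignant hyperthermia"): 32,
}

filtered = drop_outvoted_treats(counts)
print(sorted(filtered))
```

With these counts the TREATS edge is removed and the CAUSES edge is kept. Where to set the threshold, and whether to drop the edge or only down-weight it, is exactly the open question here.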

cbizon wrote: "I will investigate what structured data relating to causes is already available to us."

On this point, I'll just note that DrugCentral notes that malignant hyperthermia is a contraindication of halothane (link). Not exactly a CAUSES predicate, but perhaps a useful resource.
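As a rough sketch of how a contraindication list could be applied to results, the Python below flags rather than removes conflicting answers, which is one possible way to surface 'there is conflicting evidence'. The function `flag_contraindicated`, the result dictionaries, and the `conflicting_evidence` field are hypothetical; this does not use any real DrugCentral or Translator API. The scores are the ones reported at the top of this issue.

```python
def flag_contraindicated(results, contraindications):
    """Annotate each result with whether a structured source lists the
    disease as a contraindication of the drug.

    results: list of dicts with 'drug', 'disease' and 'score' keys.
    contraindications: set of (drug, disease) pairs, assumed to have
    been loaded from a curated source beforehand.
    """
    flagged = []
    for r in results:
        conflict = (r["drug"], r["disease"]) in contraindications
        # Copy the result so the input list is left unchanged.
        flagged.append({**r, "conflicting_evidence": conflict})
    return flagged


# The one contraindication pair mentioned in this thread.
contraindications = {("halothane", "malignant hyperthermia")}

# Scores as originally reported in this issue.
results = [
    {"drug": "halothane", "disease": "malignant hyperthermia", "score": 100.0},
    {"drug": "dantrolene", "disease": "malignant hyperthermia", "score": 99.8},
]

for r in flag_contraindicated(results, contraindications):
    print(r["drug"], r["score"], r["conflicting_evidence"])
```

Flagging keeps the answer visible to a curator or to the UI; a stricter variant could drop flagged results or push them to the bottom of the ranking.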

andrewsu commented 1 year ago

I think for the purposes of issue tracking for SemMedDB, this example has been translated into a new issue #392, so I'm going to remove the EPIC: SEMMEDDB/text mining label and unassign myself. (But leaving it open in case anyone wants to follow up on any of the other ideas raised...)

cbizon commented 1 year ago

Leaving aside the semmed issue, which doesn't have much movement, it seems like there may be a generic notion for handling countervailing evidence. I don't see this as a pre-September task. Leaving the issue open, and pushing it to the architecture agenda, since we'll need to figure out whether this is an ARA or higher-level task.

gglusman commented 1 year ago

Update on the halothane / malignant hyperthermia example: upon retesting (ci 8/17) halothane now has a much lower score of 13.5. The only support provided is a paper that doesn't explicitly say causes in the abstract, nor does it say treats... as it's so obvious to the reader that halothane is causal.

sandrine-muller-research commented 11 months ago

Retested today: Halothane comes with 4.28, isoflurane comes with 4.17.