NCATSTranslator / Feedback

A repo for tracking gaps in Translator data and finding ways to fill them.

Curation Error in datasource Monarch #405

Open TranslatorIssueCreator opened 1 year ago

TranslatorIssueCreator commented 1 year ago

Type: Bug Report

URL: https://ui.ci.transltr.io/results?l=Hereditary%20Sensory%20And%20Autonomic%20Neuropathy%20Type%204&i=MONDO:0009746&t=0&q=d93346b4-38ba-4366-a3b7-4d47b9a0304d

ARS PK: 876c9eb4-1d88-47dd-9df1-8d3b2283c08a

Steps to reproduce:

MVP1 on CI env submitted by @sandrine-m

Screenshots:

sandrine-m commented 1 year ago

Result reproducible on Test. Result: Botulinum toxin type A -- Hereditary sensory and autonomic neuropathy type 4

This result has hyperhidrosis as an intermediate node, which surprised me because Hereditary sensory and autonomic neuropathy type 4 is "highly" associated with anhidrosis. It comes from BioThings Explorer (the first of their results according to the ARAX GUI).

Looking into it further on Monarch:

Monarch states that the frequency is very rare.

I found a single paper online (it is unclear how to reach it directly from Monarch) that mentions: "Episodic hyperhydrosis as well as patchy areas of anhidrosis can occur in the same patient." However, that statement is about HSAN II, not HSAN IV.

andrewsu commented 1 year ago

From my perspective, this is a curation issue for the Monarch team. The Orphanet disease page for HSAN IV clearly describes anhidrosis as a phenotype, not hyperhidrosis (as @sandrine-m also noted). But that curation issue aside, everything downstream seems to be working as expected.

(I would propose changing the title because the publication that @sandrine-m cites https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2098750/ isn't actually reported by Translator that I can see. BTE reports infores:monarchinitiative as the primary_knowledge_source.)
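For anyone checking provenance on their own results, here is a minimal sketch of reading the primary knowledge source off an edge. It assumes a TRAPI 1.4-style edge whose `sources` list holds entries with `resource_id` and `resource_role`; the function name and the aggregator identifier in the example are hypothetical, and this is not taken from BTE's code.

```python
from typing import Any, Mapping, Optional


def primary_knowledge_source(edge: Mapping[str, Any]) -> Optional[str]:
    """Return the resource_id of the edge's primary knowledge source, if any.

    Assumes a TRAPI 1.4-style `sources` list on the edge (an assumption,
    not something confirmed in this thread).
    """
    for source in edge.get("sources", []):
        if source.get("resource_role") == "primary_knowledge_source":
            return source.get("resource_id")
    return None


# Illustrative edge; "infores:example-aggregator" is a made-up identifier.
example_edge = {
    "predicate": "biolink:has_phenotype",
    "sources": [
        {
            "resource_id": "infores:example-aggregator",
            "resource_role": "aggregator_knowledge_source",
        },
        {
            "resource_id": "infores:monarchinitiative",
            "resource_role": "primary_knowledge_source",
        },
    ],
}

print(primary_knowledge_source(example_edge))
```

An edge with no `sources` entry, or none marked as primary, yields `None` rather than raising.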

sandrine-m commented 1 year ago

I agree with you @andrewsu, would this title be better?

sierra-moxon commented 1 year ago

@putmantime - do you think there is a curation error here?

putmantime commented 1 year ago

Just looked at this. It appears to be a curation error in the HPO annotations file that we use, and we are passing it along. Reporting it to them now.

We map that ORPHA:642 to MONDO:0009746 on ingest to Monarch KG.

We will work with HPO to fix it, and the fix should be in a new build soon.

This does raise the issue that we soon need to work with @newgene to update BTE to get Monarch data from our new Graph and beta API https://api-v3.monarchinitiative.org/v3/docs

sierra-moxon commented 6 months ago

@kschaper - any idea if this is something the curation team can take a look at?

RichardBruskiewich commented 6 months ago

Hi @sierra-moxon, if I'm reading the end point of this issue correctly, the data curation may already have been done (by HPO?) and the repaired data ingested into the latest Monarch graph. If so, BTE needs to point to the latest Monarch API (suggested above to be https://api-v3.monarchinitiative.org/v3/docs).

Maybe @kschaper has to be the one to do (or delegate) this, since I am no longer a funded participant in the Monarch team...my budget ran out last January ;-(

kevinschaper commented 6 months ago

This page has Orphanet's phenotype list for ORPHA:642: https://www.orpha.net/en/disease/sign/642

I see both Anhidrosis in Very Frequent and Hyperhidrosis in Very Rare.

No publications to dig deeper though, sadly.

kevinschaper commented 6 months ago

I see that I currently have infores:hpo-annotations as the primary knowledge source; I'll fix that so that it's Orphanet.

How is the UI handling frequency qualifiers / quantifiers right now?

A bare biolink:has_phenotype predicate can turn out to be misleading in a surprising number of ways: negated=True, frequency_qualifier=HP:0040284 (Very Rare) or HP:0040285 (Excluded) (though I don't see that we have any annotations with Excluded), or has_percentage being a low value. (I also don't see zeroes represented that way in HPOA data, thankfully, but I do see values less than 1%.)
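To make those cases concrete, here is a rough sketch of a check a consumer could run before treating a has_phenotype edge as a plain association. It assumes the edge is a flat dict with `negated`, `frequency_qualifier`, and `has_percentage` keys (real TRAPI responses carry these differently), and the function name and the 1% default threshold are my own choices for illustration, not anything the UI or Monarch currently does.

```python
from typing import Any, Mapping, Optional

VERY_RARE = "HP:0040284"  # HPO frequency term: Very rare
EXCLUDED = "HP:0040285"   # HPO frequency term: Excluded


def phenotype_edge_caveat(
    edge: Mapping[str, Any], min_percentage: float = 1.0
) -> Optional[str]:
    """Return a short reason if a has_phenotype edge needs a caveat, else None.

    The flat edge layout and the threshold are assumptions for this sketch.
    """
    if edge.get("predicate") != "biolink:has_phenotype":
        return None
    if edge.get("negated"):
        return "negated"
    frequency = edge.get("frequency_qualifier")
    if frequency == EXCLUDED:
        return "excluded"
    if frequency == VERY_RARE:
        return "very rare"
    percentage = edge.get("has_percentage")
    if percentage is not None and percentage < min_percentage:
        return "low percentage"
    return None


# The HSAN IV / hyperhidrosis edge from this issue, in the assumed flat form.
hyperhidrosis_edge = {
    "subject": "MONDO:0009746",
    "predicate": "biolink:has_phenotype",
    "frequency_qualifier": VERY_RARE,
}

print(phenotype_edge_caveat(hyperhidrosis_edge))
```

A UI could surface the returned reason next to the edge, or down-weight such edges, instead of presenting them the same way as a Very Frequent phenotype.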

sstemann commented 3 months ago

It looks like the edge is still from HPO, in Test and in Prod.

In Prod, the overall score for botulinum toxin type A is 3.81.