monarch-initiative / mondo

Mondo Disease Ontology
http://obofoundry.org/ontology/mondo
Creative Commons Attribution 4.0 International

OMIM Phenoseries not captured #753

Closed pnrobinson closed 3 years ago

pnrobinson commented 5 years ago

I am trying to figure out how MONDO represents Phenoseries. Noonan syndrome 1 is OMIM:163950, and this corresponds to the phenoseries PS163950 (https://omim.org/phenotypicSeries/PS163950).

However, the MONDO entry for Noonan syndrome 1 does not reference the grouping class (id: MONDO:0018997, name: Noonan syndrome).

It seems that most of the other Noonan subtypes are "is a" MONDO:0018997. However, Noonan 1 "is a" Noonan 7 in the mondo.obo that I just downloaded (OLS displays it correctly).

Is this a bug or am I misunderstanding something?

[Term]
id: MONDO:0008104
name: Noonan syndrome 1
def: "Noonan syndrome caused by mutations in the PTPN11 gene." [NCIT:C75459]
synonym: "female pseudo-Turner syndrome" RELATED [OMIM:163950]
synonym: "Male Turner syndrome" RELATED [OMIM:163950]
synonym: "Noonan syndrome" RELATED [OMIM:163950]
synonym: "Noonan syndrome 1" EXACT [MONDO:Lexical, OMIM:163950]
synonym: "Noonan syndrome 1; NS1" RELATED [OMIM:163950]
synonym: "Noonan syndrome type 1" EXACT [DOID:0060578, MONDORULE:1, OMIM:163950]
synonym: "NS1" EXACT [DOID:0060578, MONDO:Lexical, OMIM:163950]
synonym: "pterygium colli syndrome" RELATED [OMIM:163950]
synonym: "Turner phenotype with Normal karyotype" RELATED [OMIM:163950]
xref: DOID:0060578 {source="MONDO:equivalentTo"}
xref: GARD:0007223 {source="MONDO:equivalentTo"}
xref: NCIT:C75459 {source="MONDO:kboom-pr-1.00/0.79/5.41", source="MONDO:equivalentTo"}
xref: OMIM:163950 {source="MONDO:equivalentTo", source="DOID:0060578"}
is_a: MONDO:0013379 ! Noonan syndrome 7
is_a: MONDO:0017415 ! multiple pterygium syndrome
property_value: closeMatch http://linkedlifedata.com/resource/umls/id/C0041409
property_value: closeMatch http://linkedlifedata.com/resource/umls/id/C1527404
property_value: exactMatch DOID:0060578
property_value: exactMatch http://identifiers.org/omim/163950
property_value: exactMatch NCIT:C75459

cmungall commented 5 years ago

Hi - I recommend the json or owl as a more robust way to get at this, but I am partial to a bit of obo hacking.

The hacky way to get this from the obo is via the xref

obo-grep.pl  -r 'xref: OMIMPS:163950' mondo.obo

[Term]
id: MONDO:0018997
name: Noonan syndrome
def: "Noonan Syndrome (NS) is characterised by short stature, typical facial dysmorphism and congenital heart defects." [Orphanet:648]
subset: clingen
subset: ordo_malformation_syndrome {source="Orphanet:648"}
synonym: "Noonan syndrome" EXACT [NCIT:C34854]
synonym: "Noonan's syndrome" EXACT [NCIT:C34854]
synonym: "Noonan-Ehmke syndrome" RELATED [GARD:0010955]
synonym: "pseudo-Ullrich-Turner syndrome" RELATED [GARD:0010955]
synonym: "Turner's phenotype, karyotype normal" EXACT [DOID:3490]
synonym: "Ullrich-Noonan syndrome" RELATED [GARD:0010955]
xref: DOID:3490 {source="MONDO:equivalentTo"}
xref: GARD:0010955 {source="MONDO:equivalentTo", source="Orphanet-shared"}
xref: ICD10:Q87.1 {source="Orphanet:648", source="ORDO:648/ntbt", source="DOID:3490", source="ORDO:648/inclusion"}
xref: ICD9:759.89 {source="MONDO:relatedTo", source="i2s"}
xref: MedDRA:10029748 {source="ORDO:648/e", source="Orphanet:648"}
xref: MESH:D009634 {source="ORDO:648/e", source="MONDO:equivalentTo", source="Orphanet:648", source="DOID:3490"}
xref: NCIT:C34854 {source="MONDO:equivalentTo", source="DOID:3490"}
xref: OMIMPS:163950 {source="MONDO:equivalentTo", source="DOID:3490"}
xref: Orphanet:648 {source="MONDO:equivalentTo", source="DOID:3490"}
xref: SCTID:205824006 {source="MONDO:equivalentTo", source="MONDO:kboom-pr-0.69/0.36/0.08", source="DOID:3490"}
xref: UMLS:C0028326 {source="ORDO:648/e", source="MONDO:equivalentTo", source="Orphanet:648", source="DOID:3490", source="NCIT:C34854"}

You can then traverse down to the subclasses over is_a
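A minimal Python sketch of that xref-then-is_a traversal, for readers without obo-grep.pl. It is an illustration only, not part of any MONDO tooling: the function names (`parse_obo_terms`, `descendants`, `phenoseries_members`) are made up for this example, the parser handles only the `id`, `name`, `xref` and `is_a` tags, and the sample text is a trimmed version of the stanzas quoted in this thread.

```python
from collections import defaultdict

def parse_obo_terms(text):
    """Return {term_id: {"name": str, "xrefs": set, "is_a": set}} from OBO text."""
    terms = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("["):
            # A new stanza begins; only [Term] stanzas are collected.
            current = {"name": "", "xrefs": set(), "is_a": set()} if line == "[Term]" else None
            continue
        if current is None or ": " not in line:
            continue
        tag, value = line.split(": ", 1)
        if tag == "id":
            terms[value] = current
        elif tag == "name":
            current["name"] = value
        elif tag in ("xref", "is_a"):
            # Keep only the identifier, dropping {...} modifiers and "! label" comments.
            key = "xrefs" if tag == "xref" else "is_a"
            current[key].add(value.split()[0])
    return terms

def descendants(terms, root_id):
    """Collect every term reachable from root_id by following is_a downwards."""
    children = defaultdict(set)
    for term_id, term in terms.items():
        for parent in term["is_a"]:
            children[parent].add(term_id)
    seen, stack = set(), [root_id]
    while stack:
        for child in children[stack.pop()]:
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen

def phenoseries_members(terms, ps_xref):
    """Map each grouping term carrying ps_xref to the set of its is_a descendants."""
    roots = [tid for tid, term in terms.items() if ps_xref in term["xrefs"]]
    return {root: descendants(terms, root) for root in roots}

# Trimmed sample based on the stanzas in this thread (assumption: real files are larger).
SAMPLE = """format-version: 1.2

[Term]
id: MONDO:0018997
name: Noonan syndrome
xref: OMIMPS:163950 {source="MONDO:equivalentTo", source="DOID:3490"}

[Term]
id: MONDO:0008104
name: Noonan syndrome 1
xref: OMIM:163950 {source="DOID:0060578", source="MONDO:equivalentTo"}
is_a: MONDO:0018997 {source="OMIM:163950"} ! Noonan syndrome
"""

if __name__ == "__main__":
    print(phenoseries_members(parse_obo_terms(SAMPLE), "OMIMPS:163950"))
```

For a real release file you would pass the contents of mondo.obo instead of `SAMPLE`. The json or owl route recommended above remains the more robust option.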

HTH

Seems like a standard report would be very useful here. What would the ideal format be?

Maybe

cmungall commented 5 years ago

oh wait sorry I just re-read I see the issue, hold on....

cmungall commented 5 years ago

OK, it looks like you have an old mondo file; that is-a makes no sense.

This is what Noonan 1 looks like now:

[Term]
id: MONDO:0008104
name: Noonan syndrome 1
def: "Noonan syndrome caused by mutations in the PTPN11 gene." [NCIT:C75459]
synonym: "female pseudo-Turner syndrome" RELATED [OMIM:163950]
synonym: "Male Turner syndrome" RELATED [OMIM:163950]
synonym: "Noonan syndrome" RELATED [OMIM:163950]
synonym: "Noonan syndrome 1" EXACT [MONDO:Lexical, OMIM:163950]
synonym: "Noonan syndrome 1; NS1" RELATED [OMIM:163950]
synonym: "Noonan syndrome type 1" EXACT [DOID:0060578, MONDORULE:1, OMIM:163950]
synonym: "NS1" EXACT [DOID:0060578, MONDO:Lexical, OMIM:163950]
synonym: "pterygium colli syndrome" RELATED [OMIM:163950]
synonym: "Turner phenotype with Normal karyotype" RELATED [OMIM:163950]
xref: DOID:0060578 {source="MONDO:equivalentTo"}
xref: GARD:0007223 {source="MONDO:equivalentTo"}
xref: NCIT:C75459 {source="MONDO:kboom-pr-1.00/0.79/5.41", source="MONDO:equivalentTo"}
xref: OMIM:163950 {source="DOID:0060578", source="MONDO:equivalentTo"}
is_a: MONDO:0018997 {source="DC-OMIM:163950", source="DOID:0060578", source="NCIT:C75459", source="OMIM:163950"} ! Noonan syndrome
property_value: closeMatch http://linkedlifedata.com/resource/umls/id/C0041409
property_value: closeMatch http://linkedlifedata.com/resource/umls/id/C1527404
property_value: exactMatch DOID:0060578
property_value: exactMatch http://identifiers.org/omim/163950
property_value: exactMatch NCIT:C75459

pnrobinson commented 5 years ago

OK, thanks, this is easy with phenol, however, I would like to extract just those MONDO terms that are exact matches:

xref: OMIM:104110 {source="MONDO:equivalentTo"}

However, it seems that the obographs library does not pick this up (should it be in "trailing modifiers"?).

see https://github.com/monarch-initiative/phenol/blob/develop/phenol-cli/src/main/java/org/monarchinitiative/phenol/cli/demo/MondoDemo.java
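For comparison, here is a rough Python sketch of pulling out only those xrefs whose trailing modifier carries `source="MONDO:equivalentTo"`, working directly on the OBO text. It says nothing about how obographs or phenol behave. The regular expressions and the `equivalent_xrefs` helper are assumptions written for this example, and they cover only the simple `{source="..."}` modifier form seen in the stanzas in this thread.

```python
import re

# "xref: <id>" optionally followed by a {...} trailing-modifier block.
XREF_RE = re.compile(r'^xref:\s+(?P<id>\S+)(?:\s+\{(?P<mods>[^}]*)\})?')
# Each source="..." entry inside the modifier block.
SOURCE_RE = re.compile(r'source="([^"]*)"')

def equivalent_xrefs(stanza_lines, prefix=None):
    """Return xref ids annotated with source="MONDO:equivalentTo".

    If prefix is given (e.g. "OMIM"), keep only ids with that exact prefix.
    """
    hits = []
    for line in stanza_lines:
        match = XREF_RE.match(line.strip())
        if not match:
            continue
        sources = SOURCE_RE.findall(match.group("mods") or "")
        if "MONDO:equivalentTo" not in sources:
            continue
        if prefix is None or match.group("id").startswith(prefix + ":"):
            hits.append(match.group("id"))
    return hits

# Lines copied from the stanzas quoted in this thread.
LINES = [
    'xref: ICD9:759.89 {source="MONDO:relatedTo", source="i2s"}',
    'xref: NCIT:C75459 {source="MONDO:kboom-pr-1.00/0.79/5.41", source="MONDO:equivalentTo"}',
    'xref: OMIM:163950 {source="DOID:0060578", source="MONDO:equivalentTo"}',
    'property_value: exactMatch NCIT:C75459',
]

if __name__ == "__main__":
    print(equivalent_xrefs(LINES, prefix="OMIM"))
```

Matching on `prefix + ":"` keeps `OMIM:` and `OMIMPS:` identifiers apart, which matters when separating single entries from phenotypic series.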

pnrobinson commented 5 years ago

@cmungall This is where there is a problem -- obographs puts the exactMatch information in a separate map https://github.com/monarch-initiative/phenol/issues/223

cmungall commented 5 years ago

In fact there are 3 separate ways to get at this: the xrefs, the skos properties (exactMatch), and the OWL logical axioms.

Why 3? Well, there are a ton of applications that rely on xrefs, so it is hard to get rid of them, but they are imprecise. In fact you can get away with just using the xref to the OMIMPS without the axiom annotation, since we force these to be 1:1 for the release. That may be your fastest way to get going.

But in general we want to guide people toward less ambiguous, computable approaches that state the meaning in a more standard way. There are basically two approaches in the broader community: skos properties and OWL logical axioms. Each has pros and cons. We provide both.

The OWL is most precise but not as useful for things like ICD where we might want to say "this is a close match but not really equivalent", which is not something you can say in DL, so skos can be useful here.

The OWL axiomatized mappings can be obtained from:

http://purl.obolibrary.org/obo/mondo/imports/equivalencies.json
http://purl.obolibrary.org/obo/mondo/imports/equivalencies.owl

I think phenol should be able to handle these.

Or you can just use the exactMatch (skos properties) which you are exploring at the moment.
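As a rough illustration of the exactMatch route, the Python sketch below reads the `property_value: exactMatch ...` lines straight from an OBO stanza. The helper names are invented for this example, and the rewrite of `http://identifiers.org/omim/<n>` into an `OMIM:<n>` CURIE is an assumption based only on the one URL form that appears in the stanza quoted earlier in this thread. Other URL patterns are passed through unchanged.

```python
OMIM_URL = "http://identifiers.org/omim/"  # the only URL form seen in this thread

def normalize(value):
    """Turn the identifiers.org OMIM URL into a CURIE; leave anything else as-is."""
    if value.startswith(OMIM_URL):
        return "OMIM:" + value[len(OMIM_URL):]
    return value

def exact_matches(stanza_lines):
    """Return the targets of every 'property_value: exactMatch <target>' line."""
    targets = []
    for line in stanza_lines:
        parts = line.strip().split()
        if len(parts) >= 3 and parts[0] == "property_value:" and parts[1] == "exactMatch":
            targets.append(normalize(parts[2]))
    return targets

# Lines copied from the Noonan syndrome 1 stanza quoted in this thread.
LINES = [
    "property_value: closeMatch http://linkedlifedata.com/resource/umls/id/C0041409",
    "property_value: exactMatch DOID:0060578",
    "property_value: exactMatch http://identifiers.org/omim/163950",
    "property_value: exactMatch NCIT:C75459",
]

if __name__ == "__main__":
    print(exact_matches(LINES))
```

The closeMatch line is skipped on purpose, since only the exact mappings are wanted here.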

We definitely need better docs aimed at developers using MONDO!

I wrote an irreverent blog post about all the madness associated with different ways of doing mapping https://douroucouli.wordpress.com/2019/05/27/never-mind-the-logix-taming-the-semantic-anarchy-of-mappings-in-ontologie/

I am chatting with @simonjupp about ways to better standardize this across multiple ontologies so that tools like OXO work predictably.

pnrobinson commented 5 years ago

@cmungall -- thanks. If we are willing to trust MONDO, then the phenol app works already. However, obographs is not actually extracting the xref information in a useful way -- please see monarch-initiative/phenol#223. We could fix this in phenol in an ugly way but it would be nicer to put the fix in obographs -- I will take a look if I can find the time.

nicolevasilevsky commented 3 years ago

@pnrobinson is this still needed?

pnrobinson commented 3 years ago

I think this issue has gotten stale. We are not going to be using MONDO for Phenoseries. There are still issues with obographs etc but they do not need to be on this tracker.