monarch-initiative / mondo

Mondo Disease Ontology
http://obofoundry.org/ontology/mondo
Creative Commons Attribution 4.0 International

Review the only-ordo supported is-as relationships? #834

Closed nicolevasilevsky closed 3 years ago

nicolevasilevsky commented 4 years ago

In #824, there was a note about a need to review the is_a relationships that are supported only by ORDO.

It also came up in https://github.com/EBISPOT/efo/issues/553 for MONDO_0009061 'Cystic fibrosis'. From @paolaroncaglia: of all its current parents in EFO, only 'autosomal recessive disease' should stay; all other parents should be deleted because they represent symptoms/phenotypes, whereas CF is a multi-systemic disease.

paolaroncaglia commented 4 years ago

@nicolevasilevsky MONDO:0021001 hemochromatosis type 1 may be a similar case, see https://github.com/EBISPOT/efo/issues/568.

cmungall commented 4 years ago

I think you are on top of this, do you need me to do anything here?

nicolevasilevsky commented 4 years ago

@paolaroncaglia has been reporting individual issues, but do you think we should look at all of the is_a relationships that are supported only by ORDO? Is there an easy way to generate a list of all of these terms/relationships?

paolaroncaglia commented 4 years ago

@zoependlington and I discussed the issue of incorrect Orphanet-derived superclasses with @nicolevasilevsky at our monthly EFO-Mondo call today. Zoe will try to query the problematic terms and their superclasses from Mondo for Nicole to look at. If necessary, we may ask Nico or Chris for help. Then we'll go from there.

paolaroncaglia commented 4 years ago

Here's another example of incorrect Orphanet-derived superclasses in Mondo (that EFO inherits too):

MONDO:0009169 endocardial fibroelastosis

comment: Editor notes: ORDO classifies as both familial and non-familial

def: Endomyocardial fibroelastosis is a cause of unexplained childhood cardiac insufficiency. It results from diffuse thickening of the endocardium leading to dilated myocardiopathy in the majority of cases and restrictive myocardiopathy in rare cases. It may occur as a primary disorder or may be secondary to another cardiac malformation, notably aortic stenosis or atresia.

Subclass of:
familial restrictive cardiomyopathy
endocardium disease
non-familial restrictive cardiomyopathy
familial dilated cardiomyopathy
non-familial dilated cardiomyopathy

It follows from the def (and is supported by the editor's comment) that the only correct superclass is 'endocardium disease', and that all other superclasses should be deleted.

paolaroncaglia commented 4 years ago

See also https://github.com/monarch-initiative/mondo/issues/565

paolaroncaglia commented 4 years ago

(Re. reviewing Orphanet-derived incorrect parentage) While we wait for the agreed action item on EFO's part, I'll also make a note here that in some cases a disease term is correctly placed in EFO and has an EFO namespace, but gains incorrect Orphanet-derived parentage via Mondo because Mondo got the term directly from Orphanet. See e.g. 'Sneddon syndrome', which in EFO is a subclass of 'vascular disease' only, but gains 7 parents from Mondo, most of which look incorrect.

nicolevasilevsky commented 4 years ago

See spreadsheet

nicolevasilevsky commented 4 years ago

@cmungall could you do a SPARQL query to give us a list of all the Mondo classes that only have Orphanet as the source for the subClassOf assertion?

thank you!

nicolevasilevsky commented 4 years ago

@ShahimEssaid could you do a SPARQL query to give us a list of all the Mondo classes that only have Orphanet as the source for the subClassOf assertion?

Zoe and I are going to review the is_a overloading and create tickets as needed. She has a spreadsheet with over 6000 terms in EFO that come from Orphanet, and have multiple parents.

nicolevasilevsky commented 4 years ago

@ShahimEssaid I think we talked about this SPARQL query before the holidays but I can't remember where it stands now. Would you be able to do this soon? Thanks!

paolaroncaglia commented 4 years ago

@nicolevasilevsky @ShahimEssaid Hi, @zoependlington and I met with Open Targets today and they asked if there are any updates on this please. Many thanks!

ShahimEssaid commented 4 years ago

@nicolevasilevsky to make sure I get this right: if there is a Mondo class A that has more than one parent, let's say B and C, should A be listed in the result if ALL the subClassOf axioms are only from Orphanet, or if ANY of the subClassOf axioms are only from Orphanet? In other words, if A subClassOf B only comes from Orphanet but A subClassOf C comes from somewhere else, with or without Orphanet, should A be listed? This is the "ANY" case.

ShahimEssaid commented 4 years ago

Also, there are a few subClassOf axioms annotated with: source="MONDO:Entailed" source="MONDO:Redundant"

Would these axioms exclude the class since they aren't really Orphanet only?

About my earlier comment, this is an example of a Mondo class that has an "is_a" that comes only from source="ORDO:*", alongside another "is_a" that does not come only from ORDO (see the lines marked "here:" below). Should this class be in the result or not?


[Term]
id: MONDO:0007060
name: spermatogenic failure 6
def: "Any azoospermia in which the cause of the disease is a mutation in the SPATA16 gene." [MONDO:patterns/disease_series_by_gene]
synonym: "acrosome malformation of spermatozoa" RELATED [OMIM:102530]
synonym: "azoospermia caused by mutation in SPATA16" EXACT [MONDO:design_pattern]
synonym: "globozoospermia" RELATED [OMIM:102530]
synonym: "round-headed spermatozoa" RELATED [OMIM:102530]
synonym: "SPATA16 azoospermia" EXACT [MONDO:design_pattern, MONDO:patterns/disease_series_by_gene]
synonym: "spermatogenic failure 6" EXACT [MONDO:Lexical, OMIM:102530]
synonym: "spermatogenic failure 6; SPGF6" RELATED [OMIM:102530]
synonym: "spermatogenic failure type 6" EXACT [MONDORULE:1, OMIM:102530]
synonym: "spermatozoa, round-headed" RELATED [OMIM:102530]
synonym: "SPGF6" RELATED [MONDO:Lexical, OMIM:102530]
xref: DOID:0070167 {source="MONDO:equivalentTo"}
xref: OMIM:102530 {source="MONDO:equivalentTo"}
xref: Orphanet:171709 {source="MONDO:subClassOf", source="OMIM:102530"}
xref: Orphanet:399808 {source="MONDO:subClassOf", source="OMIM:102530"}
xref: SCTID:236818008 {source="MONDO:equivalentTo", source="MONDO:kboom-pr-1.00/0.74/5.95"}
xref: UMLS:C0403825 {source="NCBI:mim2gene_medline", source="MONDO:notFoundInDiseaseSubset", source="OMIM:102530"}

here:
is_a: MONDO:0004983 {source="DC-OMIM:102530", source="MONDO:Redundant", source="OMIM:102530"} ! azoospermia
is_a: MONDO:0015746 {source="ORDO:171709/btnt"} ! male infertility due to globozoospermia

paolaroncaglia commented 4 years ago

@ShahimEssaid @nicolevasilevsky Thanks Shahim for your feedback! Your example above highlights that, ultimately, we're interested in all cases where there is at least one "is_a" that is only coming from source="ORDO:*". ('azoospermia' is a correct parent, while 'male infertility due to globozoospermia' is likely incorrect and should be represented as a phenotype instead.) So your "ANY" case above. (@zoependlington, if you have any objection, please feel free to comment.) As for your question on source="MONDO:Entailed" source="MONDO:Redundant", I'll leave that to @nicolevasilevsky and/or @cmungall. Thanks! Paola

zoependlington commented 4 years ago

Agreed with @paolaroncaglia, we'd be interested in the "ANY" case. So no objection from me. Thank you for getting back to us!
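
The "ANY" versus "ALL" distinction discussed above could be sketched roughly as follows. This is only an illustration under stated assumptions: it reads is_a lines in the OBO text format of Shahim's example, treats an axiom as ORDO-only when every source="..." qualifier starts with "ORDO:", and counts MONDO:Entailed and MONDO:Redundant as ordinary non-ORDO sources (that question is still open in this thread). The function names are hypothetical and this is not the query that was actually run.

```python
import re

# Matches one source="..." qualifier inside the {...} modifier of an is_a line.
SOURCE_RE = re.compile(r'source="([^"]+)"')


def is_a_sources(line):
    """Return (parent_id, [sources]) for an OBO 'is_a:' line, or None otherwise."""
    if not line.startswith("is_a:"):
        return None
    parent_id = line.split()[1]
    return parent_id, SOURCE_RE.findall(line)


def ordo_only(sources):
    """True when the axiom has at least one source and all of them are ORDO:*."""
    return bool(sources) and all(s.startswith("ORDO:") for s in sources)


def classify(stanza_lines):
    """Report whether ANY / ALL of a term's is_a axioms are ORDO-only."""
    flags = [ordo_only(parsed[1])
             for parsed in map(is_a_sources, stanza_lines) if parsed]
    return {"any": any(flags), "all": bool(flags) and all(flags)}


# The two is_a lines from the MONDO:0007060 example above.
example = [
    'is_a: MONDO:0004983 {source="DC-OMIM:102530", source="MONDO:Redundant", source="OMIM:102530"} ! azoospermia',
    'is_a: MONDO:0015746 {source="ORDO:171709/btnt"} ! male infertility due to globozoospermia',
]
print(classify(example))  # {'any': True, 'all': False}
```

With this reading, MONDO:0007060 would be listed under "ANY" but not under "ALL", which matches the case described in the comments above.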

paolaroncaglia commented 4 years ago

Addendum: I guess if the results are too numerous, we would then probably want to start working on the results subset corresponding to the "ALL" case above. The "ALL" case is the one we had in mind first, as a first step in fixing the issue; but ultimately, the "ANY" case will need to be examined too to carry out a thorough cleanup. Maybe we could re-assess the strategy when we have an idea of numbers from Shahim's results. Any opinion @cmungall?

nicolevasilevsky commented 4 years ago

@paolaroncaglia and @zoependlington Shahim put together this list of Mondo terms for us. There are over 900 terms. I'm not sure what the best approach is here. We could discuss further on our next call.

OrdoOrphanet6.txt

paolaroncaglia commented 4 years ago

@nicolevasilevsky Thanks! I had a quick look at the first 2-3 terms in Shahim's list and, yes, I believe they need work. The best strategy probably needs discussion as it's going to be a significant amount of work. Zoë is away until Tuesday, so I'll catch up with her next week, and we can surely discuss with you on our next call on Feb 18th if not sooner. Thanks.

nicolevasilevsky commented 4 years ago

sounds good!

paolaroncaglia commented 4 years ago

Making a note that MONDO:0019170 'polyarteritis nodosa' also seems to have some questionable superclasses. Full list:

a) arteritis [ok based on def]
b) systemic inflammatory disease associated with an acquired peripheral neuropathy [probably not ok, unless the disease is always associated with an acquired peripheral neuropathy; otherwise that's just a symptom]
c) neurovascular disease [not ok; symptom]
d) secondary glomerular disease [not ok; symptom]

Note that the term has exactMatch not only with ORDO but also with other resources. Thanks.

cmungall commented 4 years ago

Note that after this process we will want to find ORDO-only groupings that have no children and remove these.

I recommend switching this around: prepare a set of ORDO-only grouping classes for obsoletion, then obsolete all of these, finding new homes for any classes that become orphans.

I'm thinking of combinatorial ones, e.g. 'ectodermal malformation syndrome associated with ocular features'. These are just repeating phenotypes in a grouping.

This is how I would get ORDO-onlies:

obo-grep.pl -r 'xref: Orph' mondo-edit.obo | obo-grep.pl --neg -r '(NCIT|OMIM|DOID)' - | obo-grep.pl -r _group - | grep ^name:
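
For readers without obo-grep.pl, the same three filters could be approximated in Python. This is a rough sketch, assuming each obo-grep.pl step keeps or drops whole stanzas according to whether the regex matches anywhere in the stanza text; the function names and the toy stanzas (with made-up EX: identifiers) are hypothetical and are not taken from mondo-edit.obo.

```python
import re


def stanzas(obo_text):
    """Split OBO text into stanzas (blocks separated by blank lines)."""
    return [block for block in re.split(r"\n\s*\n", obo_text) if block.strip()]


def ordo_only_group_names(obo_text):
    """Name lines of stanzas with an Orphanet xref, no NCIT/OMIM/DOID, and '_group'."""
    names = []
    for stanza in stanzas(obo_text):
        if not re.search(r"xref: Orph", stanza):
            continue  # first filter: must have an Orphanet xref
        if re.search(r"(NCIT|OMIM|DOID)", stanza):
            continue  # second filter (--neg): drop anything with other support
        if not re.search(r"_group", stanza):
            continue  # third filter: keep grouping classes only
        names += [line for line in stanza.splitlines() if line.startswith("name:")]
    return names


# Toy input for illustration only.
sample = """[Term]
id: EX:0000001
name: toy grouping class
xref: Orphanet:000001
subset: ordo_group_of_disorders

[Term]
id: EX:0000002
name: toy disease with OMIM support
xref: OMIM:000002
xref: Orphanet:000002
subset: ordo_group_of_disorders
"""
print(ordo_only_group_names(sample))  # ['name: toy grouping class']
```

Like the one-liner, this matches against raw stanza text, so a mention of OMIM anywhere in a stanza (for example in a synonym source) excludes the term.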
paolaroncaglia commented 4 years ago

@cmungall @nicolevasilevsky (cc @zoependlington ) Zoe and I agree with your suggested strategy and will try to help. For us, it'd be easier to work through a shared spreadsheet than via Mondo PRs. If you could please provide the spreadsheet, Zoe and I will then look into it and make further suggestions as appropriate. To help find the biggest offenders/grouping terms/prioritize the obsoletions, could the spreadsheet please contain:

Mondo ID
ORDO ID of the exact match
label
definition
number of children
number of parents
Mondo ID and label(s) of parent(s) (a single field separated by pipes would work)

We could then add a column with our suggestions (e.g. remove parents x and y; place under z instead, etc.) in a formatted/useful manner to be agreed upon. Feel free to comment, and thank you!

Paola and Zoe
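
To make the requested layout concrete, here is a minimal sketch of how such rows could be assembled and written as tab-separated text. It assumes the terms have already been loaded into a plain dictionary; the column names, the function names, and the toy EX: records are all hypothetical, and children are counted only among the terms present in the dictionary.

```python
import csv
import io

COLUMNS = ["mondo_id", "ordo_id", "label", "definition",
           "n_children", "n_parents", "parents"]


def build_rows(terms):
    """terms: {id: {"ordo_id", "label", "definition", "parents": [ids]}}."""
    # Count direct children by inverting the parent links.
    children = {term_id: 0 for term_id in terms}
    for term in terms.values():
        for parent_id in term["parents"]:
            if parent_id in children:
                children[parent_id] += 1
    rows = []
    for term_id, term in terms.items():
        # Parents go into a single pipe-separated field: "ID label|ID label".
        parent_field = "|".join(
            f"{pid} {terms[pid]['label']}" if pid in terms else pid
            for pid in term["parents"])
        rows.append([term_id, term["ordo_id"], term["label"], term["definition"],
                     children[term_id], len(term["parents"]), parent_field])
    return rows


def to_tsv(rows):
    """Serialize the rows, with a header line, as tab-separated text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(COLUMNS)
    writer.writerows(rows)
    return buffer.getvalue()


# Toy records for illustration only.
toy_terms = {
    "EX:1": {"ordo_id": "Orphanet:000001", "label": "toy grouping",
             "definition": "A toy grouping class.", "parents": []},
    "EX:2": {"ordo_id": "Orphanet:000002", "label": "toy disease",
             "definition": "A toy leaf class.", "parents": ["EX:1", "EX:9"]},
}
rows = build_rows(toy_terms)
print(to_tsv(rows))
```

Sorting the rows by the children count would be one way to surface the grouping terms with the widest impact first.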

cmungall commented 4 years ago

Parent ID | Parent label | Child ID | Child label
MONDO:0015135 | primary immunodeficiency due to a genetic defect in innate immunity | MONDO:0015133 | quantitative and/or qualitative congenital phagocyte defect
MONDO:0015135 | primary immunodeficiency due to a genetic defect in innate immunity | MONDO:0018545 | primary immunodeficiency with predisposition to severe viral infection
MONDO:0015219 | non-syndromic central nervous system malformation | MONDO:0017104 | central nervous system cystic malformation
MONDO:0015227 | non-syndromic limb malformation | MONDO:0017430 | non-syndromic congenital joint dislocations
MONDO:0015227 | non-syndromic limb malformation | MONDO:0017431 | non-syndromic limb overgrowth
MONDO:0015227 | non-syndromic limb malformation | MONDO:0019714 | non-syndromic polydactyly, syndactyly and/or hyperphalangy
MONDO:0015490 | predominantly small-vessel vasculitis | MONDO:0015491 | immune complex mediated vasculitis
MONDO:0015502 | pinnae and external auditory canal anomaly | MONDO:0044702 | X-linked external auditory canal atresia-dilated internal auditory canal-facial dysmorphism syndrome
MONDO:0015594 | non-paraneoplastic limbic encephalitis | MONDO:0044683 | limbic encephalitis with neurexin-3 antibodies
MONDO:0015642 | benign partial infantile seizures | MONDO:0015641 | benign infantile focal epilepsy with midline spikes and wave during sleep
MONDO:0015652 | chromosomal anomaly with epilepsy as a major feature | MONDO:0044641 | 9q33.3q34.11 microdeletion syndrome
MONDO:0015756 | myeloid hemopathy | MONDO:0015688 | myeloid neoplasms associated with eosinophilia and abnormality of PDGFRA, PDGFRB or FGFR1
MONDO:0015860 | anomaly of puberty or/and menstrual cycle | MONDO:0044660 | menstrual cycle-dependent periodic fever
MONDO:0015915 | cerebellar malformation | MONDO:0020130 | malformation of the cerebellar vermis
MONDO:0015921 | ARX-related epileptic encephalopathy | MONDO:0018496 | ARX-related encephalopathy-brain malformation spectrum
MONDO:0015923 | acquired peripheral neuropathy | MONDO:0016137 | acute and subacute inflammatory demyelinating polyneuropathy
MONDO:0015923 | acquired peripheral neuropathy | MONDO:0016172 | acquired sensory ganglionopathy
MONDO:0015923 | acquired peripheral neuropathy | MONDO:0016178 | peripheral neuropathy associated with monoclonal gammopathy
MONDO:0015923 | acquired peripheral neuropathy | MONDO:0016179 | acquired amyloid peripheral neuropathy
MONDO:0015932 | non-syndromic urogenital tract malformation of female | MONDO:0015829 | non-syndromic uterovaginal malformation
MONDO:0015933 | non-syndromic urogenital tract malformation of male | MONDO:0044644 | congenital agenesis of the scrotum
MONDO:0015961 | genetic head and neck malformation | MONDO:0015482 | otomandibular dysplasia
MONDO:0015961 | genetic head and neck malformation | MONDO:0018562 | genetic otorhinolaryngological malformation
MONDO:0016148 | qualitative or quantitative defects of collagen 6 | MONDO:0016111 | non-dystrophic myopathy with collagen 6 anomaly
MONDO:0016155 | qualitative or quantitative defects of protein involved in O-glycosylation of alpha-dystroglycan | MONDO:0016156 | qualitative or quantitative defects of FKRP
MONDO:0016155 | qualitative or quantitative defects of protein involved in O-glycosylation of alpha-dystroglycan | MONDO:0016157 | qualitative or quantitative defects of fukutin
MONDO:0016155 | qualitative or quantitative defects of protein involved in O-glycosylation of alpha-dystroglycan | MONDO:0016183 | qualitative or quantitative defects of protein glycosyltransferase-like
MONDO:0016155 | qualitative or quantitative defects of protein involved in O-glycosylation of alpha-dystroglycan | MONDO:0016184 | qualitative or quantitative defects of protein O-mannosyltransferase 1
MONDO:0016155 | qualitative or quantitative defects of protein involved in O-glycosylation of alpha-dystroglycan | MONDO:0016185 | qualitative or quantitative defects of protein O-mannosyltransferase 2
MONDO:0016172 | acquired sensory ganglionopathy | MONDO:0016173 | non-paraneoplastic sensory ganglionopathy
MONDO:0016186 | qualitative or quantitative defects of myofibrillar proteins | MONDO:0016187 | qualitative or quantitative defects of desmin
MONDO:0016186 | qualitative or quantitative defects of myofibrillar proteins | MONDO:0016188 | qualitative or quantitative defects of alphaB-cristallin
MONDO:0016186 | qualitative or quantitative defects of myofibrillar proteins | MONDO:0016189 | qualitative or quantitative defects of filamin C
MONDO:0016186 | qualitative or quantitative defects of myofibrillar proteins | MONDO:0016190 | qualitative or quantitative defects of protein ZASP
MONDO:0016221 | temporomandibular joint anomaly | MONDO:0018793 | primary condylar hyperplasia
MONDO:0016925 | partial trisomy/tetrasomy of chromosome 5 | MONDO:0016942 | partial trisomy/tetrasomy of the short arm of chromosome 5
MONDO:0016930 | partial trisomy/tetrasomy of chromosome 9 | MONDO:0016960 | partial trisomy of the long arm of chromosome 9
MONDO:0016936 | partial trisomy/tetrasomy of chromosome 18 | MONDO:0016951 | partial trisomy/tetrasomy of the short arm of chromosome 18
MONDO:0016999 | X chromosome number anomaly | MONDO:0017000 | X chromosome number anomaly with female phenotype
MONDO:0016999 | X chromosome number anomaly | MONDO:0017001 | X chromosome number anomaly with male phenotype
MONDO:0017000 | X chromosome number anomaly with female phenotype | MONDO:0017002 | polysomy of X chromosome
MONDO:0017083 | lipoma associated with neurospinal dysraphism | MONDO:0017084 | leptomyelolipoma
MONDO:0017104 | central nervous system cystic malformation | MONDO:0017105 | glioependymal/ependymal cyst
MONDO:0017344 | Epstein-Barr virus-associated carcinoma | MONDO:0017348 | lymphoepithelial-like carcinoma
MONDO:0017595 | aggressive B-cell non-Hodgkin lymphoma | MONDO:0015818 | aggressive primary cutaneous B-cell lymphoma
MONDO:0017595 | aggressive B-cell non-Hodgkin lymphoma | MONDO:0018813 | high grade B-cell lymphoma with MYC and/ or BCL2 and/or BCL6 rearrangement
MONDO:0017653 | epilepsy and/or ataxia with myoclonus as major feature | MONDO:0017654 | non progressive epilepsy and/or ataxia with myoclonus as a major feature
MONDO:0017653 | epilepsy and/or ataxia with myoclonus as major feature | MONDO:0017655 | progressive epilepsy and/or ataxia with myoclonus as a major feature
MONDO:0017710 | congenital systemic veins anomaly | MONDO:0018811 | congenital portosystemic shunt
MONDO:0018230 | primary bone dysplasia | MONDO:0018231 | primary bone dysplasia with progressive ossification of skin, skeletal muscle, fascia, tendons and ligaments
MONDO:0018230 | primary bone dysplasia | MONDO:0018232 | primary bone dysplasia with micromelia
MONDO:0018230 | primary bone dysplasia | MONDO:0019694 | spondylodysplastic dysplasia
MONDO:0018230 | primary bone dysplasia | MONDO:0019699 | slender bone dysplasia
MONDO:0018230 | primary bone dysplasia | MONDO:0019700 | primary bone dysplasia with multiple joint dislocations
MONDO:0018230 | primary bone dysplasia | MONDO:0019704 | primary bone dysplasia with decreased bone density
MONDO:0018230 | primary bone dysplasia | MONDO:0019705 | primary bone dysplasia with defective bone mineralization
MONDO:0018230 | primary bone dysplasia | MONDO:0019707 | primary osteolysis
MONDO:0018230 | primary bone dysplasia | MONDO:0019708 | primary bone dysplasia with disorganized development of skeletal components
MONDO:0018230 | primary bone dysplasia | MONDO:0019709 | cleidocranial dysplasia and isolated cranial ossification defect
MONDO:0018230 | primary bone dysplasia | MONDO:0019718 | lethal chondrodysplasia
MONDO:0018230 | primary bone dysplasia | MONDO:0028741 | overgrowth or tall stature syndrome with skeletal involvement
MONDO:0018454 | dysostosis of genetic origin | MONDO:0019710 | dysostosis with predominant craniofacial involvement
MONDO:0018454 | dysostosis of genetic origin | MONDO:0019711 | dysostosis with predominant vertebral and costal involvement
MONDO:0018454 | dysostosis of genetic origin | MONDO:0019712 | patellar dysostosis
MONDO:0018455 | dysostosis of genetic origin with limb anomaly as a major feature | MONDO:0017429 | joint formation defects
MONDO:0018455 | dysostosis of genetic origin with limb anomaly as a major feature | MONDO:0017433 | dysostosis with combined reduction defects of upper and lower limbs
MONDO:0018455 | dysostosis of genetic origin with limb anomaly as a major feature | MONDO:0018236 | dysostosis with limb and face anomalies as a major feature
MONDO:0018455 | dysostosis of genetic origin with limb anomaly as a major feature | MONDO:0019714 | non-syndromic polydactyly, syndactyly and/or hyperphalangy
MONDO:0018562 | genetic otorhinolaryngological malformation | MONDO:0015502 | pinnae and external auditory canal anomaly
MONDO:0018652 | biological anomaly without phenotypic characterization | MONDO:0018651 | lipoyl transferase 2 deficiency
MONDO:0019063 | vascular anomaly | MONDO:0016235 | complex vascular malformation with associated anomalies
MONDO:0019213 | cerebral organic aciduria | MONDO:0017686 | inborn aminoacylase deficiency
MONDO:0019704 | primary bone dysplasia with decreased bone density | MONDO:0044675 | LRP5-related primary osteoporosis
MONDO:0019712 | patellar dysostosis | MONDO:0044641 | 9q33.3q34.11 microdeletion syndrome
MONDO:0020001 | respiratory or thoracic malformation | MONDO:0015929 | thoracic malformation
MONDO:0020001 | respiratory or thoracic malformation | MONDO:0015930 | respiratory malformation
MONDO:0020019 | digestive tract malformation | MONDO:0019513 | esophageal malformation
MONDO:0020019 | digestive tract malformation | MONDO:0019998 | gastroduodenal malformation
MONDO:0020019 | digestive tract malformation | MONDO:0019999 | intestinal malformation
MONDO:0020020 | visceral malformation of the liver, biliary tract, pancreas or spleen | MONDO:0015213 | non-syndromic visceral malformation
MONDO:0020052 | partial autosomal trisomy/tetrasomy | MONDO:0016925 | partial trisomy/tetrasomy of chromosome 5
MONDO:0020052 | partial autosomal trisomy/tetrasomy | MONDO:0016930 | partial trisomy/tetrasomy of chromosome 9
MONDO:0020052 | partial autosomal trisomy/tetrasomy | MONDO:0016936 | partial trisomy/tetrasomy of chromosome 18
MONDO:0020052 | partial autosomal trisomy/tetrasomy | MONDO:0016972 | partial duplication of the long arm of chromosome 22
MONDO:0020059 | gonosome number anomaly | MONDO:0016999 | X chromosome number anomaly
MONDO:0020059 | gonosome number anomaly | MONDO:0017005 | Y chromosome number anomaly
MONDO:0020133 | posterior fossa malformation | MONDO:0015915 | cerebellar malformation
MONDO:0020138 | ataxia with dementia | MONDO:0020139 | early-onset ataxia with dementia
MONDO:0020138 | ataxia with dementia | MONDO:0020140 | late-onset ataxia with dementia
MONDO:0020152 | rare eyelid malformation | MONDO:0020155 | eyelid border anomaly
MONDO:0020216 | secondary dysgenetic glaucoma | MONDO:0020221 | secondary glaucoma due to a proliferation and differentiation anomaly
MONDO:0020217 | secondary dysgenetic glaucoma associated with neural crest cell migration anomaly | MONDO:0020219 | corneogoniodysgenesis
MONDO:0020217 | secondary dysgenetic glaucoma associated with neural crest cell migration anomaly | MONDO:0020220 | corneoiridogoniodysgenesis
MONDO:0020223 | lens and zonula anomaly | MONDO:0020235 | lens size anomaly
MONDO:0020223 | lens and zonula anomaly | MONDO:0020237 | lens shape anomaly
MONDO:0020262 | nervous system anomaly with eye involvement | MONDO:0020263 | spinocerebellar ataxia with oculomotor anomaly
MONDO:0020262 | nervous system anomaly with eye involvement | MONDO:0020264 | spinocerebellar degenerescence and spastic paraparesis with an oculomotor anomaly
MONDO:0020266 | genodermatosis with ocular features | MONDO:0016997 | hereditary epidermolysis bullosa associated with ocular features
MONDO:0020266 | genodermatosis with ocular features | MONDO:0020271 | phakomatosis with eye involvement
MONDO:0020285 | transposition of the great arteries and conotruncal cardiac anomaly | MONDO:0020286 | aortic malformation
MONDO:0020285 | transposition of the great arteries and conotruncal cardiac anomaly | MONDO:0020287 | pulmonary artery or pulmonary branch anomaly
MONDO:0022410 | retinal ciliopathy | MONDO:0022397 | retinal ciliopathy due to mutation in the retinitis pigmentosa-1 gene
MONDO:0022410 | retinal ciliopathy | MONDO:0022399 | retinal ciliopathy due to mutation in the rpgr gene
MONDO:0022410 | retinal ciliopathy | MONDO:0022400 | retinal ciliopathy due to mutation in the rpgrip gene
MONDO:0022410 | retinal ciliopathy | MONDO:0022404 | retinal ciliopathy due to mutation in usher gene
MONDO:0022410 | retinal ciliopathy | MONDO:0022405 | retinal ciliopathy due to mutation in nephronophthisis gene
MONDO:0022410 | retinal ciliopathy | MONDO:0022407 | retinal ciliopathy due to mutation in bardet-biedl gene
MONDO:0044685 | autoimmune/inflammatory optic neuropathy | MONDO:0044687 | chronic relapsing inflammatory optic neuropathy
MONDO:0044685 | autoimmune/inflammatory optic neuropathy | MONDO:0044688 | isolated optic neuritis
MONDO:0044685 | autoimmune/inflammatory optic neuropathy | MONDO:0044689 | recurrent idiopathic neuroretinitis

paolaroncaglia commented 4 years ago

@cmungall Any chance you could export the table above as a Google spreadsheet, please? Ignore me if you're still working on it. Thanks.

nicolevasilevsky commented 4 years ago

I can do it! Here is a spreadsheet

paolaroncaglia commented 4 years ago

Thanks @nicolevasilevsky and @cmungall . We'll work on it over the next few days and will add comments in the spreadsheet. Will ping you when we're done and go from there.

paolaroncaglia commented 4 years ago

@cmungall I already have a question on the spreadsheet, please. My understanding is that you have extracted Mondo terms that only have ORDO evidence. MONDO:0016172 acquired sensory ganglionopathy is one such term (in the spreadsheet) and has 2 children, both of which only have ORDO evidence, but only 1 of the 2 children is in the spreadsheet. The same goes for MONDO:0015860 anomaly of puberty or/and menstrual cycle: it has several children, more than 1 of them ORDO-only, but only 1 child is in the spreadsheet. May I check whether I'm missing anything before I keep going? Thanks, and have a good weekend!

paolaroncaglia commented 4 years ago

@cmungall @nicolevasilevsky Sorry to disturb you again about this (do let me know if you can't address this right now). I was hoping to start work on the list/spreadsheet of ORDO terms that you prepared, but I came across an issue, see my earlier comment in this ticket. If you have a chance, could you please address my question there? Thanks! :-)

nicolevasilevsky commented 4 years ago

The question is for @cmungall, right?

Everything is okay in Portland (so far), I am more worried about you! Feel free to ping us anytime @paolaroncaglia.

paolaroncaglia commented 4 years ago

@nicolevasilevsky and @cmungall

> The question is for @cmungall, right?

Right :-) Thank you for your concern and availability! All the best, Paola

cmungall commented 4 years ago

I'm investigating

This was a quick query intended just to give high profile candidates for obsoletion. There may be better ways of doing this

cmungall commented 4 years ago

perhaps @ShahimEssaid could help with a query?

paolaroncaglia commented 4 years ago

@cmungall Thank you for following up on this. May I please ask if you or @nicolevasilevsky heard back from @ShahimEssaid about possibly helping with a query? I understand if the health emergency makes it difficult to progress on this. When you can, please let me know where we are with this; because if you need more time, no problem at all, I'm happy to start looking at the list of high profile candidates for obsoletion you provided, and I'll look up their children myself via OLS or Protege. Asking just so we can avoid duplication of efforts. Thank you all, and take care.

nicolevasilevsky commented 4 years ago

I just pinged @ShahimEssaid again. I know he has a lot of deadlines right now, so if you could proceed without him, @paolaroncaglia, that would be great. Hopefully Shahim will be able to comment here soon.

Thanks!

paolaroncaglia commented 4 years ago

@nicolevasilevsky Based on https://github.com/monarch-initiative/mondo/issues/1283#issuecomment-604033234, Zoe and I fear that any work on current Orphanet terms in Mondo may risk being "over-written" by an upcoming update of the Orphanet ingest. So the tasks discussed in recent comments in this thread may best be delayed. We'll discuss internally at EFO, and will then catch up with you at our next EFO-Mondo call. Of course, if there's any concern or urgency on either side, we can discuss sooner. Thanks!

nicolevasilevsky commented 4 years ago

Good point! It is my understanding that the new Orphanet ingest will not override any work that has currently been done though (as we've made a lot of changes already.)

I hope you don't mind putting this off. It's okay to delay on our end.

paolaroncaglia commented 4 years ago

@nicolevasilevsky

> Good point! It is my understanding that the new Orphanet ingest will not override any work that has currently been done though (as we've made a lot of changes already.)

That would make sense to me too! I'll aim at looking at the list of grouping terms before our next EFO-Mondo call, so we can hopefully discuss then.

> I hope you don't mind putting this off. It's okay to delay on our end.

It's a bit of a priority on our end, for Open Targets, so now that I'm reassured that the work wouldn't be overwritten, I'll try to spend some time on it. I'm just sticking to non-committal phrasing because you never know these days. Thanks, take care and speak soon!

paolaroncaglia commented 4 years ago

Making a note here that another (Orphanet-derived) problematic term is MONDO:0016575 'primary ciliary dyskinesia'. 3 of its 5 parents represent symptoms, and it has 41 children that inherit the incorrect ancestry. Notably, 'primary ciliary dyskinesia' is not inSubset ordo_group_of_disorders. Its parent 'ciliopathy' is, but the parents of 'ciliopathy' are fine, so focusing on ORDO grouping terms alone may be of limited use; we need to go deeper too.
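
To gauge how far an incorrect parent propagates, the affected descendants of a term could be collected with a simple traversal. This is a minimal sketch assuming the is_a links are available as a child-to-parents mapping; the function name and the toy EX: hierarchy are hypothetical and do not reflect actual Mondo content.

```python
from collections import defaultdict, deque


def descendants(term_id, parents_of):
    """All terms reachable from term_id by following is_a links downwards."""
    # Invert the child -> parents mapping into parent -> children.
    children_of = defaultdict(set)
    for child, parents in parents_of.items():
        for parent in parents:
            children_of[parent].add(child)
    seen, queue = set(), deque([term_id])
    while queue:
        for child in children_of[queue.popleft()]:
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen


# Toy hierarchy for illustration only.
toy_parents = {
    "EX:mid": ["EX:root"],
    "EX:child": ["EX:mid"],
    "EX:grandchild": ["EX:child"],
    "EX:other": ["EX:root"],
}
print(sorted(descendants("EX:mid", toy_parents)))  # ['EX:child', 'EX:grandchild']
```

Every term returned inherits whatever superclasses the starting term has, so a single questionable is_a on a term like this one affects the whole set.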

paolaroncaglia commented 4 years ago

For a quick fix while I'm looking into this, you may want to

Thanks!

paolaroncaglia commented 4 years ago

@nicolevasilevsky @cmungall Quick update on the ORDO grouping terms you highlighted in https://github.com/monarch-initiative/mondo/issues/834#issuecomment-595322698. I looked at a subset of those (8/57), and added comments in https://docs.google.com/spreadsheets/d/1KRAoGoj1O2-nm0ll5vlDMzm-9bsgMvvNza9wFIz6REA/edit#gid=656900674 (tab "Paola's summary"). The rows in red are for terms I'd suggest prioritizing for obsoletion. (Note, those 3 terms weren't in your list originally, but are their parents or ancestors.) The comment columns for those contain some general considerations and tips I hope you'll find useful. It's still a work in progress, but I wanted to write down some notes for your consideration in case I can't come back to this before our meeting on Monday. Thanks!

nicolevasilevsky commented 4 years ago

The parent 'rare pulmonary disease' is also suspicious and should be deleted.

This term is obsoleted: MONDO_0015118

nicolevasilevsky commented 3 years ago

closing this, duplicate with https://github.com/monarch-initiative/mondo/issues/704

@sabrinatoro and I are working on this.

nicolevasilevsky commented 3 years ago

also related to https://github.com/monarch-initiative/mondo/issues/324