Open sabrinatoro opened 4 months ago
@twhetzel this should probably be prioritised right after ICD11 and MedGen. I have lost track of Joe's priorities a bit, so I will leave it to you to fold this into the schedule.
Can you provide us with a simple exclude list for "too high level" parents?
Currently, the "too high level" term that is consistently reported is MONDO:0000001 (disease), the highest term in the ontology (parent of "human disease" and "non-human animal disease").
Can you provide a short statement on why "too high parents" are confusing?
"disease" as a parent is not specific enough to be useful. Every term should at least be in either the "human disease" branch or the "non-human animal disease" branch. From a curation perspective, if a term has the parent "disease" or even "human disease", we will have to review this term and find a more specific parent, minimally one of the "high-level classification" terms for human diseases (see list below).
Mondo ID | term name |
---|---|
MONDO:0002409 | auditory system disorder |
MONDO:0002657 | breast disorder |
MONDO:0045024 | cancer or benign tumor |
MONDO:0004995 | cardiovascular disorder |
MONDO:0019040 | chromosomal disorder |
MONDO:0003900 | connective tissue disorder |
MONDO:0004335 | digestive system disorder |
MONDO:0021147 | disorder of development or morphogenesis |
MONDO:0002022 | disorder of orbital region |
MONDO:0024458 | disorder of visual system |
MONDO:0005151 | endocrine system disorder |
MONDO:0005570 | hematologic disorder |
MONDO:0003847 | hereditary disease |
MONDO:0043543 | iatrogenic disease |
MONDO:0700007 | idiopathic disease |
MONDO:0005046 | immune system disorder |
MONDO:0005550 | infectious disease |
MONDO:0021166 | inflammatory disease |
MONDO:0002051 | integumentary system disorder |
MONDO:0005066 | metabolic disease |
MONDO:0044970 | mitochondrial disease |
MONDO:0006858 | mouth disorder |
MONDO:0002081 | musculoskeletal system disorder |
MONDO:0005071 | nervous system disorder |
MONDO:0005137 | nutritional disorder |
MONDO:0700003 | obstetric disorder |
MONDO:0100366 | occupational disorder |
MONDO:0024623 | otorhinolaryngologic disease |
MONDO:0100086 | perinatal disease |
MONDO:0029000 | poisoning |
MONDO:0021669 | post-infectious disorder |
MONDO:0002025 | psychiatric disorder |
MONDO:0043459 | radiation-induced disorder |
MONDO:0005039 | reproductive system disorder |
MONDO:0005087 | respiratory system disorder |
MONDO:0002254 | syndromic disease |
MONDO:0043839 | ulcer disease |
MONDO:0044991 | upper digestive tract disorder |
MONDO:0002118 | urinary system disorder |
> From a curation perspective, if a term has the parent "disease" or even "human disease", we will have to review this term and find a more specific parent, minimally one of the "high-level classification" terms for human diseases (see list below)
I think our SOP should really include a moment of pause here (this is exactly why I was asking). I personally hoped the "parent" was merely a suggestion that is always carefully reviewed during migration. This is why I was not originally worried about including very high-level parents: I knew someone was looking at them anyway and would throw them out.
I see where you are coming from, and I want to reassure you that a curator reviews the list of suggested parents before creating the new terms. In some cases, 5 parents are suggested (IDs separated by a pipe), so it is a lot of copy/paste and manual removal. It is easy to recognize MONDO:0000001 and remove it from the parents, so it is not a big issue. But since we will never add it as a parent, it is not useful to have it reported (though again, it might not be worth the technical work to exclude it from the parent list).
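If the technical work does turn out to be worthwhile, the exclusion could be a small filter over the pipe-separated parent string. The following is only a minimal sketch in Python: the names `EXCLUDED_PARENTS` and `filter_parents` are hypothetical and not part of the existing pipeline, and the exclude list contains only MONDO:0000001 because that is the only term named so far.

```python
# Hypothetical exclude list; only MONDO:0000001 ("disease") is named in this thread.
EXCLUDED_PARENTS = {"MONDO:0000001"}


def filter_parents(parents: str, excluded=EXCLUDED_PARENTS) -> str:
    """Drop excluded IDs from a pipe-separated string of parent IDs."""
    kept = []
    for parent in parents.split("|"):
        parent = parent.strip()
        # Skip empty fragments (e.g. from an empty cell) and excluded IDs.
        if parent and parent not in excluded:
            kept.append(parent)
    return "|".join(kept)
```

A cell whose only suggested parent is excluded would come back empty, so the curator would still see that a parent needs to be chosen.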
This makes sense now, thank you @sabrinatoro :)
For example, in the slurp/ordo.tsv file, the parents reported for new terms include obsolete or too high-level terms. These terms should not be brought in as parents. Example from slurp/ordo.tsv:
MONDO:0958100 | autoinflammatory syndrome with acne and/or hidradenitis suppurativa | Orphanet:653434 | MONDO:equivalentTo | Autoinflammatory syndrome with acne and/or hidradenitis suppurativa | | | MONDO:8000033|MONDO:0017954|MONDO:0017370
Note: I understand that it might be problematic if only terms that have parents in Mondo are brought in. If that is the case, these terms can still be brought in, but their parents should not be included in the spreadsheet.
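One way to read the note above is "keep every row, but blank out the unwanted parents". A rough sketch of that in Python follows. It assumes the parents sit in the last tab-separated column as pipe-separated IDs, as in the example row. The function name `strip_unwanted_parents` and the `unwanted` set (obsolete plus too high-level IDs) are hypothetical, and how that set would be built is not covered here.

```python
def strip_unwanted_parents(tsv_text: str, unwanted: set, parents_col: int = -1) -> str:
    """Remove unwanted parent IDs from one column while keeping every row."""
    out_lines = []
    for line in tsv_text.splitlines():
        fields = line.split("\t")
        # Keep only parent IDs that are non-empty and not flagged as unwanted.
        parents = [p for p in fields[parents_col].split("|") if p and p not in unwanted]
        fields[parents_col] = "|".join(parents)
        out_lines.append("\t".join(fields))
    return "\n".join(out_lines)
```

Rows whose parents are all unwanted keep their term, with an empty parents cell.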