nicolevasilevsky closed 4 years ago
@nicolevasilevsky Please also see my comment here https://github.com/monarch-initiative/mondo/issues/1216#issuecomment-608283463 Thanks.
@shahim agrees with this
@nicolevasilevsky - this should have been inferred by reasoning. While it's never wrong to assert a correct is-a with evidence, if this was not showing up in Protege under coronavinae infectious disease there is a problem.
Note also the logical definition for SARS/MONDO:0005091 uses http://purl.obolibrary.org/obo/NCBITaxon_694009
however in NCBITaxon, this is the superclass of SARS-CoV-2!
this will mean that SARS will be a superclass of COVID-19.
This is not what we want as SARS is actually a sibling of COVID-19, caused by SARS-CoV-1. This does not appear to be in NCBITaxon
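The inference described above can be reproduced with a toy model. This is only a sketch: it assumes that COVID-19 is logically defined by NCBITaxon:2697049 (SARS-CoV-2), and the dictionaries and function names are made up for illustration rather than taken from Mondo or any reasoner API.

```python
# Toy is-a edges between taxa (child -> parent), as described in this thread.
TAXON_PARENT = {
    "NCBITaxon:2697049": "NCBITaxon:694009",  # SARS-CoV-2 sits under 694009
}

# Toy logical definitions: disease label -> causative taxon.
# The COVID-19 entry is an assumption made for this illustration.
DISEASE_TAXON = {
    "SARS": "NCBITaxon:694009",
    "COVID-19": "NCBITaxon:2697049",
}


def taxon_lineage(taxon):
    """Return the taxon itself followed by all of its ancestors."""
    lineage = [taxon]
    while taxon in TAXON_PARENT:
        taxon = TAXON_PARENT[taxon]
        lineage.append(taxon)
    return lineage


def inferred_superclasses(disease):
    """Diseases whose causative taxon is an ancestor of this disease's taxon."""
    lineage = taxon_lineage(DISEASE_TAXON[disease])
    return sorted(
        other
        for other, taxon in DISEASE_TAXON.items()
        if other != disease and taxon in lineage
    )


print(inferred_superclasses("COVID-19"))  # ['SARS']
print(inferred_superclasses("SARS"))      # []
```

A real OWL reasoner does far more than this, but the subsumption that matters here follows the same pattern: the disease hierarchy mirrors the taxon hierarchy used in the logical definitions.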
This is an example of the prototype problem. We see this a lot with genetic diseases, but it also occurs with infectious diseases and taxonomy. At t0, we have a disease FOO. At t1, we have a new similar disease called FOO-2. The string "FOO" then becomes ambiguous. Does it mean the superclass of the original disease and FOO-2? Or is it the sibling of FOO-2, i.e. the original disease, which may sometimes go by FOO-1?
We need to be very careful here. Opening for further review
What's the correct nomenclature? Coronavinae or Coronavirinae? Our label differs from NCBI's.
id: MONDO:0005719
name: coronavinae infectious disease
def: "Virus diseases caused by the coronavirus genus. Some specifics include transmissible enteritis of turkeys (enteritis, transmissible, of turkeys); feline infectious peritonitis; and transmissible gastroenteritis of swine (gastroenteritis, transmissible, of swine)." [MESH:D018352]
the definition is not wrong at all... however, I think there are some more notable specifics to call out!
I think I classified this under the wrong parent, I think it should be Coronaviridae, see: https://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?id=2697049
OK, I am working on a branch now, hold off on this
But yes, according to NCBITaxon:
Really the nomenclature and logical defs should be dp template driven!!
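As a rough illustration of what "template driven" could mean here, the label and logical definition can both be derived from the taxon, so they cannot drift apart. This is a plain-Python sketch, not the actual design-pattern tooling; the function name is hypothetical, and the stanza layout simply follows the OBO-format terms quoted in this thread.

```python
def infectious_disease_stanza(mondo_id, taxon_id, taxon_label):
    """Build an OBO-format stanza whose name and logical definition
    are both generated from the same taxon label (hypothetical helper)."""
    lines = [
        "[Term]",
        f"id: {mondo_id}",
        f"name: {taxon_label} infectious disease",
        "intersection_of: MONDO:0005550 ! infectious disease",
        f"intersection_of: realized_in_response_to_stimulus {taxon_id} ! {taxon_label}",
    ]
    return "\n".join(lines)


print(
    infectious_disease_stanza(
        "MONDO:0020753", "NCBITaxon:2501931", "Orthocoronavirinae"
    )
)
```

With this approach a misspelled label can only come from the taxon label itself, which is easier to check against NCBITaxon than a hand-typed name.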
I'm also adding
id: MONDO:0020753
name: Orthocoronavirinae infectious disease
def: "Infectious disease caused by viruses in the subfamily Orthocoronavirinae (coronaviruses). In humans, coronaviruses cause respiratory tract infections that can be mild, such as some cases of the common cold (among other possible causes, predominantly rhinoviruses), and others that can be lethal, such as SARS, MERS, and COVID-19." [MONDO:cjm, https://en.wikipedia.org/wiki/Coronavirus, https://github.com/monarch-initiative/mondo/issues/1355]
synonym: "coronavirus infectious disease" EXACT [https://en.wikipedia.org/wiki/Coronavirus]
is_a: MONDO:0005108 {source="DOID:0080599", source="MONDO:redundant"} ! viral infectious disease
is_a: MONDO:0005718 {source="NCBITaxon:2501931"} ! Coronaviridae infectious disease
xref: DOID:0080599 {source="MONDO:equivalentTo"} ! coronavirus infection
intersection_of: MONDO:0005550 ! infectious disease
intersection_of: realized_in_response_to_stimulus NCBITaxon:2501931 ! Orthocoronavirinae
is the syn correct?
I think that synonym works. There isn't a synonym in NCBI, but if I search on 'coronavirus', it brings up the three subclasses of Orthocoronavirinae (alpha-, beta- and gamma-coronavirus). It may not be the official synonym, though.
Note also the logical definition for SARS/MONDO:0005091 uses http://purl.obolibrary.org/obo/NCBITaxon_694009
however in NCBITaxon, this is the superclass of SARS-CoV-2!
this will mean that SARS will be a superclass of COVID-19.
This is not what we want as SARS is actually a sibling of COVID-19, caused by SARS-CoV-1. This does not appear to be in NCBITaxon
This is an example of the prototype problem. We see this a lot with genetic diseases, but it also occurs with infectious diseases and taxonomy. At t0, we have a disease FOO. At t1, we have a new similar disease called FOO-2. The string "FOO" then becomes ambiguous. Does it mean the superclass of the original disease and FOO-2? Or is it the sibling of FOO-2, i.e. the original disease, which may sometimes go by FOO-1?
We need to be very careful here. Opening for further review
I absolutely agree! We can't say "sars1" is the specific 2002/3 strain/disease AND the parent species for all SARS-related strains one of which is "sars2" of the 2019 pandemic strain/disease. We are missing a "type strain" for sars1... (I used to work with eukaryotic taxonomies and there's never a taxon without a type...)
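To make the "missing type strain" point concrete, here is a small sketch of how a separate node for the 2002/3 virus would turn SARS and COVID-19 into siblings. The identifier `EXAMPLE:sars-cov-1` is hypothetical (no such NCBITaxon entry is claimed to exist), and the COVID-19 definition is again an assumption made for illustration.

```python
# Toy taxonomy (child -> parent). EXAMPLE:sars-cov-1 is a hypothetical
# placeholder for the missing "type strain"; it is not a real identifier.
TAXON_PARENT = {
    "NCBITaxon:2697049": "NCBITaxon:694009",   # SARS-CoV-2
    "EXAMPLE:sars-cov-1": "NCBITaxon:694009",  # hypothetical SARS-CoV-1 node
}

# Toy logical definitions: disease label -> causative taxon (assumed).
DISEASE_TAXON = {
    "SARS": "EXAMPLE:sars-cov-1",
    "COVID-19": "NCBITaxon:2697049",
}


def lineage(taxon):
    """Return the taxon itself followed by all of its ancestors."""
    result = [taxon]
    while taxon in TAXON_PARENT:
        taxon = TAXON_PARENT[taxon]
        result.append(taxon)
    return result


def subsumes(general, specific):
    """True if the taxon defining `general` is an ancestor-or-self of
    the taxon defining `specific`."""
    return DISEASE_TAXON[general] in lineage(DISEASE_TAXON[specific])


def are_siblings(a, b):
    """True if neither disease subsumes the other but their taxa share a parent."""
    same_parent = TAXON_PARENT.get(DISEASE_TAXON[a]) == TAXON_PARENT.get(DISEASE_TAXON[b])
    return same_parent and not subsumes(a, b) and not subsumes(b, a)


print(subsumes("SARS", "COVID-19"))      # False
print(are_siblings("SARS", "COVID-19"))  # True
```

Whether such a node should live in NCBITaxon or be handled some other way is exactly the open question in this thread; the sketch only shows why a distinct taxon for the original virus removes the unwanted subsumption.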
Been banging my head against a wall with this for weeks... keep coming back to it.
Same story for MERS, btw.
I sent an email to "Biocurators" and @paolaroncaglia pointed me to this ticket. Happy to help resolve it - but I have no power over the authoritative DBs...
Btw: this page is useful for classification history: https://talk.ictvonline.org/taxonomy/p/taxonomy-history?taxnode_id=20040588&src=NCBI&ictv_id=20040588
Same here: following UniProt, Reactome is using NCBI 694009 as the ID for SARS-CoV-1 RNA and proteins, and NCBI 2697049 as the ID for SARS-CoV-2, even though the NCBI taxonomy appears to make 2697049 a child of 694009, not a sibling or a cousin.
But there is also the rationalization that we're annotating functions of gene products, and SARS-CoV-1 and SARS-CoV-2 are undoubtedly different taxa even if the exact formal nature of the difference is still under discussion (guessing from the ICTV history web page Birgit pointed to).
In case it's useful, the issue may be raised with NCBI Taxonomy by contacting the NCBI Help Desk, or even the GenBank Help (same page), who have a direct channel with the Taxonomy team. (Addendum: I see that more direct contact info has now been provided on the ISB mailing list.)
I got in touch with them yesterday, discussions ensue...
Closing this. Please reopen if needed
Mondo term (ID and Label): 'severe acute respiratory syndrome'
Suggested revision and reasons: should be a child of MONDO_0005719 'coronavinae infectious disease'
related to #1341