DataONEorg / sem-prov-ontologies

Ontologies focused on scientific observations and scientific workflow provenance.
https://ontologies.dataone.org

ECSO_00001122 TIC has synonym of DIC #79

Open cgries opened 3 years ago

cgries commented 3 years ago

ECSO_00001122 'Freshwater Total Inorganic Carbon Concentration' lists 'DIC' as a synonym, which would be dissolved inorganic carbon. My understanding is that the two are not synonyms: DIC is determined after filtering the water, while TIC also includes particulate carbon. In the ECSO_00001122 record, 'DIC' appears under both altLabel and has_exact_synonym.
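For anyone who wants to look for the same kind of repetition on other terms, here is a minimal sketch in plain Python. The `record` dictionary is a hand-written, hypothetical stand-in for the annotations described above, not an export from ECSO, and `repeated_values` is an illustrative helper name.

```python
# Hypothetical, hand-written snapshot of the annotations on ECSO_00001122.
# A real check would read these values from the ontology file instead.
record = {
    "label": ["Freshwater Total Inorganic Carbon Concentration"],
    "altLabel": ["DIC"],
    "has_exact_synonym": ["DIC"],
}


def repeated_values(annotations):
    """Map each literal to the properties carrying it, keeping only repeats."""
    seen = {}
    for prop, values in annotations.items():
        for value in values:
            seen.setdefault(value, set()).add(prop)
    return {v: sorted(props) for v, props in seen.items() if len(props) > 1}


duplicates = repeated_values(record)
print(duplicates)
```

With the snapshot above, the only literal reported is 'DIC', carried by both altLabel and has_exact_synonym.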

mbjones commented 3 years ago

@mpsaloha @samanthacsik can you clarify how terminology issues like this pointed out by @cgries will be resolved? Assuming a change is needed, how does that impact backwards compatibility, especially for documents that may have annotated against the term with its original synonymy?

mpsaloha commented 3 years ago

Hi Corinna,

Thanks for pointing out this inconsistency in ECSO!

I did find references in the literature to "Total Dissolved Inorganic Carbon", indicating that the descriptor 'Total' does not necessarily imply that Particulate components are included. I'd agree, though, that in the context of an annotation to "Freshwater Total Inorganic Carbon Concentration" it probably does carry that implication, since the word "Dissolved" isn't there.

I noticed that Margaret was the creator of the synonymic annotations of "DIC" to the "Freshwater Total Inorganic Carbon Concentration" measurement type in ECSO, but in her rdfs:comment she mentions "Total DIC". So I wonder if she was referencing a specific case where the measurement was in fact a "Total DIC"? I am cc'ing Margaret here so she can weigh in.

In any case, I think we should at least correct the annotation on the term to "TIC" (as you suggest; but see below), revise the associated rdfs:comment, and then create a new sibling Class in ECSO if Margaret's case really did involve a "Total DIC". We might also consider whether assigning "TIC" as even a (weak) synonym is too strong, since the abbreviation loses the context of Freshwater, as well as Aquatic (TIC and TOC are also measured in Soil and other substrates). That said, these synonymic assertions are made with Annotation Properties, so they have no logical implications; they are just conveniences for discovery and interpretation that application developers can leverage.
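To make the proposed correction concrete, here is a minimal sketch that treats the annotations as plain (subject, property, literal) tuples. It is an illustration under assumptions, not the ECSO editing workflow: the starting triples are hand-written, `replace_synonym` is a hypothetical helper, and using `has_related_synonym` for the weaker "TIC" assertion is one possible choice rather than a decision made in this thread.

```python
TERM = "odo:ECSO_00001122"

# Hypothetical starting annotations, written as (subject, property, literal).
annotations = {
    (TERM, "altLabel", "DIC"),
    (TERM, "has_exact_synonym", "DIC"),
}


def replace_synonym(triples, term, old, new, new_property):
    """Drop every annotation of `term` whose literal is `old`, then add `new`."""
    kept = {t for t in triples if not (t[0] == term and t[2] == old)}
    kept.add((term, new_property, new))
    return kept


# Assumption: "TIC" is recorded as a related (weaker) synonym, not an exact one.
corrected = replace_synonym(annotations, TERM, "DIC", "TIC", "has_related_synonym")
print(sorted(corrected))
```

Only annotation triples are touched here, so no class axioms change, which matches the point that these assertions carry no logical implications.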

Relative to Matt's questions:

ECSO has been a small but ongoing project, largely carried forward by efforts from Margaret, Stephen Chong, Matt, myself, and a number of interns and fellows (currently Samantha Csik). We haven't formalized mechanisms for reporting issues, or the ontology design pattern to designate when terms are obsoleted (deprecated), revised, or replaced. It has been on my back-burner for a long while to do this. There are some useful precedents being followed in other ontology frameworks that we can learn from. And now you've given us some motivation to bring it forward as an issue.

As for what to do about external annotations ("usages") to that and other terms in ECSO that are deprecated or revised, and specifically usages that we cannot identify from our own repositories like DataONE: the Identifier will still exist, but the term will carry extra information in its Annotation properties explaining the changes. So anyone who dereferences the term will be able to understand what changed. This is similar to what the Gene Ontology framework does; they acknowledge that there are probably erroneous annotations to GO for a variety of reasons that are out of their control. I know there are a few additional wrinkles to consider here, but I am reminded that "perfection is the enemy of the good."
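As a rough illustration of what that extra information could look like, here is a sketch that builds the annotation triples for a retired term. It assumes the common OWL/OBO convention of `owl:deprecated` plus `obo:IAO_0100001` ("term replaced by"); ECSO has not settled on a pattern, and the term identifiers and the `deprecation_annotations` helper below are made up for the example.

```python
def deprecation_annotations(term, reason, replaced_by=None):
    """Build annotation triples that explain why `term` was retired.

    The identifier itself is kept, so existing usages still dereference.
    """
    triples = [
        (term, "owl:deprecated", "true"),
        (term, "rdfs:comment", reason),
    ]
    if replaced_by is not None:
        # obo:IAO_0100001 is the OBO "term replaced by" annotation property.
        triples.append((term, "obo:IAO_0100001", replaced_by))
    return triples


# Hypothetical identifiers, used only for illustration.
notes = deprecation_annotations(
    "odo:ECSO_EXAMPLE_OLD",
    "Synonymy revised; see replacement term.",
    replaced_by="odo:ECSO_EXAMPLE_NEW",
)
for triple in notes:
    print(triple)
```

An application that finds `owl:deprecated` on a term it has annotated against can then follow the replacement pointer, or at least show the comment to the user.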

Thanks again for reporting this, Corinna! It would be fantastic if some domain experts would review ECSO for such errors; but at the least I think we can provide a better framework for reporting these.

cheers, Mark

mobb commented 3 years ago

I'll dig into some old notes. I think that synonym is an error for the class as labeled. However, the problem might be just in the labels, with the hierarchy itself being fine. This part of ECSO needs work; e.g., there is also a second class labeled DIC, odo:ECSO_00000632, that Mark put in.

We definitely need the particulate inorganic fractions represented too (e.g., for precipitates and coccoliths in seawater); I don't see those.

Re status of ECSO: I agree with Mark. It's a work in progress, and not yet released (although starting to be used).

Margaret

Margaret O'Brien ORCID: 0000-0002-1693-8322 Information Management Marine Science Institute, UCSB Santa Barbara, CA 93106 805-893-2071 (voice) http://environmentaldatainitiative.org http://sbc.marinebon.org http://sbc.lternet.edu
