EnvironmentOntology / envo

A community-driven ontology for the representation of environments
http://www.environmentontology.org
Creative Commons Zero v1.0 Universal

NTRs related to MIxS sediment checklist #897

Open simroux opened 4 years ago

simroux commented 4 years ago

Based on work done in the NMDC Ontology workshop. A substantial part of these terms would also apply to the MIxS soil checklist, which @ukaraoz is curating. To order the terms, we looked at which fields were most often filled in in submissions to ENA for the soil and sediment checklists (thanks to data provided by @josieburgin). The list below is ordered from the most frequently to the least frequently used terms.

| MIxS term | MIxS values | MIxS description | Ontology term label | Ontology | Ontology ID | Ontology PURL | Notes |
| --- | --- | --- | --- | --- | --- | --- | --- |
| depth | key | Depth is defined as the vertical distance below local surface, e.g. for sediment or soil samples depth is measured from sediment or soil surface, respectively. Depth can be reported as an interval for subsurface samples | depth in soil | | to be created | | |
| salinity | key | Salinity is the total concentration of all dissolved salts in a water sample. While salinity can be measured by a complete chemical analysis, this method is difficult and time consuming. More often, it is instead derived from the conductivity measurement. This is known as practical salinity. These derivations compare the specific conductance of the sample to a salinity standard such as seawater | | | to be created | | |
| methane | key | Methane (gas) amount or concentration at the time of sampling | | | to be created | | |
| tidal_stage | key | Stage of tide | | | to be created | | Should be an ENVO process? |
| tidal_stage | low tide | | | | to be created | | |
| tidal_stage | ebb tide | | | | to be created | | |
| tidal_stage | flood tide | | | | to be created | | |
| tidal_stage | high tide | | | | to be created | | |
| tot_org_carb | key | Definition for soil: total organic carbon content of the soil; definition otherwise: total organic carbon content | | | to be created | | Similar to ENVO_09000008, "concentration of carbon atom in soil". We need to add all of these composite terms to ENVO: DOC = dissolved organic carbon (in soil, in water), DIC = dissolved inorganic carbon (in soil, in water), TOC = total organic carbon (in soil, in water), TIC = total inorganic carbon (in soil, in water), TC = total carbon (in soil, in water) |
| pH | key | pH measurement of the sample, or liquid portion of sample, or aqueous phase of the fluid | | CHMO | CHMO:0002354 | http://purl.obolibrary.org/obo/CHMO_0002354 | Should be added to ENVO? |
| oxy_stat_samp | key | Oxygenation status of sample | | | to be created | | |
| oxy_stat_samp | aerobic | | | | to be created | | Add under ENVO_00002007? |
| oxy_stat_samp | anaerobic | | anaerobic sediment | ENVO | ENVO:00002045 | http://purl.obolibrary.org/obo/ENVO_00002045 | |
| oxy_stat_samp | other | | | | to be created | | |
| nitro | key | Concentration of nitrogen (total) | | | to be created | | |
| sulfate | key | Concentration of sulfate in the sample | | | to be created | | Should be added to ENVO under PATO:0000033? |
| carb_nitro_ratio | key | Ratio of amount or concentrations of carbon to nitrogen | | | to be created | | |
| redox_potential | key | Redox potential, measured relative to a hydrogen cell, indicating oxidation or reduction potential | | | to be created | | |
| nitrite | key | Concentration of nitrite in the sample | | | to be created | | Should be added to ENVO under PATO:0000033? |
| phosphate | key | Concentration of phosphate | | | to be created | | Should be added to ENVO under PATO:0000033? |
| pressure | key | Pressure to which the sample is subject, in atmospheres | | | to be created | | Should be added to ENVO under PATO:0000033? |
| potassium | key | Concentration of potassium in the sample | | | to be created | | |
| magnesium | key | Concentration of magnesium in the sample | | | to be created | | |
| calcium | key | Concentration of calcium in the sample | | | to be created | | |
| perturbation | key | Type of perturbation, e.g. chemical administration, physical disturbance, etc., coupled with perturbation regimen including how many times the perturbation was repeated, how long each perturbation lasted, and the start and end time of the entire perturbation period; can include multiple perturbation types | exposure stressor | EXO | EXO:0000000 | http://purl.obolibrary.org/obo/ExO_0000000 | Should be added to ENVO? |
| water content | key | Water content measurement | soil water content | EO | EO:0007259 | http://purl.obolibrary.org/obo/EO_0007259 | Should be added to ENVO? |
| silicate | key | Concentration of silicate | | | to be created | | Should be added to ENVO under PATO:0000033 |
| sodium | key | Sodium concentration in the sample | | | to be created | | Should be added to ENVO under PATO:0000033 |
| sulfide | key | Concentration of sulfide in the sample | | | to be created | | Should be added to ENVO under PATO:0000033 |
| turbidity | key | Measure of the amount of cloudiness or haziness in water caused by individual particles | | | to be created | | Add under PATO:0001018 |
| org_carb | key | Concentration of organic carbon | | | to be created | | |
| org_matter | key | Concentration of organic matter | | | to be created | | |
| org_nitro | key | Concentration of organic nitrogen | | | to be created | | |
| diss_org_carb | key | Concentration of dissolved organic carbon in the sample, liquid portion of the sample, or aqueous phase of the fluid | | | to be created | | |
| diss_org_nitro | key | Dissolved organic nitrogen concentration, measured as: total dissolved nitrogen - NH4 - NO3 - NO2 | | | to be created | | |
| diss_oxygen | key | Concentration of dissolved oxygen | | | to be created | | |

pbuttigieg commented 4 years ago

Thanks! We also should use this as an opportunity to spot dangerous semantic variation of the same label across packages.

pbuttigieg commented 4 years ago

@simroux when we create the ontology terms to represent these, I suppose they will be sediment-specific, right? So "depth" is actually "depth in the sediment"? The table above suggests this is "depth in soil", which isn't the same thing (i.e. it wouldn't be the same ontology class).

This also raises an issue about domain conflict that @ramonawalls can perhaps help resolve with BCO:

ENVO can deal with properties of environmental entities. "Depth of sediment" is fine as are things like "nitrate concentration in sediment". However, what seems to be needed in the "depth" case (ignoring the soil/sediment conflict) is "depth of sampling event in a sediment column" which is more a BCO field. @ramonawalls thoughts?

simroux commented 4 years ago

@pbuttigieg: These are two good questions. For question #1: I would intuitively say sediments are a subset of soils; however, this is not what is currently reflected in ENVO, where soil and sediment are two distinct classes at the same rank (under "Environmental material"), and there are certainly some very good reasons and expert opinions behind this. So then yes, while I initially tried to re-use terms associated with soil (e.g. "depth in soil"), these should be adapted to sediment (i.e. "depth in the sediment").

For BCO vs ENVO, I agree that the "depth" will most often be associated with a sample or measurement (I could imagine some in situ probe measurements that would need to be associated with a "depth in sediment" information). So we may need both BCO and ENVO ??

pbuttigieg commented 4 years ago

Building on the above - many of these seem to be package agnostic:

"concentration of magnesium" inheres in any sampled material, not just sediment.

Two routes present themselves:

Option 1

We use BCO with its notion of "sample" to compose things like "concentration of magnesium in a sample of environmental material", axiomatised along these lines:

'concentration of'
 and ('inheres in' some 
    (magnesium
     and ('part of' some (BCO:sample and 'composed primarily of' some 'environmental material'))))

Here, the ENVO medium/material field in the MIxS core checklist would be parsed to specify the environmental material.
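As a rough illustration of how the Option 1 pattern could be templated, here is a minimal Python sketch that fills the class expression above for any chemical entity and material. The function name `concentration_axiom` and the use of plain labels instead of resolved IRIs are assumptions made for illustration; a real pipeline would more likely use an ontology templating tool and proper term IDs.

```python
def concentration_axiom(chemical: str, material: str = "environmental material") -> str:
    """Build a Manchester-syntax-style class expression following the
    pattern drafted in this thread. Labels are used as-is (no IRI lookup)."""

    def quote(label: str) -> str:
        # Multi-word labels are single-quoted, as in the draft axiom.
        return f"'{label}'" if " " in label else label

    return (
        "'concentration of'\n"
        " and ('inheres in' some\n"
        f"    ({quote(chemical)}\n"
        "     and ('part of' some (BCO:sample and "
        f"'composed primarily of' some {quote(material)}))))"
    )


# Example: the magnesium case from the thread, then a sediment-specific one.
print(concentration_axiom("magnesium"))
print(concentration_axiom("sulfate", "sediment"))
```

The `material` argument is where a value parsed from the medium/material field would be plugged in, so the same template could serve several packages without minting a new term for each.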

Option 2

For each package, we pre-compose specific IRIs for each property described by each parameter. When we RDFise MIxS, this will mean that each field in each package will have its own IRI.
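To make the consequence of Option 2 concrete, here is a minimal Python sketch that mints one IRI per (package, field) pair. The base namespace `https://example.org/mixs/`, the function name `precompose_iris`, and the IRI layout are all hypothetical; nothing in this thread fixes how such IRIs would actually look.

```python
from urllib.parse import quote

# Hypothetical namespace, used only for illustration.
BASE = "https://example.org/mixs/"


def precompose_iris(package_fields: dict[str, list[str]], base: str = BASE) -> dict[tuple[str, str], str]:
    """Return one IRI for every field of every package."""
    iris = {}
    for package, fields in package_fields.items():
        for field in fields:
            # Percent-encode so labels containing spaces still yield valid IRIs.
            iris[(package, field)] = f"{base}{quote(package)}/{quote(field)}"
    return iris


# Example with a few fields from the table in this issue.
iris = precompose_iris({
    "sediment": ["depth", "magnesium", "tidal_stage"],
    "soil": ["depth", "magnesium"],
})
for key, iri in sorted(iris.items()):
    print(key, iri)
```

Under this scheme "depth" in the sediment package and "depth" in the soil package get distinct IRIs, and the number of IRIs equals the total number of package-field pairs.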

Thoughts @ramonawalls @cmungall ?

pbuttigieg commented 4 years ago

> So then yes, while I initially tried to re-use terms associated with soil (e.g. "depth in soil"), these should be adapted to sediment (i.e. "depth in the sediment").

Thanks for the clarification @simroux - this is what I meant above by some of the 'dangerous semantic variation' in MIxS. This wasn't a big issue in the past, but now when we're trying to get more organised and precise, we'll need to clean this up.

> For BCO vs ENVO, I agree that the "depth" will most often be associated with a sample or measurement (I could imagine some in situ probe measurements that would need to be associated with a "depth in sediment" information). So we may need both BCO and ENVO ??

It's likely we'll need a combination of ontologies to handle the semantics implicit in several MIxS terms, but don't worry too much - this is normal and healthy in the OBO world as we don't want to try to build one ontology that covers everything. OBO ontologies interoperate and can be spliced together as needed.

@cmungall @ramonawalls An OBO application ontology for MIxS may not be a bad idea here, so we get the IDs and RDF we need to take MIxS forward in an interoperable way.

simroux commented 4 years ago

Cleaning up these types of issues would be great :-) From my outsider perspective, option 1 seems the best as, once this would be in place, you could expand it to new compounds / measurements or new environments relatively "easily", but I've no idea how more complex this would be in terms of implementation.

pbuttigieg commented 4 years ago

> From my outsider perspective, option 1 seems the best as, once this would be in place, you could expand it to new compounds / measurements or new environments relatively "easily", but I've no idea how more complex this would be in terms of implementation.

I'm leaning that way too. It's a good segue into more efficient and structured use of reference ontologies, and it avoids massively inflating the number of pre-composed terms.

A potential issue with my draft approach for option 1 is that there can be multiple material terms in the 3rd slot (the medium/material field).

cmungall commented 4 years ago

I don't think we need to bring in BCO classes here

dr-shorthair commented 4 years ago

Hmm. Not all soils are sediments in the classical sense. Organic soils in particular. 'Sediment' has a genetic feel about it (related to sedimentation?) while soils are formed through a variety of processes, some involving transport, but others not.

Somewhat related is this long-outstanding NTR: https://github.com/EnvironmentOntology/envo/issues/825, the narrative for which now includes the textual definitions of the Australian soil orders.