EnvironmentOntology / envo

A community-driven ontology for the representation of environments
http://www.environmentontology.org
Creative Commons Zero v1.0 Universal

Improve definitions of existing subclasses of soil #1375

Open cmungall opened 1 year ago

cmungall commented 1 year ago

I am working on OAK text definition validation (https://github.com/INCATools/ontology-access-kit/releases/tag/v0.1.46). I would like to try this on the soil defs in ENVO.

I chose this as definitions seem to be a blocker for #825 yet our existing definitions are not good.

Looking at existing subclasses of soil:

1. Term is included

e.g.

ENVO:00005762 ! chloropicrin enriched soil "A portion of chloropicrin enriched soil is a portion of soil with elevated levels of chloropicrin."

ENVO:00005789 ! bluegrass field soil "Bluegrass field soil is a soil which is found in a field of Kentucky Bluegrass (Poa pratensis)."

Including the definiendum is bad practice

2. Portion as genus

ENVO:00005777 ! steppe soil "A portion of soil which is found in a steppe."

This violates S11 and the ENVO guidelines, since the parent is not "portion of soil"

3. Soil as genus

ENVO:00005772 ! orchard soil "Soil in which trees from an orchard grow."

ENVO:00005765 ! frozen compost soil "Compost soil which is frozen."

ENVO:00002145 ! chromate contaminated soil "Soil which has elevated concentrations of chromate."

I vote for this as the standard form

Note that technically it violates https://github.com/EnvironmentOntology/envo/wiki/Creating-good-definitions, so we need to add an exception clause to the guidelines such that the copula is omitted for mass nouns.

4. Plural genus

ENVO:00002234 ! acrisol "Acrisols are soils that have a higher clay content in the subsoil than in the topsoil as a result of pedogenetic processes (especially clay migration) leading to an argic subsoil horizon. Acrisols have in certain depths a low base saturation and low-activity clays."

I vote against this, even though this is how most authoritative sources may list it

5. Other issues

ENVO:00005749 ! farm soil "A portion of soil which is part of a cropland or a rangeland biome."

Labels should match definitions: here the label says "farm" but the definition refers to a cropland or rangeland biome.
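The first four patterns above can be approximated with simple string checks. The following is only a rough sketch of that idea, not OAK's validator and not its API: the function name `check_definition`, the problem labels, and the regular expressions are all hypothetical, and the plural check only looks for the parent label followed by "that" or "which".

```python
import re
from typing import List


def check_definition(label: str, definition: str, parent: str = "soil") -> List[str]:
    """Return heuristic style problems for a textual definition.

    Hypothetical helper for illustration only; it is not part of OAK.
    """
    problems = []
    lowered = definition.strip().lower()

    # Pattern 1: the definiendum (the term's own label) appears in the definition.
    if label.lower() in lowered:
        problems.append("includes term")

    # Pattern 2: the definition opens with "portion of ..." instead of the genus.
    if re.match(r"(a |an )?portion of ", lowered):
        problems.append("portion as genus")

    # Pattern 4: the genus is given in the plural, e.g. "soils that ...".
    if re.search(rf"\b{re.escape(parent.lower())}s\s+(that|which)\b", lowered):
        problems.append("plural genus")

    return problems


examples = {
    "steppe soil": "A portion of soil which is found in a steppe.",
    "orchard soil": "Soil in which trees from an orchard grow.",
    "acrisol": "Acrisols are soils that have a higher clay content in the subsoil than in the topsoil.",
}

for term, text in examples.items():
    print(term, check_definition(term, text))
```

Under these assumptions, a definition in the "soil as genus" form (pattern 3) comes back with no problems, while the other forms are flagged. A mismatch between label and definition, as in the farm soil case, is not something a string check like this can detect.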

dr-shorthair commented 1 year ago

In the #825 proposal https://docs.google.com/spreadsheets/d/1xe_4QWEW8JwxVcz0aRp-dr6UTvjpvJ03llGIN7T5nLU I have tried to extract the essential definition from the wordier original (which I moved to a comment). Here are a few examples:

| Order | Definition |
| --- | --- |
| Anthroposol | Soil which results from human activities |
| Fusic Anthroposol | Anthroposol which has a surface layer at least 0.3 m thick that shows evidence of burnt peat and comprises ≥20% of fusic soil material. |
| Cumulic Anthroposol | Anthroposol which has at least 0.3 m of human-deposited materials or the accumulation of shells and organic materials to form middens. |
| Hortic Anthroposol | Anthroposol which has had additions of organic residues that have been incorporated into the soil and obliterated pre-existing pedological features. |
| Arenosol | Soils which have within the upper 1.0 m of the soil profile: 1. A sandy field texture; and 2. No layer or horizon with a clay content that exceeds 15%; and 3. ≤10% of coarse fragments and/or hard segregations >2 mm in size; and 4. No hard layers |

However, some of the original definitions start by excluding some existing classes. For example - row 71 Dermosol is "Soils other than Vertosols, Hydrosols, Calcarosols and Ferrosols which: ..." How does this get written in the definition?

wdduncan commented 1 year ago

I like the approach of using OAK to check the textual definitions. Should we include it as part of the release process?