EnvironmentOntology / envo

A community-driven ontology for the representation of environments
http://www.environmentontology.org
Creative Commons Zero v1.0 Universal

NTR: Australian Soil Classification #825

Open dr-shorthair opened 5 years ago

dr-shorthair commented 5 years ago

New issue to follow up topic raised in https://github.com/EnvironmentOntology/envo/issues/333#issuecomment-370286803

The Australian Soil Classification [1] has the following top-level orders:

AN Anthroposols
CA Calcarosols
CH Chromosols
DE Dermosols
FE Ferrosols
HY Hydrosols
KA Kandosols
KU Kurosols
OR Organosols
PO Podosols
RU Rudosols
SO Sodosols
TE Tenosols
VE Vertosols

A turtle file containing the definitions from [1] expressed as skos:Concepts is provided [2], which includes a license from CSIRO Publishing.

[1] http://www.publish.csiro.au/book/7428

[2] ASC-orders.ttl.txt

dr-shorthair commented 4 years ago

Since soils are being discussed again (#895, #894), perhaps this one could get a kick along. Here's the key information about the 'orders' in the Australian Soil Classification:

Name code definition
Anthroposols AN Soils resulting from human activities which have led to a profound modification, truncation or burial of the original soil horizons, or the creation of new soil parent materials by a variety of mechanical means. Where burial of a pre-existing soil is involved, the anthropic materials must be 0.3 m or more thick. Pedogenic features may be the result of in situ processes (usually the minimal development of an A1 horizon, sometimes the stronger development of typical soil horizons) or the result of pedogenic processes prior to modification or placement (ie. the presence of identifiable pre-existing soil material).
Calcarosols CA Soils that are calcareous throughout the solum - or calcareous at least directly below the A1 or Ap horizon, or a depth of 0.2 m (whichever is shallower). Carbonate accumulations must be judged to be pedogenic (either current or relict), and the soils do not have clear or abrupt textural B horizons. Hydrosols, Organosols and Vertosols are excluded.
Chromosols CH Soils other than Hydrosols with a clear or abrupt textural B horizon and in which the major part of the upper 0.2 m of the B2 horizon (or the major part of the entire B2 horizon if it is less than 0.2 m thick) is not sodic and not strongly acid. Soils with strongly subplastic upper B2 horizons are also included even if they are sodic.
Dermosols DE Soils other than Vertosols, Hydrosols, Calcarosols and Ferrosols which: (i) Have B2 horizons with structure more developed than weak throughout the major part of the horizon, and (ii) Do not have clear or abrupt textural B horizons.
Ferrosols FE Soils other than Vertosols, Hydrosols, and Calcarosols that: (i) Have B2 horizons in which the major part has a free iron oxide content greater than 5% Fe in the fine earth fraction (<2 mm), and (ii) Do not have clear or abrupt textural B horizons or a B2 horizon in which at least 0.3 m has vertic properties.
Hydrosols HY Soils other than Organosols, Podosols and Vertosols in which the greater part of the profile is saturated for at least 2-3 months in most years.
Kandosols KA Soils other than Hydrosols which have all of the following: (i) B2 horizons in which the major part is massive or has only a weak grade of structure. (ii) A maximum clay content in some part of the B2 horizon which exceeds 15% (ie. heavy sandy loam, SL+). (iii) Do not have a tenic B horizon. (iv) Do not have clear or abrupt textural B horizons. (v) Are not calcareous throughout the solum, or below the A1 or Ap horizon or to a depth of 0.2m if the A1 horizon is only weakly developed.
Kurosols KU Soils other than Hydrosols with a clear or abrupt textural B horizon and in which the major part of the upper 0.2 m of the B2 horizon (or the major part of the entire B2 horizon if it is less than 0.2 m thick) is strongly acid.
Organosols OR Soils that are not regularly inundated by saline tidal waters and either: (i) Have more than 0.4 m of organic materials within the upper 0.8 m. The required thickness may either extend down from the surface or be taken cumulatively within the upper 0.8 m. or (ii) Have organic materials extending from the surface to a minimum depth of 0.1 m; these either directly overlie rock or other hard layers, partially weathered or decomposed rock or saprolite, or overlie fragmental material such as gravel, cobbles or stones in which the interstices are filled or partially filled with organic material. In some soils there may be layers of humose and/or melacic horizon material underlying the organic materials and overlying the substrate.
Podosols PO Soils which possess either a Bs horizon (visible dominance of iron compounds), a Bhs horizon (organic-aluminium and iron compounds), or a Bh horizon (organic-aluminium compounds). These horizons may occur singly in a profile or in combination (see Podosol diagnostic horizons).
Rudosols RU This order is designed to accommodate soils that have negligible pedologic organisation. They are usually young soils in the sense that soil forming factors have had little time to pedologically modify parent rocks or sediments. The component soils can obviously vary widely in terms of texture and depth; many are stratified and some are highly saline. Data on some of them are very limited. By definition, these soils will grade to Tenosols, so before deciding on the order check the Tenosol definition as it will often be a matter of judgement as to which order a particular soil is best placed. Hydrosols are excluded on the basis that these will normally show some pedological development eg. mottling. A particular problem with the definition and subdivision of Rudosols is the difficulty in distinguishing between soil and 'non-soil', (see also 'What do we classify?'). In many instances they are classified on the basis of the nature of AC or C horizon materials or other substrates because these are the dominating features of the profile. It also follows that there may be difficulties in deciding if a soil is best classified as an Anthroposol or be regarded as an ‘anthropic’ Rudosol because the little-altered soil parent material may be human-made.
Sodosols SO Soils with a clear or abrupt textural B horizon and in which the major part of the upper 0.2 m of the B2 horizon (or the major part of the entire B2 horizon if it is less than 0.2m thick) is sodic and not strongly acid. Hydrosols and soils with strongly subplastic upper B2 horizons are excluded.
Tenosols TE Soils that do not fit the requirements of any other soil orders and generally with one or more of the following: (i) A peaty horizon. (ii) A humose, melacic or melanic horizon, or conspicuously bleached A2 horizon, which overlies a calcrete pan, hard unweathered rock or other hard materials; or partially weathered or decomposed rock or saprolite, or unconsolidated mineral materials. (iii) A horizons which meet all the conditions for a peaty, humose, melacic or melanic horizon except the depth requirement, and directly overlie a calcrete pan, hard unweathered rock or other hard materials; or partially weathered or decomposed rock or saprolite, or unconsolidated mineral materials. (iv) A1 horizons which have more than a weak development of structure and directly overlie a calcrete pan, hard unweathered rock or other hard materials; or partially weathered or decomposed rock or saprolite, or unconsolidated mineral materials. (v) An A2 horizon which overlies a calcrete pan, hard unweathered rock or other hard materials; or partially weathered or decomposed rock or saprolite, or unconsolidated mineral materials. (vi) Either a tenic B horizon, or a B2 horizon with 15% clay (SL) or less, or a transitional horizon (C/B) occurring in fissures in the parent rock or saprolite which contains between 10 and 50% of B horizon material (including pedogenic carbonate). (vii) A ferric or bauxitic horizon >0.2 m thick. (viii) A calcareous horizon >0.2 m thick.
Vertosols VE Soils with the following: (i) A clay field texture or 35% or more clay throughout the solum except for thin, surface crusty horizons 0.03 m or less thick; and (ii) When dry, open cracks occur at some time in most years. These are at least 5 mm wide and extend upward to the surface or to the base of any plough layer, peaty horizon, self-mulching horizon, or thin, surface crusty horizon; and (iii) Slickensides and/or lenticular peds occur at some depth in the solum. See Comment.
dr-shorthair commented 4 years ago

I wonder if someone could help me tease these apart for axiomatization? @kaiiam

kaiiam commented 4 years ago

Hey @dr-shorthair thanks for bringing this up again. I think this is an ideal place for us to begin implementing ROBOT templates, by creating a soil template and populating it with these terms.

The proposed soil template csv file could look something like the following with columns to correspond to commonly used axioms in the current soil terms:

Ontology ID | Label | definition | definition source | subset | has quality | part of | formed as result of | derives from
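To make the idea concrete, here is a minimal sketch in Python of how such a template could be written out: a human-readable header row, followed by a row of ROBOT template strings, followed by one row per term. The template strings, the choice of annotation properties, the placeholder ID and the example row are all assumptions for illustration; they are not the template that was eventually prepared for ENVO and should be checked against the ROBOT documentation.

```python
import csv
import io

# Human-readable column names, as proposed in the comment above.
COLUMNS = [
    "Ontology ID", "Label", "definition", "definition source", "subset",
    "has quality", "part of", "formed as result of", "derives from",
]

# One ROBOT template string per column (assumed mapping; verify before use).
TEMPLATE_STRINGS = [
    "ID", "LABEL", "A IAO:0000115", "A IAO:0000119", "AI oboInOwl:inSubset",
    "SC 'has quality' some %", "SC 'part of' some %",
    "SC 'formed as result of' some %", "SC 'derives from' some %",
]


def build_template(rows):
    """Return CSV text: header row, template-string row, then one row per term."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(COLUMNS)
    writer.writerow(TEMPLATE_STRINGS)
    for row in rows:
        # Columns missing from a row are written as empty cells.
        writer.writerow([row.get(column, "") for column in COLUMNS])
    return buffer.getvalue()


example_rows = [
    {
        "Ontology ID": "ENVO:99999999",  # placeholder, not a real ENVO term
        "Label": "vertosol",
        "definition source": "http://www.publish.csiro.au/book/7428",
    },
]
template_csv = build_template(example_rows)
```

Most of the relation columns stay empty for a term like this, which is worth keeping in mind when judging whether a template is the right tool.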

@pbuttigieg @cmungall let me know what you think of this idea testing the ENVO ROBOT waters with a soil template.

cmungall commented 3 years ago

@dr-shorthair

> I wonder if someone could help me tease these apart for axiomatization

I may be misunderstanding, but the textual definitions look pretty hard to turn into logical axioms. Is there really a use case for these?

(see the section "beware of over-axiomatization" in this doc -- https://docs.google.com/document/d/1nxMDJkU_eyCdBhy7-RMSLSMEHRsk3udrOKNIJPHZTQ4/edit#heading=h.pariap1j3foa -- this will become a post)

What about a more incremental approach? Get your table in with the only logical axiom being subClassOf some soil. Then we can iterate on logical axioms.

@kaiiam

> let me know what you think of this idea testing the ENVO ROBOT waters with a soil template

I'm not sure this is the best place to put this into practice? I think a lot (most?) of your proposed logical axiom slots would be blank for most of these types?

Sorry to be boring but I would just fire up Protege, add in these terms as subclasses of soil, add the codes as xrefs, and add the definitions, with provenance.
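For illustration only, the end result of that simple approach could look like the output of the Python sketch below: one class per order, a single subClassOf axiom, the ASC code as an xref, and the definition with its source. The term ID is a placeholder, the `ASC:` xref prefix is hypothetical, and `ENVO_00001998` is assumed to be ENVO's soil class (verify before use).

```python
SOIL = "obo:ENVO_00001998"  # assumed to be ENVO 'soil'; check in the ontology

PREFIXES = """\
@prefix obo: <http://purl.obolibrary.org/obo/> .
@prefix oboInOwl: <http://www.geneontology.org/formats/oboInOwl#> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
"""


def escape(text):
    """Escape backslashes and double quotes for a Turtle string literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def order_to_turtle(local_id, label, code, definition, source):
    """Render one soil order as an OWL class with a single subClassOf axiom."""
    return (
        f"obo:{local_id} a owl:Class ;\n"
        f'    rdfs:label "{escape(label)}" ;\n'
        f"    rdfs:subClassOf {SOIL} ;\n"
        f'    oboInOwl:hasDbXref "ASC:{escape(code)}" ;\n'  # hypothetical prefix
        f'    obo:IAO_0000115 "{escape(definition)}" ;\n'   # definition
        f'    obo:IAO_0000119 "{escape(source)}" .\n'       # definition source
    )


turtle = PREFIXES + "\n" + order_to_turtle(
    "ENVO_99999999",  # placeholder ID, not a real ENVO term
    "hydrosol",
    "HY",
    "Soils other than Organosols, Podosols and Vertosols in which the greater "
    "part of the profile is saturated for at least 2-3 months in most years.",
    "http://www.publish.csiro.au/book/7428",
)
```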

Logical axioms and templates will be fantastically useful for the Zobler definitions as requested in #995 https://github.com/microbiomedata/nmdc-metadata/blob/master/identify/Zobler_Definitions.txt#L20-L127

These are compositional, consisting of a qualifier (e.g. humic) and a core soil type (e.g. acrisol). I would use dosdp for compositional classes.
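As a rough sketch of what "compositional" means here, the Python snippet below builds a label and a Manchester-style equivalence expression from a qualifier and a core soil type, in the spirit of a design pattern. The function name, the use of 'has quality' as the relation, and the expression format are assumptions for illustration; an actual DOSDP pattern would be a YAML file with its own variables and relations.

```python
def compose_soil_class(qualifier, core_type):
    """Build a label and an equivalence expression for a qualified soil type."""
    label = f"{qualifier} {core_type}"
    # 'has quality' is an assumed relation; a real pattern may use another one.
    equivalent_to = f"'{core_type}' and ('has quality' some '{qualifier}')"
    return {"label": label, "equivalent_to": equivalent_to}


humic_acrisol = compose_soil_class("humic", "acrisol")
```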

kaiiam commented 3 years ago

@cmungall sounds reasonable; I was (and probably still am) overly enthusiastic about ROBOT and DOSDP. Thanks for the link to the post. Reading the next one, I see that you are discouraging the use of OR, NOT, and ONLY statements in OWL. I presume you would extend that logic here? E.g., for the proposed class chromosol, you would advise against turning the phrase "soils other than Hydrosols with ..." into an axiom containing a NOT, i.e., soil and not hydrosol. I guess the rationale for this would be that nature tends not to exist in booleans, even if soil scientists think this to be the case at the moment. Am I understanding you correctly?

@dr-shorthair are we wrong here, are you confident about making such logical statements and would there be a need for them? Would incorporating these terms as subclasses of soil suffice for now?

For future projects I am very interested in employing DOSDPs with equivalence classes to create compositional classes for which we can achieve Rector normalization and maybe "learn more" when reasoning. From my understanding of @cmungall's posts, there can be some danger in overdoing equivalence classes, but as I understand it, equivalence axioms are required to infer novel subsumption hierarchies when reasoning (is this correct?). Is it possible there is some middle ground where we could axiomatize more of our classes in a consistent way so as to enable the reasoner to discover hierarchies instead of asserting them? Perhaps for classes like air which have many asserted 'has part' subclass relations? Or for a class like permafrost which has the OR-heavy subclass axiom: 'composed primarily of' some (rock or soil or sediment)?
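To illustrate the point about equivalence axioms and inferred hierarchies, here is a toy Python sketch. Each class is "defined" by a genus plus a set of (relation, filler) conditions, and a class is inferred to sit under any other defined class whose conditions it fully satisfies. The class names and conditions are made up, and this set comparison is only a caricature of what an OWL reasoner does.

```python
# Each defined class: a genus plus a set of (relation, filler) differentia.
# All class names and conditions below are invented for illustration.
DEFINITIONS = {
    "clay-rich soil": ("soil", frozenset({("has quality", "clay-rich")})),
    "cracking clay-rich soil": (
        "soil",
        frozenset({("has quality", "clay-rich"), ("has quality", "cracking")}),
    ),
}


def inferred_superclasses(name):
    """Return defined classes whose conditions are all satisfied by `name`."""
    genus, differentia = DEFINITIONS[name]
    return sorted(
        other
        for other, (other_genus, other_diff) in DEFINITIONS.items()
        if other != name and other_genus == genus and other_diff <= differentia
    )


parents = inferred_superclasses("cracking clay-rich soil")
```

The subsumption is never asserted anywhere; it falls out of the definitions, which is the behaviour that necessary-and-sufficient (equivalence) definitions make possible and plain subClassOf axioms do not.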

dr-shorthair commented 3 years ago

Sub-classes of Soil is what they are. So I see no harm, and some benefit, in doing the simple thing now. It would achieve my original goal which is to remind the ENVO community of the fact that there are multiple alternatives around, some of which are well described and systematized, though not axiomatized ;-)

So more axiomatization can be something we work on when we are ready (and I'm more experienced).

dr-shorthair commented 2 years ago

Here are the definitions of the orders in the Australian Soil Classification - as revised in 2021: https://www.soilscienceaustralia.org.au/asc/soilkey.htm

Note the introduction of a new Order - RE (Arenosols) - which replaces parts of the CA (Calcarosols), TE (Tenosols) and RU (Rudosols).

I mentioned this over in #333

dr-shorthair commented 1 year ago

Also available as SKOS+ for download at https://vocabs.ardc.edu.au/viewById/629

kaiiam commented 1 year ago

If I get the opportunity to work for USDA next year and they allow me to work on this I'd try to take this on then.

Otherwise, anyone else, including @dr-shorthair: if you want to make a pull request, the ENVO-Robot-template-and-merge-workflow is available for people to work through and prepare terms with.

dr-shorthair commented 1 year ago

IDrange request at https://github.com/EnvironmentOntology/envo/pull/1365
Subset request at https://github.com/EnvironmentOntology/envo/issues/1366
Robot template at https://docs.google.com/spreadsheets/d/1xe_4QWEW8JwxVcz0aRp-dr6UTvjpvJ03llGIN7T5nLU

kaiiam commented 1 year ago

Thanks @dr-shorthair for preparing this very quickly! I hope the template format was helpful. This is quite a large request, so I think it might take some time to review. I think some small changes are needed to fit ENVO's label (lowercase) and definition conventions. I don't have a ton of free time at the moment, but I'll try to get back to this when I can.

Anyone else please feel free to have a look at Simon's Robot template

dr-shorthair commented 1 year ago

The textual definitions are copied directly from the sources cited. I think they are essentially cast in the class/differentia form, though in some cases there are multiple differentia with an OR connector.

Apologies, I overlooked the label-case convention.

dr-shorthair commented 1 year ago

It looks like a large request indeed - >100 classes. However, the structure is simple.

I added a couple of pointers over to the equivalent, or more or less equivalent US orders, but that's all. (I am checking with my local experts to see if there are any more.)

I agree that the definitions are a bit sprawling, but structurally they do follow the sub-class/differentia pattern.

dr-shorthair commented 1 year ago

Ping @kaiiam - just a tickle to see if there is an ETA on this?

kaiiam commented 1 year ago

Thanks @dr-shorthair I'm between jobs at the moment, so on vacation not working. As for the terms, ENVO is pretty strict about definitions and would ask for term contributions to be formatted following our standards, see https://github.com/EnvironmentOntology/envo/wiki/Creating-good-definitions. If you can or you have anyone who can help format them in the robot template please do so. If not I can try to get to it later.

Hope that helps.

dr-shorthair commented 1 year ago

Most of the definitions do satisfy the rules, but are a bit wordy. I think I can extract and tweak the first sentence to get there.

I moved all the original defs to the comment column and made a start. Is that what is needed?

kaiiam commented 1 year ago

The original definitions are very thorough; it's more a formatting thing. We prefer definitions to be very strictly in the form "A is a B which C's". Moving the original definitions to comments is great, and we can extract a more concise genus-differentia form. Like you say, I think most are close to that, so we can get there pretty easily. For many, we can simply swap out "that" for "which".

Also, the label convention in ENVO is lowercase, e.g., arenosol (instead of Arenosol), unless it's a proper noun like Taylor column. We also might need to zero-pad the IDs, e.g. change ENVO:5001000 to ENVO:05001000 or similar, to follow ENVO's 8-digit convention (within the appropriate range).
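Both fixes are mechanical, so they could be scripted over the template. A minimal Python sketch, assuming IDs are CURIEs of the form `ENVO:<digits>` and that proper nouns are supplied by hand (the function names are made up for this example):

```python
import re


def pad_envo_id(curie, width=8):
    """Zero-pad the numeric part of an ENVO CURIE to `width` digits."""
    match = re.fullmatch(r"ENVO:(\d+)", curie)
    if match is None:
        raise ValueError(f"not an ENVO CURIE: {curie!r}")
    digits = match.group(1)
    if len(digits) > width:
        raise ValueError(f"too many digits in {curie!r}")
    return f"ENVO:{digits.zfill(width)}"


def normalize_label(label, proper_nouns=()):
    """Lowercase a label, leaving any listed proper-noun words untouched."""
    keep = set(proper_nouns)
    return " ".join(w if w in keep else w.lower() for w in label.split())


padded = pad_envo_id("ENVO:5001000")
lowered = normalize_label("Arenosol")
kept = normalize_label("Taylor column", proper_nouns=["Taylor"])
```

Padding changes the identifier string, so it still needs a check that the padded value falls inside the assigned ID range.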

kaiiam commented 1 year ago

I've fixed the labels to be lower case.

cmungall commented 1 year ago

On another issue (#1375) @dr-shorthair asked about definitions by exclusion. I think these are problematic but if this is what the authoritative source says then we try and fit them to genus-differentia form, e.g. "A G which is not a X1, X2, X3, and which Y1, ...". I don't think this should be a blocker.
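A small Python sketch of that "A G which is not a X1, X2, X3, and which Y1, ..." shape, in case it helps when reworking the template. The function names and the example wording are illustrative assumptions, not proposed ENVO definitions.

```python
def article(word):
    """Pick 'a' or 'an' from the first letter (a rough heuristic)."""
    return "an" if word[0].lower() in "aeiou" else "a"


def exclusion_definition(genus, excluded, differentia):
    """Format a genus-differentia definition that also lists excluded classes."""
    text = f"{article(genus).capitalize()} {genus}"
    if excluded:
        # "x", "x or y", "x, y or z"
        names = " or ".join(
            part for part in (", ".join(excluded[:-1]), excluded[-1]) if part
        )
        text += f" which is not {article(excluded[0])} {names}"
    if differentia:
        joiner = ", and which " if excluded else " which "
        text += joiner + " and ".join(differentia)
    return text + "."


example = exclusion_definition(
    "soil",
    ["hydrosol"],
    ["has a clear or abrupt textural B horizon"],
)
```

This only shapes the text definition; it deliberately does not generate a logical NOT axiom.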

cmungall commented 1 year ago

I know this issue has been open a long time, but I feel we should have asked some more questions from the outset.

E.g., there is a very large USDA soil taxonomy guide https://www.nrcs.usda.gov/Internet/FSE_DOCUMENTS/nrcs142p2_051232.pdf

This also includes a lot of concepts for things like soil materials, horizons, that would inform the soil definitions themselves.

dr-shorthair commented 1 year ago

> to what extent does the Australian system differ from other systems?

Some of the 'orders' are the same, but most are somewhat different. This is partly a result of culture and history, but is also because a different geological and environmental setting actually means that the soils are different, or at least that we need to have a more refined classification for some parts of the spectrum than what is needed in other geographic contexts.

Note that the Australian Soil Classification was revised and reissued in 2021, so this is not a stale classification.

> Is the goal here to ontologize the Australian system or (as is more conventional in ontologies) to produce an abstracted representation with concepts that can subsume system-specific concepts?

My original goal was to supplement the presentation of the ~US~ WRB system in ENVO (but see https://github.com/EnvironmentOntology/envo/issues/825#issuecomment-1286454861). With only ~US~ WRB soil orders included in ENVO, there is an implication that it is used universally, which it isn't. ~Scientific imperialism ;-)~

The USDA definitions can certainly be used to formalize the differentia, but I don't think we are there yet.

dr-shorthair commented 1 year ago

On further investigation, it looks like the ENVO soil classification largely follows the 2015 IUSS WRB system which was primarily European. So my claim about Scientific imperialism was inaccurate.

The US and ANZ systems both have a different set of top-level 'orders', and there are a number of others - https://www.fao.org/soils-portal/data-hub/soil-classification/national-systems/en/ .

wdduncan commented 1 year ago

There is a lot of info in this thread. I'm trying to get a handle on whether this is a difference in labeling (i.e., lexicon) or something deeper.

For example, in the dental domain (for humans), 'wisdom tooth' and 'third molar' refer to the same type of tooth; 'Tooth 1' in the ADA tooth numbering system and 'Tooth 18' in the FDI numbering system refer to the same type of tooth. Which label you use for a tooth depends on the context in which you are operating.

Is something like this going on with soil classification?

cmungall commented 1 year ago

@wdduncan you are right there is a lot going on. I'm trying to tease things out into more actionable issues:

dr-shorthair commented 1 year ago

There is a mapping ('approximate correlation') between the ASC and some other classifications including the WRB here: https://www.soilscienceaustralia.org.au/asc/append5.htm

In an email to me, the editor of ASC Noel Schoknecht notes the following:

There is not a 1:1 relationship between WRB or Soil Taxonomy to the ASC. WRB is closest, and appendix 5 of the ASC (ASC - Appendix 5, soilscienceaustralia.org.au) does attempt to correlate across classifications at the Order level. Note that this needs to be updated as a new WRB has been recently published.

Also although the name may be the same (e.g. Arenosols in ASC and WRB) and the concept is similar, the definition/key diagnostic criteria may be different.

Both Ben Harms and I have tested the ASC on overseas trips, and we think it is easier to use than the other systems, although missing some key Soil Orders such as frozen soils.

Another colleague Gerard Grealish advises

• If there is mapping, the relationship is definitely not 1 to 1 and likely to be where there is most overlap
• I’d be careful when comparing an order from another classification system with the Australian one – even if they use the same word, like anthrosol, calcisol, the definitions used could likely be different – i.e. same word but means something slightly different.

So it may be that the approach suggested by @cmungall in #1377 must be used with extreme caution.