EnvironmentOntology / envo

A community-driven ontology for the representation of environments
http://www.environmentontology.org
Creative Commons Zero v1.0 Universal

Proposed ENVO simple layer #1416

Open cmungall opened 1 year ago

cmungall commented 1 year ago

As part of the NEON/NMDC collaboration we have been mapping NEON terms to ENVO. The team needed a concept "deciduous forest", and the closest they could find was "area of deciduous forest", so this was used.

However, this leaves us in an odd position. If we look at the placement of "area of deciduous forest" in comparison to other more specific "deciduous forest" terms:

image

We see that they are in completely separate branches. We would like all terms we use to be in a single is-a/part-of hierarchy, such that we can do standard ontology operations such as provide faceted browsing, roll up terms for analyses, etc.

The two branches here also surface a number of other unusual decisions.

In tropical, we have "tropical {...} broadleaf forest IS_A tropical broadleaf forest":

image

But this is inconsistent with "temperate {...} broadleaf forest PART_OF temperate broadleaf forest":

image

It would be useful if we could somehow tease apart the core concepts in ENVO independent of whether something is considered an area, an ecosystem, a biome, or an astronomical body part. For annotating sample sources and many other applications, users would appreciate something that looks more like anatomy ontologies, GO, etc., with less duplication of concepts, where the core concepts are in one hierarchy and follow standard classification patterns.

In this classification we would have a general concept like "forest", and then we would have a fairly consistent lattice of sub-types (no part-ofs) for the various ways a forest can be classified: tropical vs temperate, needle vs broad leaf, deciduous vs non-deciduous...

We would reserve parthood for things that are clearly composition: a forest is made of trees, a canopy is part of a forest, the forest is part of the terrestrial composition of the earth, etc.

Then for groups that need to distinguish area vs ecosystem vs biome vs feature, these could be added as separate branches; however, the simple core could still be extracted and it would form a coherent consistent core.
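To make the shape of that simple core concrete, here is a minimal Python sketch. The edges below are hypothetical and only illustrate the idea; they are not actual ENVO axioms. Sub-types form a lattice through is-a links (a term may have several parents), and part-of is kept in a separate relation reserved for composition.

```python
# Hypothetical toy edges for illustration only; not actual ENVO axioms.
IS_A = {
    "tropical forest": ["forest"],
    "temperate forest": ["forest"],
    "broadleaf forest": ["forest"],
    "deciduous forest": ["forest"],
    # A lattice: one term can sit under several classification axes.
    "temperate broadleaf forest": ["temperate forest", "broadleaf forest"],
    "temperate deciduous broadleaf forest": [
        "temperate broadleaf forest",
        "deciduous forest",
    ],
}

# Parthood is kept separate and used only for clear composition.
PART_OF = {
    "canopy": ["forest"],
}


def ancestors(term, edges):
    """Return every term reachable from `term` by following `edges` upward."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for parent in edges.get(current, []):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


supers = ancestors("temperate deciduous broadleaf forest", IS_A)
canopy_supers = ancestors("canopy", IS_A)   # empty: a canopy is not a kind of forest
canopy_wholes = ancestors("canopy", PART_OF)
```

In this sketch every specific forest term reaches "forest" through is-a alone, while "canopy" relates to "forest" only through part-of.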

wdduncan commented 1 year ago

I notice that ENVO has a forest ecosystem term. Should deciduous forest be a subclass of that?

On a more general note, I agree with you that we need design patterns that avoid defining things as immaterial entity.

cmungall commented 1 year ago

@wdduncan thanks for your comments!

> I notice that ENVO has a forest ecosystem term. Should deciduous forest be a subclass of that?

The essence of my proposal is that we would like to see a class hierarchy that is organized something like this

We are neutral as to whether these are conceived of as ecosystems, biomes, whatever. This would also work for us

Note that for brevity I am only showing one path here but this could potentially be a lattice.

The main point of my proposal is that the concepts annotators need should be in one hierarchy, rather than the concept of "deciduous forest" being in one hierarchy, "temperate deciduous needleaf forest ecosystem" being in another, etc.

> On a more general note, I agree with you that we need design patterns that avoid defining things as immaterial entity.

To be clear, I am not proposing that we avoid adding immaterial entities. I would prefer to see the use cases for these clearly articulated, but I trust these use cases exist.

We just need to capture the base concepts in a single hierarchy.

Right now, an annotator who is looking to annotate a "deciduous forest" concept is likely to pick "area of deciduous forest" (this has indeed happened for NEON annotation). This means that our annotations are a pick and mix of different branches, and roll-up queries don't work as expected - e.g. annotations to "deciduous forest" do not roll up to "forest ecosystem", and annotations to "temperate deciduous needleaf forest ecosystem" do not roll up to "deciduous forest".
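The roll-up problem can be shown with a small Python sketch. The parent links are hypothetical placeholders (in particular "vegetated area" and the single-hierarchy layout are assumptions for illustration, not checked against ENVO); the point is only how counts behave when the terms annotators pick live in separate branches versus one hierarchy.

```python
from collections import Counter

# Hypothetical parent links; placeholders for illustration, not checked against ENVO.
SEPARATE_BRANCHES = {
    "area of deciduous forest": ["vegetated area"],
    "temperate deciduous needleaf forest ecosystem": ["forest ecosystem"],
    "forest ecosystem": ["ecosystem"],
}
SINGLE_HIERARCHY = {
    "deciduous forest": ["forest"],
    "temperate deciduous needleaf forest": ["deciduous forest"],
}


def roll_up(annotations, parents):
    """Count each annotation at its own term and once at every ancestor."""
    totals = Counter()
    for term in annotations:
        closure = {term}
        stack = [term]
        while stack:
            for parent in parents.get(stack.pop(), []):
                if parent not in closure:
                    closure.add(parent)
                    stack.append(parent)
        totals.update(closure)
    return totals


split = roll_up(
    ["area of deciduous forest", "temperate deciduous needleaf forest ecosystem"],
    SEPARATE_BRANCHES,
)
merged = roll_up(
    ["deciduous forest", "temperate deciduous needleaf forest"],
    SINGLE_HIERARCHY,
)
```

With the separate branches, only one of the two annotations reaches "forest ecosystem" and no shared forest term collects both. With the single hierarchy, both annotations roll up to "deciduous forest" and to "forest".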

wdduncan commented 1 year ago

@cmungall I agree with your reasoning about the advantages of a single hierarchy.

> To be clear, I am not proposing that we avoid adding immaterial entities. I would prefer to see the use cases for these clearly articulated, but I trust these use cases exist.

I didn't think you were. This is my opining ;)

Perhaps my use of 'avoid' sounded too strong. Here is all I meant.

It seems to me that immaterial entities (and ICEs) generate a number of these shadow issues. Consider (for example) how we would represent the northern hemisphere or Eastern Europe. Such entities invite the use of immaterial entity (especially amongst the more philosophically/BFO minded developers ... I think I may still be included in this group). This, in turn, increases the chance of creating shadow hierarchies.

It would be better (again my opinion) if the default stance were to classify things as material entities, and to use immaterial entity only when a clear and motivating case can be made.

kaiiam commented 1 year ago

After a long querying and manual filtering exercise over terms from the environmental system, environmental zone, layer, and astronomical body part hierarchies, which narrowed them down to around 1164 terms in ~100 categories, I found the following. Shown below are the 28 most problematic concepts, which are only or mostly found in some combination of biome, layer, environmental zone, ecosystem (or others), but not in the astronomical body part hierarchy, which is the closest current hierarchy to the single branch with the bulk of terms. I think cleaning up all terms associated with these concepts would help with the vision of a single hierarchy for curators.


ENVO Parent | Category
-- | --
ecosystem | bog
layer | canopy
geographic feature and ABP | cut
ecosystem | ecotone
Various | estuarine
ecosystem | farm
layer | floor
Various | forest
layer | front
Various | grassland
Various | marine
ecosystem | marsh
ecosystem | meadow
ecosystem | mire
various geographic feature, biome and layer | neritic
environmental zone | oasis
various ecosystem and ABP | palsa
Various: biome and layer | pelagic
ecosystem | plantation
environmental zone | plate
environmental zone | scrubland
biome | shrubland
Various | swamp
Various | tidal/tide
Various | tundra
Various: layer, environmental zone | vegetation
Various | wetland
Various | woodland

cmungall commented 1 year ago

OAK report here:

envo-ldef-rpt.tsv.txt

I have excerpted a subset of it here:

id label environment ecosystem biome area num_concepts
ENVO:01001357 desert ENVO:01001780 *ENVO:01000179/GEN +ENVO:00000097/DF 4
None freshwater ENVO:01000306 ENVO:01001789 ENVO:00000873 3
None aquatic ENVO:01000317 ENVO:01001787 ENVO:00002030 3
ENVO:01000206 temperate ENVO:01001705 ENVO:01001831 3
None marine ENVO:01000320 ENVO:01001788 ENVO:00000447 3
ENVO:01000205 subtropical ENVO:01001702 ENVO:01001832 3
ENVO:01000204 tropical ENVO:01001701 ENVO:01001830 3
ENVO:01000238 polar ENVO:01001703 ENVO:01000339 3
ENVO:01000251 subpolar ENVO:01001704 ENVO:01001834 3
None alpine tundra ENVO:01001371 ENVO:01001505 ENVO:03400001 3
None grassland ENVO:01001206 ENVO:01000177 ENVO:00000106 3
None cropland ENVO:01001244 ENVO:01000245 ENVO:01000892 3
None woodland ENVO:01001245 ENVO:01000175 ENVO:00000109 3
None tundra ENVO:01001370 ENVO:01000180 ENVO:00000112 3
ENVO:01000431 mixed forest *ENVO:01000198/GEN ENVO:01000855 3
ENVO:00002010 saline water ENVO:01000307 2
ENVO:00002149 sea water ENVO:01000321 2
ENVO:00002019 brackish water ENVO:01000322 2
ENVO:00005791 sterile water ENVO:01001042 2
ENVO:00002012 hypersaline water ENVO:01001043 2
ENVO:00001998 soil ENVO:01001044 2
ENVO:00002007 sediment ENVO:01001048 2
ENVO:00010505 aerosol ENVO:01001052 2
UBERON:0015474 axilla skin ENVO:08000001 2
PATO:0001429 acidic +ENVO:01000315/DF 2
PATO:0001430 alkaline ENVO:01000316 2
UBERON:0002416 integumental system ENVO:2100004 2
UBERON:0000022 feather ENVO:2100006 2
ENVO:00005801 rhizosphere ENVO:01000999 2
UBERON:0001555 digestive tract ENVO:01001033 2
UBERON:0001474 bone element ENVO:01001306 2
UBERON:0001062 anatomical entity ENVO:2100000 2
UBERON:0000160 intestine ENVO:2100002 2
ENVO:01000112 polymetallic nodule ENVO:01001629 2
None swamp ENVO:00000233 ENVO:01001208 2
None wetland ENVO:01001209 ENVO:00000043 2
None forest ENVO:01001243 ENVO:01000174 2
None polar tundra ENVO:01001625 ENVO:03400002 2
None terrestrial ENVO:01001790 ENVO:00000446 2
ENVO:01000143 marine reef ENVO:01000029 2
ENVO:01000122 marine hydrothermal vent +ENVO:01000030/DF 2
ENVO:00000015 ocean ENVO:01000048 2
ENVO:01000150 marine subtidal rocky reef +ENVO:01000050/DF 2
ENVO:01000161 marine sponge reef +ENVO:01000123/DF 2
ENVO:01001802 subtropical moist broadleaf forest *ENVO:01000226/GEN 2
ENVO:00000021 freshwater lake ENVO:01000252 2
ENVO:01000297 freshwater river ENVO:01000253 2
None gramanoid or herbaceous vegetation ENVO:01000888 1
None lichen-dominated vegetation ENVO:01000889 1
None moss-dominated vegetation ENVO:01000890 1
None pastureland or hayfields ENVO:01000891 1
None woody wetland ENVO:01000893 1
None emergent herbaceous wetland ENVO:01000894 1
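For readers who want to experiment with this kind of grouping, here is a rough Python sketch of the general idea: strip a "kind" marker from each label and group by the remaining core concept. This is not how the OAK report was produced. The label patterns and the helper names `split_label` and `group_by_core` are assumptions for illustration, and the input labels are just example strings following the naming patterns seen in this thread.

```python
import re
from collections import defaultdict

# Assumed naming patterns; real ENVO labels do not always follow them.
PATTERNS = [
    (re.compile(r"^area of (?P<core>.+)$"), "area"),
    (re.compile(r"^(?P<core>.+) ecosystem$"), "ecosystem"),
    (re.compile(r"^(?P<core>.+) biome$"), "biome"),
    (re.compile(r"^(?P<core>.+) environment$"), "environment"),
]


def split_label(label):
    """Return (core concept, kind) for a label; kind is 'core' if no pattern matches."""
    for pattern, kind in PATTERNS:
        match = pattern.match(label)
        if match:
            return match.group("core"), kind
    return label, "core"


def group_by_core(labels):
    """Group labels by core concept, keyed by the kind of term each label names."""
    groups = defaultdict(dict)
    for label in labels:
        core, kind = split_label(label)
        groups[core][kind] = label
    return dict(groups)


# Example strings only, following the naming patterns discussed in this thread.
labels = [
    "forest ecosystem",
    "forest biome",
    "area of deciduous forest",
    "deciduous forest",
]
groups = group_by_core(labels)
num_concepts = {core: len(kinds) for core, kinds in groups.items()}
```

A core concept that appears under several kinds, like "forest" here, is a candidate for consolidation into the single hierarchy.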