Open tucotuco opened 10 years ago
See also Issue #37, Issue #39, and Issue #40.
Opened public discussion on tdwg-content (http://lists.tdwg.org/pipermail/tdwg-content/2015-March/003507.html).
Points of entry for previous discussions on tdwg-content:
http://lists.tdwg.org/pipermail/tdwg-content/2013-September/003066.html
http://lists.tdwg.org/pipermail/tdwg-content/2015-March/003507.html
http://lists.tdwg.org/pipermail/tdwg-content/2015-April/003532.html
http://lists.tdwg.org/pipermail/tdwg-content/2015-May/003536.html
There was a public review of this and related proposals in 2015 in which there were observations that the proposal as presented does not make sense. The ENVO classes cannot be Darwin Core properties. Instead, new properties would have to be minted for Darwin Core with the recommendation to have the range of values come from ENVO classes. In any case, there is no evidence in the discussion history of demand for these terms. If anyone wants to move this proposal forward, please provide a new term definition addressing the property/class issue and provide evidence of sufficient demand.
I was in error to note that there was a need for a demonstration of demand. The proposal was a direct result of an international workshop. Also, the revised term proposal has already been proposed. With an updated comment showing just the proposal this proposal will be ready for public review.
Note that a biome term, in which a Location could be distinguished as terrestrial, marine (http://purl.obolibrary.org/obo/ENVO_00000447), or freshwater (or any of their subclasses), would be extremely useful for georeferencing (see Georeferencing Best Practices). It is common to find Location descriptions that are ambiguous in terms of the biome, and the georeference would be quite different depending on the biome. It would be a great benefit for the Biodiversity Enhanced Location Services (see https://www.idigbio.org/content/darwin-core-hour-2-bbqs-imagining-global-gazetteer-georeferences) to have biome as a standard term, a property of an Event.
Updated term change request:
Proposed attributes of the new term:
ENVO:terrestrial biome, ENVO:freshwater biome, ENVO:marine biome
Also useful for interoperability with MIxS. See https://github.com/EnvironmentOntology/envo/wiki/Using-ENVO-with-MIxS.
Hi all - we're revisiting this in the GBWG Task Group on mapping MIxS to DwC.
I think the time is right to figure out how to support use of ENVO (or other environmental ontologies) in corresponding DwC fields.
Our ENVO annotation guidance for MIxS may be of interest. There, we relaxed the former biome field to "env_broad_scale", mostly because of the lack of correct usage of, and confusion around, the biome concept itself in the MIxS user base.
The annotation guidance there could offer a solution for the DwC case too. As noted here:
Instead, new properties would have to be minted for Darwin Core with the recommendation to have the range of values come from ENVO classes.
A set of new properties for environmental contextualisation would be a valuable addition / refinement of DwC.
This will hopefully help unpack and avoid semantic ambiguity with fields like "habitat" (often not the same - and technically never the same - as the environment of an observation/sample) (xref https://github.com/gbif/vocabulary/issues/75) and has some bearing on #39 and #40.
@tucotuco how can I help to move this along?
Given what I've read/understand - I think the best way to go is to add two fields that roughly correspond to the MIxS broad and local scale environment fields (medium/material is a bit microbially focused, so may be better off in the MIxS extension to DwC).
The broad scale term would accept classes corresponding to biomes, ecoregions, ecozones, and the like. The local scale term would be more about what was immediately around the observation/sample (a village, moraine, hot spring, etc.).
@pbuttigieg From a pragmatic perspective it would be good to get the present proposal through the review process. All of the justification requirements (demand, efficacy, and stability) have already been met. To add another term may put a wrench in that. So my question for you is, "Given the proposal as it stands, would it create any obstacles (require further changes to the definition) in the future with the addition of a complementary term, should the demand for it be manifested?"
@tucotuco many thanks
From a pragmatic perspective it would be good to get the present proposal through the review process. All of the justification requirements (demand, efficacy, and stability) have already been met. To add another term may put a wrench in that.
Sure. In that case, we'll stick with biome and then start a new request for other environmental qualifiers.
So my question for you is, "Given the proposal as it stands, would it create any obstacles (require further changes to the definition) in the future with the addition of a complementary term, should the demand for it be manifested?"
No, not if we tighten up the definition and stay away from colloquial / obscure usage of the term. This is tricky, as the term usage has become quite dilute over time. IMO, we should thus emphasise the requirement for climax communities in biomes. We'll echo this on the ENVO side.
So some rephrasing:
Definition of the term: An ecosystem in which the dominant ecological communities have reached their final successional states, forming stable climax communities.
Usage comments (recommendations regarding content, etc.): Biomes are typically identified by their patterns of ecological succession and climax communities. They are defined by communities of plants, animals, and other organisms which have - by and large - stabilized (i.e. reached a climax successional state) with respect to their prevailing environmental conditions and climate. Unlike ecozones, biomes are not defined by genetic, taxonomic, or historical similarities. Recommended best practice is to use a controlled vocabulary such as the set of subclasses of the biome class (http://purl.obolibrary.org/obo/ENVO_00000428) of the Environment Ontology (ENVO).
The above gives us something that won't overlap (too much) with other environmental fields (ecozones, ecoregions, realms, habitats, etc.).
Excellent. Thank you for that. I have updated the change proposal. Issue #40 could benefit from similar scrutiny?
This new term should have a dwciri: analog. I believe that the following modifications of the proposed metadata values would be appropriate for dwciri:biome:
http://purl.obolibrary.org/obo/ENVO_01000228
After writing a long comment in the proposal for dwc:environmentalMaterial, I came back to this term, since it also suggests using an OBO ontology for controlled values. When I first read through this proposal, the suggested form of values did not strike me as problematic in the way that the suggested values for dwc:environmentalMaterial did because the example tropical moist broadleaf forest biome [ENVO:01000228] includes a legitimate label and compact URI (ENVO:01000228).
However, many of the concerns that I raised in that other comment apply here:
envo: or ENVO:?
tropical moist broadleaf forest biome changes, or somebody puts in too many spaces?
I understand that many of us are wizards at parsing and cleaning strings for the purpose of figuring out what people intend for their values. But honestly, we could actually just create a real controlled vocabulary with explicitly specified controlled value strings and not have to do that work. In my other comment, I suggested a mechanism for how to do that and still link to the OBO ontology for the purpose of defining the terms.
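To make the string-cleaning burden concrete, here is a minimal sketch of the kind of parsing a consumer would need for values of the form "label [CURIE]". The function name and regular expression are illustrative assumptions, not part of any Darwin Core tooling, and the expansion rule simply mirrors the ENVO:01000228 / http://purl.obolibrary.org/obo/ENVO_01000228 pair shown in this thread.

```python
import re

# Assumption: compact URIs expand the way the example in this thread does,
# i.e. ENVO:01000228 -> http://purl.obolibrary.org/obo/ENVO_01000228
OBO_BASE = "http://purl.obolibrary.org/obo/"

# A label followed by a bracketed compact URI. Prefix case and stray
# whitespace are tolerated because people will type both.
VALUE_PATTERN = re.compile(
    r"^\s*(?P<label>.*?)\s*\[\s*(?P<prefix>[A-Za-z]+)\s*:\s*(?P<local>\d+)\s*\]\s*$"
)


def parse_biome_value(raw):
    """Return (normalized label, IRI), or None if the string does not match."""
    match = VALUE_PATTERN.match(raw)
    if match is None:
        return None
    # Collapse runs of whitespace inside the label.
    label = " ".join(match.group("label").split())
    # Normalize "envo" / "Envo" / "ENVO" to one spelling.
    prefix = match.group("prefix").upper()
    iri = f"{OBO_BASE}{prefix}_{match.group('local')}"
    return label, iri


clean = parse_biome_value("tropical moist broadleaf forest biome [ENVO:01000228]")
messy = parse_biome_value("tropical  moist broadleaf forest biome  [envo: 01000228]")
```

Both calls resolve to the same label and IRI, but only because the cleaning rules were guessed in advance. A label that changes upstream would still pass through unnoticed, which is the argument for publishing explicit controlled value strings instead.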
We actually know how to construct controlled vocabularies with controlled value strings and multilingual labels now. The real challenge is deciding what the values should be and how they should be defined. But if we already know that we want to use any ENVO term that is a subclass of the biome class as the concepts for the vocabulary, then no "deciding" is necessary. It is just a technical task to set up the controlled vocabulary and let people know what string they should use for each term. The controlled vocabulary would not even have to be officially adopted as part of Darwin Core - it could just be provided for reference and convenience and be added to as ENVO evolves.
I like this solution. Can we defer the creation of the controlled vocabulary for now? Or is it better to defer the review of this term (and environmentalMaterial) and do what you are suggesting under a Task Group? That seems like a more solid way forward, especially since it would be trailblazing and therefore a good model for similar future endeavors.
@tucotuco In this case, I don't think a task group is necessary. I invested about a half hour in writing a bit of Python code and have generated the necessary controlled values. Code here and table of controlled values here. The code can be reused if we decide to take this approach for any other controlled vocabularies.
So here is how I would amend this proposal to make it viable. It basically describes in words what I did in the Python script.
The Darwin Core maintenance group will construct a controlled vocabulary to be used with this term in the following manner:
dwciri:biome.
dwc:biome will be constructed from the labels of subclasses by the process described below.
Because the ENVO terms and their labels are not controlled by a standards process and may change over time, to maintain stability of values it is recommended that implementers refer to the TDWG-generated values rather than constructing them directly from the ENVO ontology. As subclasses are added to the ontology, the Darwin Core Maintenance Group may add additional values, but the controlled value strings will not change if the labels in ENVO change. If subclasses are removed from the ontology, the Maintenance Group may at its discretion deprecate controlled values after considering implications for stability.
The process for generating the controlled value strings is as follows:
If this process were followed, I don't see any problem with just moving forward on this proposal and letting it stand or fall on its own merits since any issues that I had related to the recommended controlled values should be solved with the process I just outlined.
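The stability rule described above (existing value strings never change, new subclasses add values, removed subclasses only become candidates for deprecation) could be sketched roughly as follows. This is not the script linked earlier in the thread. The function names, the camelCase rule used to turn a label into a value string, the changed label, and the example.org IRI are all hypothetical stand-ins chosen for illustration.

```python
def camel_case(label):
    """Hypothetical label-to-value rule; the real generation process may differ."""
    words = label.split()
    if not words:
        return ""
    return words[0].lower() + "".join(w.capitalize() for w in words[1:])


def update_vocabulary(vocabulary, envo_labels, make_value=camel_case):
    """Merge a fresh ENVO snapshot into an existing controlled vocabulary.

    vocabulary:  dict of IRI -> controlled value string already published
    envo_labels: dict of IRI -> current ENVO label for each biome subclass
    Returns (updated vocabulary, IRIs that are candidates for deprecation).
    """
    updated = dict(vocabulary)
    for iri, label in envo_labels.items():
        # Only IRIs not seen before get a value; published strings are frozen
        # even if the ENVO label has changed since they were generated.
        if iri not in updated:
            updated[iri] = make_value(label)
    # Removed subclasses are reported, not deleted: deprecation is left to
    # the Maintenance Group's discretion.
    candidates = sorted(iri for iri in vocabulary if iri not in envo_labels)
    return updated, candidates


published = {
    "http://purl.obolibrary.org/obo/ENVO_01000228": "tropicalMoistBroadleafForestBiome",
    "http://example.org/removedBiome": "removedBiome",  # hypothetical entry
}
snapshot = {
    # Hypothetical relabelling upstream; the published string must not follow it.
    "http://purl.obolibrary.org/obo/ENVO_01000228": "tropical moist broadleaf forest",
    "http://purl.obolibrary.org/obo/ENVO_01000195": "flooded grassland biome",
}
updated, candidates = update_vocabulary(published, snapshot)
```

Keying the vocabulary on the IRI rather than the label is what lets the published strings stay fixed while ENVO labels drift.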
One question we should consider is whether it is desirable or necessary to just make this controlled vocabulary officially part of Darwin Core. I think that in the past my opinion about this has been that if we think the CV is likely to be stable enough that managing changes to it won't be particularly odious for the Maintenance Group, we should just go ahead and make it officially part of the DwC standard. In this particular case, I think it's probably stable enough that we could do that if we wanted to.
However, there is one complication that we have not dealt with yet in any of the existing CVs. If we recommend that the ENVO IRIs be used to denote the controlled values rather than minting our own (as I think we should), then following the Standards Documentation Specification for machine readable metadata will probably result in statements like this:
<http://purl.obolibrary.org/obo/ENVO_01001230> rdf:type skos:Concept.
But the definition in OBO will already assert something like this:
<http://purl.obolibrary.org/obo/ENVO_01001230> rdf:type owl:Class.
In other words, we would be asserting that the ENVO terms were both Concepts and Classes. There isn't anything technically wrong with doing that, but it has certain implications on reasoning if our assertions are combined with the ontology itself. See Section 5.2 of the SKOS Primer for details. But I'm not sure that anybody would ever actually do that kind of reasoning anyway, so maybe it's irrelevant. But it should probably be considered on a policy level before actually adding the CV as part of a TDWG standard.
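As a rough illustration of the overlap, the two assertions can be modelled as plain tuples and merged. This sketch deliberately avoids any RDF library; the variable and function names are assumptions made for the example, and the prefixed names are kept as plain strings rather than expanded IRIs.

```python
RDF_TYPE = "rdf:type"
TERM = "http://purl.obolibrary.org/obo/ENVO_01001230"

# What the TDWG machine-readable metadata would assert.
tdwg_statements = {(TERM, RDF_TYPE, "skos:Concept")}
# What the ENVO ontology already asserts.
envo_statements = {(TERM, RDF_TYPE, "owl:Class")}


def dual_typed(triples):
    """Return subjects typed as both skos:Concept and owl:Class."""
    types_by_subject = {}
    for subject, predicate, obj in triples:
        if predicate == RDF_TYPE:
            types_by_subject.setdefault(subject, set()).add(obj)
    return sorted(
        subject
        for subject, types in types_by_subject.items()
        if {"skos:Concept", "owl:Class"} <= types
    )


# The dual typing only appears once both sources are loaded together.
merged = tdwg_statements | envo_statements
overlap = dual_typed(merged)
```

Neither source is affected on its own; the question only arises for someone who combines the TDWG metadata with the ontology, which is why it reads as a policy decision rather than a technical blocker.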
This proposal has been labeled as 'Controversial' and in need of a task group for resolution. It is no longer part of an active public review.
If one was to charter a task group for this term, then which interest group would it be under?
It should be the Darwin Core Maintenance Group if it is an addition to DwC
There is some question about whether a Task Group is needed (see https://github.com/tdwg/dwc/issues/38#issuecomment-829208723). What would the Task Group do aside from propose the term or terms plus controlled vocabulary to be added? Since this issue was last discussed we have clear ways forward with controlled vocabulary creation and maintenance, which was a big part of the issue with this term when proposed. If nothing more is really needed, then I support @baskaufs's position that a Task Group is unnecessary.
Note that the ENVO Biome subclasses do not match the IUCN typology, which has 25 biomes across 4 core realms and 6 transitional realms.
And some of the ENVO Biomes appear to correspond with IUCN Realm.
I.e. the ENVO hierarchy does not match the IUCN typology.
Thank you for catching this @dr-shorthair - in that case, I think this mismatch should be handled by a task group. I wonder if it makes sense to propose an interest group (Environment?) with individual task groups, since there is a hierarchical structure of realms - biomes - ecosystems - habitats that should be addressed as separate terms (not necessarily all at once, but by individual task groups).
Alternatively, could both ENVO and IUCN sources be accepted as sources? The main issue I see with the IUCN typology is that it does not follow a vocabulary structure and does not have IRIs for the concepts. However, the upcoming guide for publishing freshwater data to GBIF relies on the IUCN typology while the DNA-derived data publishing guide relies on ENVO for controlled values (to align with MIxS).
could both ENVO and IUCN sources be accepted as sources?
I don't think ENVO is considered a 'source'. The usual approach is to adopt then adapt a system that has been proposed by subject matter experts working under the auspices of some 'authoritative' body, preferably with international standing. I think that would be IUCN for biomes etc. Or maybe I misunderstand how ENVO design and governance works?
The IUCN Global Ecosystem Typology is not that old, so I think what you see in ENVO was created in the absence of an authoritative typology. But now there is one, so ENVO should attempt to assimilate it, and formalize it using the OBO rules. FWIW the URLs for the web pages in IUCN GET are intended to be persistent, so they can be used as the 'source' for terms as they are brought into ENVO.
You say that GET "does not follow a vocabulary structure" - I don't really understand what you mean by that. It has a hierarchy, the Biomes specialize the Realms, and the Functional Groups specialize the Biomes. There are textual definitions and there is a global dataset that is classified using it. No, it doesn't use the strict OBO genus-differentia logic in a formal way, but there lies the opportunity.
ENVO can then feedback to IUCN if necessary. I am liaising with David Keith already - who is the primary custodian - so I may be able to help with this.
FWIW the URLs for the web pages in IUCN GET are designed to be persistent, so they can be treated as IRIs to serve as the 'source' for terms as they are brought into ENVO.
FWIW the URLs for the web pages in IUCN GET are intended to be persistent, so they can be used as the 'source' for terms as they are brought into ENVO.
Sorry, what I meant was that I was not sure there were persistent URLs for the concepts, and the whole 'typology' structure is a bit atypical for a controlled vocabulary, but I do not see why it couldn't be adapted, as you propose.
I will draft a task group charter and submit it to the Darwin Core Maintenance Group. Please let me know if you want to be involved in the draft @tucotuco @dr-shorthair and others. Then we can continue the discussion when the formalities are in place.
In short, I will propose that two new terms are defined by the task group: realm and biome with controlled vocabularies for both.
I will draft a task group charter and submit it to the Darwin Core Maintenance Group. Please let me know if you want to be involved in the draft @tucotuco @dr-shorthair and others. Then we can continue the discussion when the formalities are in place.
As the convener I will have to review the charter and submit it to the Executive Committee anyway, so I am happy to be involved in the drafting.
New term
Proposed new attributes of the term:
tropical moist broadleaf forest biome [ENVO:01000228]
Was https://code.google.com/p/darwincore/issues/detail?id=189
Reported by gtuco.btuco, Sep 25, 2013
==New Term Recommendation==
Submitter: John Wieczorek on behalf of the May 2013 GBIF hackathon-workshop on Darwin Core and sample data
Justification: see "Meeting Report: GBIF hackathon-workshop on Darwin Core and sample data (22-24 May 2013)" at http://www.gbif.org/orc/?doc_id=5424
Term Name: biome
Identifier: http://purl.obolibrary.org/obo/ENVO_00000428
Namespace: http://purl.obolibrary.org/obo/
Label: Biome
Definition: A major class of ecologically similar communities of plants, animals, and other organisms. Biomes are defined based on factors such as plant structures (such as trees, shrubs, and grasses), leaf types (such as broadleaf and needleleaf), plant spacing (forest, woodland, savanna), and other factors like climate. Unlike ecozones, biomes are not defined by genetic, taxonomic, or historical similarities. Biomes are often identified with particular patterns of ecological succession and climax vegetation.
Comment: Examples: "flooded grassland biome", "http://purl.obolibrary.org/obo/ENVO_01000195". For discussion see https://code.google.com/p/darwincore/wiki/Event (there will be no further documentation here until the term is ratified)
Type of Term: http://www.w3.org/2000/01/rdf-schema#Class
Refines:
Status: proposed
Date Issued: 2013-09-25
Date Modified: 2013-09-25
Has Domain:
Has Range:
Refines:
Version: http://purl.obolibrary.org/obo/ENVO_00000428
Replaces:
IsReplaceBy:
Class: http://rs.tdwg.org/dwc/terms/Event
ABCD 2.0.6: not in ABCD (someone please confirm or deny this)
Sep 26, 2013 comment #1 gtuco.btuco
Based on initial discussions on tdwg-content, modified the proposal to make a new DwC property term that recommends the ENVO class as the range, as follows:
Term Name: biome
Identifier: http://rs.tdwg.org/dwc/terms/biome
Namespace: http://rs.tdwg.org/dwc/terms/
Label: Biome
Definition: A major class of ecologically similar communities of plants, animals, and other organisms. Biomes are defined based on factors such as plant structures (such as trees, shrubs, and grasses), leaf types (such as broadleaf and needleleaf), plant spacing (forest, woodland, savanna), and other factors like climate. Unlike ecozones, biomes are not defined by genetic, taxonomic, or historical similarities. Biomes are often identified with particular patterns of ecological succession and climax vegetation. Recommended best practice is to use a controlled vocabulary such as defined by the biome class of the Environment Ontology (ENVO).
Comment: Examples: "flooded grassland biome", "http://purl.obolibrary.org/obo/ENVO_01000195". For discussion see https://code.google.com/p/darwincore/wiki/Event (there will be no further documentation here until the term is ratified)
Type of Term: http://www.w3.org/1999/02/22-rdf-syntax-ns#Property
Refines:
Status: proposed
Date Issued: 2013-09-26
Date Modified: 2013-09-26
Has Domain:
Has Range:
Refines:
Version: biome-2013-09-26
Replaces:
IsReplaceBy:
Class: http://rs.tdwg.org/dwc/terms/Event
ABCD 2.0.6: not in ABCD (someone please confirm or deny this)