tdwg / dwc

Darwin Core standard for sharing of information about biological diversity.
https://dwc.tdwg.org
Creative Commons Attribution 4.0 International

Change term - locality #315

Closed pzermoglio closed 3 years ago

pzermoglio commented 3 years ago

Change term

Current Term definition: https://dwc.tdwg.org/terms/#dwc:locality

Proposed new attributes of the term:

Many questions have landed on my desk over the years about how to capture information about protected areas using Location terms. As no specific term exists in DwC for that information, our recommendation has always been "include it in the locality / verbatimLocality field as part of the locality description" (usually append/prepend to whatever is there already).

Currently dwc:locality has only one example: 'Bariloche, 25 km NNE via Ruta Nacional 40 (=Ruta 237)'.

I find it would be useful for users to have an extra example of such strings containing protected areas info, it would probably save them a lot of time.

Possible example: 'Olympic National Park, Queets Rainforest'

dschigel commented 3 years ago

Wouldn't you rather see a standard, well-maintained resource used as a reference, and a separate spatial filter for protected areas? Could we use Protected Planet similarly to GADM (https://www.gbif.org/occurrence/search?occurrence_status=present&gadm_gid=USA.27.56_1)? This does not prevent indicating protected areas in the locality strings, but one would be able to do much more if WDPA (https://www.protectedplanet.net/en/thematic-areas/wdpa?tab=WDPA) were used as a vocabulary. Any database / data system using WDPA as a spatial filter would be able to claim +100 to conservation karma / relevance / usefulness. There are other possibilities, including the Emerald Network, Natura 2000, UNESCO heritage sites, etc.
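For readers unfamiliar with the GADM filter, here is a minimal sketch of how the search URL above is composed. It only rebuilds the link already given in this comment; the function name is hypothetical, and no equivalent protected-area parameter is assumed to exist.

```python
from urllib.parse import urlencode

GBIF_SEARCH = "https://www.gbif.org/occurrence/search"

def gadm_search_url(gadm_gid, occurrence_status="present"):
    """Build a GBIF occurrence search URL filtered by a GADM identifier."""
    # Parameter names are copied from the URL quoted in the comment above.
    query = urlencode({
        "occurrence_status": occurrence_status,
        "gadm_gid": gadm_gid,
    })
    return f"{GBIF_SEARCH}?{query}"

url = gadm_search_url("USA.27.56_1")
print(url)
```

A protected-area filter would presumably take a WDPA identifier in the same way, but that is speculation about a feature not described here.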

pzermoglio commented 3 years ago

@dschigel Definitely so! But I think my request was some 10 or 20 steps behind that :)

One possible first step to move your idea forward would be to put in a request for a new DwC term for protected areas, for which the recommendation would be to use a controlled vocabulary from WDPA. I was very surprised that I could not find any such request, ever, for a term like this in DwC. ABCD 3.0 does not have a term for this either (I have not gone into its history to see if at any point there was one, or a request for one).

Should we propose a new term?

dschigel commented 3 years ago

@pzermoglio thanks, I did not mean to derail your proposal. So +1 for adding the protected area example to the locality string, and I definitely support adding a new term for conservation areas. In this process, I would seek confirmation from conservation gurus that WDPA is the global authority for protected areas, and some proof of its long-term stability. Not sure that https://dwc.tdwg.org/terms/#location would be the most natural home... We might think of a new category altogether, e.g. ConservationContext, similar to https://dwc.tdwg.org/terms/#geologicalcontext.

qgroom commented 3 years ago

I support the proposal, but I also agree with @dschigel. I don't think we are too far off doing this. @aguentsch is leading an initiative to link geographical entities to identifiers, particularly those of https://www.geonames.org/. This is particularly useful for old specimens that might only be located to a geographic feature, such as a river or mountain, where a derived georeference is vague. Nevertheless, a conservation area indicator would still be useful, as many conservation areas are disjunct.

tucotuco commented 3 years ago

The original proposal in this thread is a non-normative change and can be implemented. The rest of the discussion should go into other issues, or into Darwin Core Questions & Answers (https://github.com/tdwg/dwc-qa/issues), as it will essentially be lost when this issue has been implemented and closed.

albenson-usgs commented 3 years ago

I'm not sure if any other issues were created from this, but I have a similar issue with data providers wanting to indicate that the dataset is in an ABNJ (area beyond national jurisdiction), which is an important flag for data that are outside EEZs, since the UN is currently focusing on these data. I opted for using locationRemarks, as I tend to think of locality as being a specific location, i.e. giving more specificity to all the terms before it: continent -> country -> stateProvince -> county -> municipality -> locality. Is that not the correct use of locality?

tucotuco commented 3 years ago

@albenson-usgs Yes, that is the correct use of locality. It is supposed to be more specific than all of the other terms that can be applied. Presumably there is still location information, such as "Western South Pacific Ocean" in waterBody. An ABNJ is a characteristic of a place, not a place, so in the absence of a way to capture that explicitly, locationRemarks would be appropriate. One could go one step further and create a key:value pair to capture it explicitly in dynamicProperties rather than have it lost among free-text remarks, but as always, that is not as strong a solution as having a well-defined term with a controlled vocabulary.
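A minimal sketch of the dynamicProperties option, assuming a JSON encoding (the term's usage comments recommend a key:value encoding such as JSON). The key name `areaBeyondNationalJurisdiction` is hypothetical, not an agreed vocabulary, and the record values are illustrative only.

```python
import json

# Illustrative occurrence record; values are made up for this sketch.
record = {
    "waterBody": "Western South Pacific Ocean",
    "locationRemarks": "Area beyond national jurisdiction (ABNJ)",
}

# Hypothetical key name; no controlled vocabulary exists for it.
properties = {"areaBeyondNationalJurisdiction": True}

# dynamicProperties is a single string field, so serialize the pairs as JSON.
record["dynamicProperties"] = json.dumps(properties, sort_keys=True)

def is_abnj(rec):
    """Return True if the record's dynamicProperties flags it as ABNJ."""
    raw = rec.get("dynamicProperties") or "{}"
    try:
        props = json.loads(raw)
    except json.JSONDecodeError:
        return False  # free text or malformed JSON: treat as not flagged
    return isinstance(props, dict) and props.get("areaBeyondNationalJurisdiction") is True

print(is_abnj(record))
```

This makes the flag machine-readable, but it only helps consumers who know the key name, which is why a defined term with a controlled vocabulary remains the stronger solution.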

deepreef commented 3 years ago

I am also involved in a project where this is extremely important information to have. At the moment, the best way to harvest those occurrences in aggregate is via coordinates, but that can be a bit cumbersome as ABNJ are not easily framed as contiguous polygons. The problem with using the Waterbody, locationRemarks and dynamicProperties is that providers don't use these consistently. Labels like "Western South Pacific Ocean" don't capture it, because there are islands (and, hence, NJs) within that patch of ocean. Because the specific characteristic of interest in this context has to do with national jurisdiction, it seems to me that Country would be an appropriate place to capture it. Simply leaving it blank is not sufficient, of course. But I wonder if the TDWG community could converge on a standard value, such as "ABNJ" or "None". Do we know if any other standards community has addressed this? I don't believe there is a designated placeholder for this within ISO 3166-1 alpha-2.

albenson-usgs commented 3 years ago

@deepreef that's also what the data provider was advocating for: putting it in country. I decided against it since it's not an option in ISO 3166-1 alpha-2. Maybe we need to petition ISO to add it?

deepreef commented 3 years ago

Thanks, @albenson-usgs! My thinking was that the ISO 3166 standard puts constraints on what values to use for countryCode, but I think values for country are less constrained. There is a recommended best practice for country to use a controlled vocabulary (e.g., Getty Thesaurus of Geographic Names), but that seems less precise than the recommendation for ISO 3166-1 alpha-2 country codes in countryCode.

I guess it's a bit of a slippery slope to start making this stuff up within TDWG-space. But I don't know what's involved with petitioning ISO to change things. Moreover, as ABNJ is not itself a "country" per se, I suspect that ISO wouldn't want to create a Code for it. Technically this also applies to country within DwC as well -- which is why I was thinking maybe an explicit 'None' would be better than a non-country value like 'ABNJ'.

Personally, I don't have any strong preference for how to do it; my bigger concern is to make it simple & straightforward enough that we can encourage consistency of implementation among data providers who have occurrence records within ABNJ.

MattBlissett commented 3 years ago

The XZ code for "International Waters" is used by UN/LOCODE: https://service.unece.org/trade/locode/xz.htm . (All the "X" codes in ISO 3166 are for private use.)

GBIF's Country vocabulary has included XZ for a long time, although we have fewer than a thousand occurrences with this value.
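A rough sketch of how a data provider or aggregator might recognize this convention in countryCode. It assumes the ISO 3166-1 alpha-2 user-assigned ranges (AA, QM to QZ, XA to XZ, ZZ); the function names are hypothetical, and XZ is a convention used by UN/LOCODE and GBIF rather than an officially assigned country code.

```python
import re

# ISO 3166-1 alpha-2 user-assigned codes: AA, QM-QZ, XA-XZ, ZZ.
_USER_ASSIGNED = re.compile(r"AA|Q[M-Z]|X[A-Z]|ZZ")

# Convention noted above: XZ stands for "International Waters".
INTERNATIONAL_WATERS = "XZ"

def normalize(code):
    """Trim whitespace and upper-case a countryCode value."""
    return code.strip().upper()

def is_user_assigned(code):
    """True if the code falls in a user-assigned (private use) range."""
    return _USER_ASSIGNED.fullmatch(normalize(code)) is not None

def is_international_waters(code):
    """True if countryCode carries the XZ convention."""
    return normalize(code) == INTERNATIONAL_WATERS

print(is_user_assigned("XZ"), is_international_waters("us"))
```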

tucotuco commented 3 years ago

I have changed the title of the issue and prepended a templated term change request to the original comment so as not to have to make a separate issue and relate it to the discussion in this one. In addition to the additional example, I propose to move the legacy embedded usage comments out of the definition.

baskaufs commented 3 years ago

The handling of geographic subdivisions is described in detail in section 2.7.5 of the RDF guide. The way of handling geographic terms is to use the IRI of the lowest level geographic subdivision as a value for dwciri:inDescribedPlace and use the information associated with the IRI to infer higher-level geographic subdivisions as desired.

dwciri:inDescribedPlace is repeatable and that section actually gives a specific example that applies to the issue raised here. In the example, one value is given for the lowest level political subdivision (Sevier County, Tennessee, USA) and another value is given for a protected area (Great Smoky Mountains National Park).

So making this change should have no implications for dwciri: terms as they already handle the situation in question.
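To illustrate the repeatable property described above, here is a small sketch that emits one triple per place for the same location. The `http://example.org/...` IRIs are placeholders invented for this example; real data would use identifiers from a gazetteer. Only the `dwciri:` namespace IRI is taken from Darwin Core.

```python
DWCIRI = "http://rs.tdwg.org/dwc/iri/"

# Placeholder IRIs (hypothetical): substitute real gazetteer identifiers.
location = "http://example.org/location/1"
places = [
    "http://example.org/place/sevier-county-tennessee",
    "http://example.org/place/great-smoky-mountains-national-park",
]

def in_described_place_triples(subject, place_iris):
    """Return one N-Triples line per place, repeating the same property."""
    predicate = DWCIRI + "inDescribedPlace"
    return [f"<{subject}> <{predicate}> <{place}> ." for place in place_iris]

triples = in_described_place_triples(location, places)
print("\n".join(triples))
```

The political subdivision and the protected area sit side by side as values of the same property, so neither has to be squeezed into a text string.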

EstebanMH-SiB commented 3 years ago

We endorse this proposal on behalf of @SiBColombia

dagendresen commented 3 years ago

Even if (very) late, I support the addition of examples to the locality term. Apropos, locationID could carry the identifier of a protected area in WDPA, GeoNames, Wikidata, etc.

tucotuco commented 3 years ago

@dagendresen Yes, it could, but not if the location is more specific than the protected area, as the locationID refers to the combination of all terms in the Location class, not any one part of it. Thus, if a location is a set of coordinates within a protected area, locationID would not be appropriate. If the protected area is the most specific part of the higher geography, it could be used for the higherGeographyID as well, or instead of the locationID.

tucotuco commented 3 years ago

Done.