tdwg / dwc

Darwin Core standard for sharing of information about biological diversity.
https://dwc.tdwg.org
Creative Commons Attribution 4.0 International

New Term - sensitiveVernacularName #522

Open qgroom opened 2 months ago

qgroom commented 2 months ago

New term

Proposed attributes of the new term: text

Jaim3Fry3 commented 1 month ago

This is an excellent and important way to increase transparency and communication about names with the potential to cause harm. It does not restrict usage but allows institutions to make their own informed decisions. I wholeheartedly support this term addition!

nielsklazenga commented 1 month ago

This does not make a whole lot of sense. Darwin Core does not have a Name object and the best way to deal with inappropriate labels on a Darwin Core Taxon is to just not use them. One could use the informationWithheld property to indicate that the vernacular name that was used was withheld because it was deemed inappropriate.
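The withholding approach described above can be sketched in a few lines of Python. This is only an illustration, not part of the proposal or of any Darwin Core tooling: the function name `withhold_vernacular_name`, the wording of the note, and the use of " | " to join multiple notes are assumptions. `vernacularName` and `informationWithheld` are the Darwin Core terms named in the comment.

```python
def withhold_vernacular_name(record, reason):
    """Return a copy of `record` without vernacularName, noting the removal.

    Hypothetical helper: the note wording and the " | " separator are
    assumptions made for this sketch.
    """
    result = dict(record)  # leave the caller's record untouched
    name = result.pop("vernacularName", None)
    if not name:
        return result  # nothing to withhold
    note = f"vernacularName withheld: {reason}"
    existing = result.get("informationWithheld")
    # Append to any note that is already present instead of overwriting it.
    result["informationWithheld"] = f"{existing} | {note}" if existing else note
    return result


record = {"scientificName": "Genus species", "vernacularName": "placeholder name"}
print(withhold_vernacular_name(record, "deemed inappropriate"))
```

The published record then carries no vernacular name, but consumers can still see that something was held back and why.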

qgroom commented 1 month ago

> This does not make a whole lot of sense. Darwin Core does not have a Name object and the best way to deal with inappropriate labels on a Darwin Core Taxon is to just not use them.

It seems that you may not fully grasp the intended use case. Many vernacular names carry offensive or derogatory connotations, often specific to a particular group, language, or cultural context. To address this, several organizations have compiled catalogues or checklists of such names to inform others, ensuring that people do not unknowingly use terms with negative or harmful meanings.

This term is particularly relevant for people working in museums, botanical gardens, and other institutions where multilingual labels and interpretive materials are created. Similarly, it can benefit those writing floras and faunas by helping them understand the broader context of the names they include in their work.

In Darwin Core, taxonomic names are represented within the "Taxon" class, and the Darwin Core schema facilitates the exchange of name-related information.

This proposal originated from a symposium at the XX International Botanical Congress (Towards a more ethical science: Decolonizing Botanical Research) and during the Nomenclatural session of the Congress, where the issue of derogatory Latin names was a key topic of discussion. I have not addressed issues related to Latin names here, as that is a rather more complicated subject, with potentially different solutions.

nielsklazenga commented 1 month ago

Sorry, makes even less sense now. I understand the context, but I do not see a use case for this term in it. Inappropriate labels on specimens might be sensitive data, but they are not sensitive vernacular names. vernacularName is interpreted data. You decide what is a vernacular name and what is not, so if you think that a label is inappropriate, you just do not record it as a vernacularName, but you record it in fieldNotes or identificationRemarks if you want to exchange it at all.

It also seems counter-productive to broadcast sensitive data to the world by having special properties for it. I think for this kind of sensitive data the more appropriate options are to either withhold it or annotate it, neither of which requires any changes to Darwin Core.

qgroom commented 1 month ago

Some additional resources...

nielsklazenga commented 1 month ago

@qgroom, just to clarify my major issue with this proposal, a scientific name or vernacular name on a specimen label is not a dwc:scientificName or dwc:vernacularName, but a dwc:Identification. I am also not in favour of a dwc:sensitiveVerbatimIdentification, as I think there are better solutions for your use case outside Darwin Core, but it would at least say what (I think) you mean.

Jaim3Fry3 commented 1 month ago

> It also seems counter-productive to broadcast sensitive data to the world by having special properties for it. I think for this kind of sensitive data the more appropriate options are to either withhold it or annotate it, neither of which requires any changes to Darwin Core.

I respectfully disagree with your view that it is more appropriate to withhold this data. For institutions and professionals tasked with educating and/or communicating with the public, it is critical that we record these language use observations to track them over time. This is important data to educate one another on potential pitfalls surrounding the language of plants across the botanical community. Highlighting these concerns in a clear, consistent manner would help further our understanding of implications outside of our scientific microcosms.

nielsklazenga commented 1 month ago

@Jaim3Fry3, I never said that. I said 'withhold or annotate' and 'annotating' is what you call 'Highlighting these concerns in a clear, consistent manner'. And until you can do that and data consumers can deal with it, you withhold (or maybe I should say 'hold back') the data you consider might be sensitive or you do not yet know what to do with. At institutions like the one I work at, people make these sorts of decisions all the time.

On the other hand, 'Highlighting these concerns in a clear, consistent manner' is exactly what this proposal does not do, as it addresses only one very specific piece of potentially sensitive information. The proposal does not address the stated use case, as it addresses the wrong term (or at least one of us thinks so). Also, because it focuses on one specific term rather than the entire label, the sensitiveVernacularName disappears into the Identification History as soon as there is a new Identification, while the sensitive bit of data is still on the label and on the image of the specimen.

The thing with sensitive data is that some consumers want to have it and other consumers do not (or should not). Also, different producers will consider different things sensitive. Therefore, sensitive data should be dealt with at the record or data set level, not the term level. The informationWithheld term might not be sufficient, but it is in record-level terms like that that the solution should be sought. This is what the Sensitive Species Extension Task Group is looking at for the type of sensitive data that is generally dealt with using dataGeneralizations (and since we apparently want to be consistent...).

tucotuco commented 1 month ago

There are many Darwin Core terms that contain sensitive content in the sense being used for this term proposal. I support @nielsklazenga's approach over also needing to add sensitiveLocality, sensitiveVerbatimLocality, sensitiveRecordedBy, sensitiveCaste, sensitiveBehavior, sensitiveOccurrenceRemarks, sensitiveOrganismName, sensitiveOrganismRemarks...

nielsklazenga commented 1 month ago

If @qgroom agrees, maybe this issue can be converted to a discussion where we can discuss how to deal with this type of sensitive data (as opposed to sensitive data that can be generalised)?

ArthurChapman commented 1 month ago

I agree with @tucotuco. I understand that a Task Group is being established at TDWG to look at, and recommend Darwin Core Terms for sensitive data.

qgroom commented 1 month ago

> There are many Darwin Core terms that contain sensitive content in the sense being used for this term proposal. I support @nielsklazenga's approach over also needing to add sensitiveLocality, sensitiveVerbatimLocality, sensitiveRecordedBy, sensitiveCaste, sensitiveBehavior, sensitiveOccurrenceRemarks, sensitiveOrganismName, sensitiveOrganismRemarks...

As many of you know, last year @ArthurChapman wrote an in-depth report on sensitive primary species occurrence data, including recommendations for changes to Darwin Core. It’s important to note that he did not address the issue of derogatory vernacular names, as this is a separate topic. Similarly, the Sensitive Species Extension Task Group does not have this issue within its remit.

The proposed sensitiveVernacularName term caters to different use cases and has distinct stakeholder groups.

The word "sensitive" seems to have led to confusion between these unrelated issues. If you can suggest an alternative word, it might be better, e.g. flaggedVernacularName or sensitiveLanguageVernacularName.

Regarding the formation of a task group, it’s generally more effective to explore other solutions first. Task groups often struggle with efficiency and participation and can sometimes hinder standardization efforts. Unless there is a significant and controversial workload to address, it might be best to avoid forming one.

Chapman AD (2020) Current Best Practices for Generalizing Sensitive Species Occurrence Data. Copenhagen: GBIF Secretariat. https://doi.org/10.15468/doc-5jp4-5g10.

nielsklazenga commented 1 month ago

@qgroom, I agree that this is out of scope for the Sensitive Species Extension Task Group (which is already there), but having special terms for different use cases and stakeholders kind of defeats the purpose of having a standard.

tucotuco commented 1 month ago

Why just vernacular names? Sensitive language permeates biodiversity data. Why not have a solution that can be applied to any or all of it, in a record-level term?

qgroom commented 1 month ago

> Why just vernacular names? Sensitive language permeates biodiversity data. Why not have a solution that can be applied to any or all of it, in a record-level term?

The primary reason for focusing on vernacular names is that Darwin Core is specifically designed for exchanging taxon-based checklist data, which includes both scientific and vernacular names. Unlike place names, which can also have cultural sensitivities, we do not typically use Darwin Core for gazetteer information exchange.

Latin names present their own challenges, but we cannot arbitrarily choose which Latin name to use, though one could argue it would be good to exchange information on this too. However, Latin names are controlled by the Nomenclatural Codes, so we should really take a lead from the Codes.

There are other fields that may contain offensive remarks, but that strays into a different use case. One case that Niels has mentioned is occurrence data. The intention here is to have the ability to communicate about the sensitivity of certain names and under what context. The aim is not to judge whether a name is offensive or derogatory, as that is often a matter of perspective, but to ensure this information can be shared.

nielsklazenga commented 1 month ago

Sorry, I misunderstood the use case. I still do not understand why you cannot use a general-purpose record-level term, as they would be available in the Taxon Core and the Vernacular Name Extension as well.

nielsklazenga commented 1 month ago

> The intention here is to have the ability to communicate about the sensitivity of certain names and under what context.

It will not be in the initial release of TCS 2, because it is not in TCS 1, but I have plans for a VernacularName class in TCS, which would be pretty much the same as the GBIF VernacularName. So, if this can wait and people can use taxonRemarks in the Vernacular Name Extension for now, I think that might be a good context for this discussion.
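As a rough illustration of the interim suggestion above, the following Python sketch writes Vernacular Name Extension rows in which `taxonRemarks` carries a free-text note about a name. It is a sketch under stated assumptions: the helper `vernacular_rows_to_csv`, the choice of columns, the identifier `t1`, the placeholder names and the remark wording are all hypothetical. Only the use of `taxonRemarks` alongside `vernacularName` comes from the comment.

```python
import csv
import io

# Columns chosen for this sketch; a real extension file may carry more terms.
FIELDS = ["taxonID", "vernacularName", "language", "taxonRemarks"]


def vernacular_rows_to_csv(rows):
    """Serialise vernacular name rows to CSV text, leaving missing fields empty."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    for row in rows:
        writer.writerow({field: row.get(field, "") for field in FIELDS})
    return buffer.getvalue()


# Placeholder data: identifiers, names and the remark text are invented.
rows = [
    {
        "taxonID": "t1",
        "vernacularName": "placeholder name A",
        "language": "en",
        "taxonRemarks": "Flagged: considered derogatory in some contexts",
    },
    {"taxonID": "t1", "vernacularName": "placeholder name B", "language": "en"},
]

print(vernacular_rows_to_csv(rows))
```

Because the remark is free text, consumers would need to agree on a convention before it could be processed reliably, which is part of what the thread is debating.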

MattBlissett commented 1 month ago

> The word "sensitive" seems to have led to confusion between these unrelated issues. If you can suggest an alternative word it might be better.

I think "taboo" fits what's being discussed, "tabooVernacularName", and avoids the clash with the usual biodiversity meaning of a sensitive species.

gkampmeier commented 1 month ago

I agree that sensitive can be confusing. I think that "deprecated" seems more fitting. It firmly puts the context as something that is no longer used and that the reason may be that it is "taboo" but there could be other shades of reasons why it may exist, but should no longer be used (plus people often like to break taboos ;).

tucotuco commented 1 month ago

> Sorry, I misunderstood the use case. I still do not understand why you cannot use a general-purpose record-level term, as they would be available in the Taxon Core and the Vernacular Name Extension as well.

I really do not understand either. I feel the maintenance hackles rising on the back of my neck with this one - presaging the day when a general solution is preferred and then trying to deprecate this specialized term. We weren't even able to successfully eradicate dwc:individualCount and that keeps giving me the kind of bad taste that this proposed term anticipates.

TaniaGLaity commented 1 month ago

Regarding https://www.tdwg.org/community/dwc/sensitive-species/: we are also interested in tackling the problem of data that may be sensitive for cultural reasons. We are not only considering obfuscation/generalisation of occurrence records but also withholding some record-level/dataset-level attributes where they are considered sensitive.