tdwg / dwc-qa

Public question and answer site for discussions about Darwin Core
Apache License 2.0

How to store information of multiple life stages in a museum sample when everything is together #192

Open EstebanMH-SiB opened 1 year ago

EstebanMH-SiB commented 1 year ago

We have been trying to implement the best practice of not putting several life stages in the same record (e.g., 8 Adults | 1 Juvenile), as @tucotuco advised us in this comment.

However, some collections keep several individuals of different life stages and sexes in one jar under a single catalog number. They cannot split them into separate records, because they publish one record per catalog number and creating more is not an option. In other cases, we have a total number of individuals and know that the sample contains both males and females, but not how many of each, so it is not possible to divide the record in two and keep the same value in individualCount, because it would be incorrect.

So we are wondering what the best practice is in those cases. Should we add a comment in organismRemarks saying that the individuals belong to several life stages, and leave lifeStage empty because they do not fall under a single value? Or, in these exceptional cases, can we keep multiple values in lifeStage?

Thanks for the help!

CecSve commented 1 year ago

Life stage and sex are two distinct fields in DwC that should not be intermixed.

"In other cases, we have a total number of individuals and know that the sample contains both males and females, but not how many of each, so it is not possible to divide the record in two and keep the same value in individualCount, because it would be incorrect."

GBIF is currently working on a controlled vocabulary for sex in which the concept 'mixed' will be an option, so that such data can be captured. It is not yet implemented, though. 'Mixed' is currently not an option for lifeStage, so bulk samples would need to be separated into individual occurrence records, as mentioned previously.
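As an illustration of the separation described above, here is a minimal Python sketch that splits one bulk record into one occurrence record per life stage. It assumes the counts per stage are known and written as pipe-separated "count stage" pairs, as in the "8 Adults | 1 Juvenile" example from the question. The function name `split_bulk_record` and the `-1`, `-2` suffix on occurrenceID are hypothetical choices for this sketch, not a Darwin Core or GBIF convention. It does not help when the counts per stage or sex are unknown.

```python
import re


def split_bulk_record(record):
    """Split a record whose lifeStage holds 'count stage' pairs separated by '|'.

    Hypothetical helper: returns one new record per life stage. Every new
    record keeps the shared catalogNumber and gets its own individualCount.
    """
    parts = [p.strip() for p in record["lifeStage"].split("|")]
    new_records = []
    for index, part in enumerate(parts, start=1):
        match = re.fullmatch(r"(\d+)\s+(.+)", part)
        if match is None:
            raise ValueError(f"Cannot read a count and a stage from {part!r}")
        new_record = dict(record)  # copy so the input record is not modified
        # Assumption: derive a distinct occurrenceID by adding a numeric suffix.
        new_record["occurrenceID"] = f"{record['occurrenceID']}-{index}"
        new_record["individualCount"] = int(match.group(1))
        # The stage text is kept as written; mapping it to a vocabulary
        # value is a separate step.
        new_record["lifeStage"] = match.group(2).strip()
        new_records.append(new_record)
    return new_records


bulk = {
    "occurrenceID": "example:occ:1",  # made-up identifier
    "catalogNumber": "JAR-001",       # made-up catalog number
    "individualCount": 9,
    "lifeStage": "8 Adults | 1 Juvenile",
}
records = split_bulk_record(bulk)
```

Both resulting records share the catalog number, so this only works for publishers who can accept more than one record per catalog number, which is exactly the constraint raised in the question.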

debpaul commented 1 year ago

Please @CecSve @tucotuco @timrobertson100 @baskaufs is this controlled sex vocabulary work linked to TDWG?

baskaufs commented 1 year ago

I'm not aware that it has any official connection to TDWG at this point.

CecSve commented 1 year ago

Hi @debpaul, GBIF currently interprets verbatim data to fit some enumerations we have. All of that is handled in the code. We are working on shifting from our old enums to vocabularies, to better interpret the huge variation of verbatim values in some of the fields and to make the whole process more flexible. At the moment it is solely an internal exercise. However, we are not making any major changes (e.g., adding a bunch of new concepts), and we are sticking to DwC. My plan is to present the vocabulary work at next year's TDWG, so that's the link so far ☺️
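To make the idea of interpreting verbatim values concrete, here is a minimal Python sketch of a lookup from verbatim lifeStage strings to a single concept. The table `LIFE_STAGE_LOOKUP`, its entries, and the function `interpret_life_stage` are made up for illustration. They are not GBIF's actual vocabulary or code.

```python
# Hypothetical lookup from normalized verbatim values to a concept label.
LIFE_STAGE_LOOKUP = {
    "adult": "Adult",
    "adults": "Adult",
    "ad.": "Adult",
    "juvenile": "Juvenile",
    "juveniles": "Juvenile",
    "juv.": "Juvenile",
}


def interpret_life_stage(verbatim):
    """Return the matching concept label, or None if the value is not recognized."""
    # Normalize case and collapse runs of whitespace before the lookup.
    key = " ".join(verbatim.strip().lower().split())
    return LIFE_STAGE_LOOKUP.get(key)
```

With a table like this, a value such as "8 Adults | 1 Juvenile" matches no single concept and returns None, which is one way to see why combined values in lifeStage are hard to interpret.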

tucotuco commented 1 year ago

@debpaul, @baskaufs, and @CecSve

@pzermoglio and @tucotuco have been involved in the development of the vocabularies and testing of the GBIF vocabulary service as a viable solution to the call for a platform for community-vetted vocabularies motivated by the TDWG Biodiversity Data Quality Task Group 4 on Vocabularies of Values, which @pzermoglio convenes.