BioKIC / symbiota-docs

Symbiota software centralized hub for documentation
https://biokic.github.io/symbiota-docs/

DwC-A handler needs to parse recordedByIDs and identifiedByIDs #619

Open mandrewj opened 3 years ago

mandrewj commented 3 years ago

The Biorepo portal (and Symbiota-light) now accepts ORCID identifiers in the identifiedBy and recordedBy fields: https://github.com/BioKIC/Symbiota-light/commit/e15726b285ffc2240d0a50a59bde94bf7359dbad

However, when these data go to GBIF, they are presented as a single text string, which is somewhat cluttered. See for example: https://www.gbif.org/occurrence/2620847404

GBIF has 2 fields for this: gbif:recordedByIDs and gbif:identifiedByIDs. See the documentation here: https://twitter.com/GBIF/status/1247133356724289536 and https://gbif.github.io/dwc-api/apidocs/org/gbif/dwc/terms/GbifTerm.html#recordedByID

If an ORCID identifier is found, this identifier should go to the appropriate IDs field and only the remainder of the person's name should be delivered to the recordedBy field.
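To illustrate the split described above, here is a rough sketch in Python (not the Symbiota code itself, which is PHP). It assumes the ORCID appears somewhere in the free-text value either as a bare iD or as an https://orcid.org/ URL, possibly wrapped in brackets; the actual format stored by the portal may differ. The function name `split_orcids` and the " | " separator for multiple identifiers are assumptions made for this example.

```python
import re

# An ORCID iD is four hyphen-separated groups of four characters; the final
# character is a checksum that may be 'X'. An optional orcid.org URL prefix
# is consumed as well so it does not linger in the name.
ORCID_RE = re.compile(
    r"(?:https?://orcid\.org/)?\b(\d{4}-\d{4}-\d{4}-\d{3}[\dX])\b",
    re.IGNORECASE,
)


def split_orcids(agent_text):
    """Return (name_without_ids, orcid_uris) for a recordedBy/identifiedBy value.

    Hypothetical helper: orcid_uris is a ' | '-joined string of ORCID URIs,
    or an empty string when no ORCID is found.
    """
    ids = [m.upper() for m in ORCID_RE.findall(agent_text)]
    name = ORCID_RE.sub("", agent_text)
    # Remove brackets left empty after the identifier was stripped out.
    name = re.sub(r"[\(\[]\s*[\)\]]", "", name)
    # Collapse runs of whitespace, close up gaps before separators, trim ends.
    name = re.sub(r"\s+", " ", name)
    name = re.sub(r"\s+([,;|])", r"\1", name)
    name = name.strip(" ,;|")
    uris = " | ".join("https://orcid.org/" + i for i in ids)
    return name, uris


name, ids = split_orcids("Jane Doe (0000-0002-1825-0097)")
print(name)  # value for recordedBy
print(ids)   # value for the IDs field
```

The name would go to recordedBy (or identifiedBy) and the URI string to the matching IDs field. Note that this sketch does not try to pair each identifier with a specific person when several collectors share one string.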

dshorthouse commented 1 year ago

Bumping this. Am also seeing Wikidata Q numbers here like on https://www.gbif.org/occurrence/4140918829. FWIW, Bionomia strips all these out during its agent parsing routine whereas had there been structured URIs in dwc:recordedByID and dwc:identifiedByID it'd resolve 'em. Likewise, filtering on GBIF proper would struggle with these identifiers embedded in dwc:recordedBy or dwc:identifiedBy.
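For the Wikidata case, something along these lines might work. This is a sketch only, in Python rather than Symbiota's PHP, and it assumes the Q number shows up in the text either bare (e.g. "Q42") or as a wikidata.org URL. The function name `split_wikidata_ids` is made up for this example, and a bare "Q" followed by digits could in principle collide with other codes in the field, so a real handler would likely need stricter rules.

```python
import re

# Matches a Wikidata item ID such as Q42, optionally preceded by a
# wikidata.org /wiki/ or /entity/ URL prefix.
WIKIDATA_RE = re.compile(
    r"(?:https?://(?:www\.)?wikidata\.org/(?:wiki|entity)/)?\b(Q[1-9]\d*)\b"
)


def split_wikidata_ids(agent_text):
    """Return (name_without_ids, wikidata_uris) for an agent string.

    Hypothetical helper: wikidata_uris is a ' | '-joined string of
    Wikidata entity URIs, or an empty string when none is found.
    """
    qids = WIKIDATA_RE.findall(agent_text)
    name = WIKIDATA_RE.sub("", agent_text)
    # Remove brackets left empty after the identifier was stripped out.
    name = re.sub(r"[\(\[]\s*[\)\]]", "", name)
    # Collapse whitespace and trim leftover separators at the ends.
    name = re.sub(r"\s+", " ", name).strip(" ,;|")
    uris = " | ".join("http://www.wikidata.org/entity/" + q for q in qids)
    return name, uris


name, ids = split_wikidata_ids("Jane Doe (Q42)")
print(name)  # value for dwc:recordedBy
print(ids)   # value for dwc:recordedByID
```

Q42 is used purely as a placeholder ID here. The output uses the http://www.wikidata.org/entity/ form, which is Wikidata's concept URI for an item.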