AtlasOfLivingAustralia / biocache-store

Occurrence processing, indexing and batch processing

Add more fields to GBIF export #221

Closed ansell closed 7 years ago

ansell commented 7 years ago

GBIF would like more fields added to the exported archives, specifically:

collectionID
datasetID
datasetName 
language/dcterms:language 
modified/dcterms:modified 
rightsHolder/dcterms:rightsHolder 
fieldNumber
georeferencedDate 
higherClassification 
higherGeography 
institutionID 
lifeStage 
municipality 
occurrenceStatus 
ownerInstitutionCode 
preparations
sex
subgenus 
typeStatus 
verbatimLocality
geodeticDatum 
scientificNameAuthorship

associatedMedia was also requested, but it is not useful in its unprocessed form: the unprocessed data contains a large number of relative URIs that will make no sense in a pure CSV Darwin Core Archive, and we aren't currently sending processed data to GBIF. We need to examine the associatedMedia situation in future.
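
As a rough way to verify an export once the fields are added, the list above could be checked against the header row of the exported occurrence CSV. This is only a sketch: the helper name `missing_terms` is hypothetical, and it assumes the export uses the bare Darwin Core term names as column headers (the `dcterms:` variants are reduced to `language`, `modified` and `rightsHolder`), which may not match the actual archive layout.

```python
import csv
import io

# Terms requested in this issue. associatedMedia is left out on purpose,
# since it is not being exported yet.
REQUESTED_TERMS = [
    "collectionID", "datasetID", "datasetName", "language", "modified",
    "rightsHolder", "fieldNumber", "georeferencedDate",
    "higherClassification", "higherGeography", "institutionID",
    "lifeStage", "municipality", "occurrenceStatus",
    "ownerInstitutionCode", "preparations", "sex", "subgenus",
    "typeStatus", "verbatimLocality", "geodeticDatum",
    "scientificNameAuthorship",
]


def missing_terms(csv_text, requested=REQUESTED_TERMS):
    """Return the requested terms that are absent from the CSV header row.

    Assumption: column names are the bare Darwin Core term names.
    """
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, [])  # an empty file yields an empty header
    present = {column.strip() for column in header}
    return [term for term in requested if term not in present]
```

Running `missing_terms` over the first line of an exported file gives the terms that still need to be added; an empty list would mean every requested column is present.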

ansell commented 7 years ago

One minor side effect for my processing/analysis: I will need to relocate or rework the use of /tmp when analysing archives on cave. Its /tmp is ~5GB, and the uncompressed archives will be larger than that, since some are already close to that limit in their current form.
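
One possible way to handle the /tmp limit, sketched under the assumption that the archives are zip files: read the uncompressed size from the zip directory, then extract into the first candidate directory with enough free space. The function names and the candidate directories are hypothetical, not part of the existing tooling.

```python
import shutil
import tempfile
import zipfile


def uncompressed_size(zip_path):
    """Total size in bytes of all archive members once extracted."""
    with zipfile.ZipFile(zip_path) as zf:
        return sum(info.file_size for info in zf.infolist())


def pick_work_dir(zip_path, candidates):
    """Return the first candidate directory with room for the archive."""
    needed = uncompressed_size(zip_path)
    for directory in candidates:
        if shutil.disk_usage(directory).free > needed:
            return directory
    raise RuntimeError(
        "no candidate directory has %d bytes free" % needed
    )


def extract_archive(zip_path, candidates):
    """Extract into a fresh temp directory under a large enough volume."""
    work_dir = pick_work_dir(zip_path, candidates)
    target = tempfile.mkdtemp(prefix="dwca-", dir=work_dir)
    with zipfile.ZipFile(zip_path) as zf:
        zf.extractall(target)
    return target
```

Passing something like `["/tmp", "/data/scratch"]` as `candidates` (the second path is made up) would keep small archives in /tmp and fall back to the larger volume only when needed.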