gbif / rs.gbif.org

GBIF machine-readable resources
https://rs.gbif.org

Downloading data with eMoF extension #112

Closed lhmarsden closed 3 months ago

lhmarsden commented 11 months ago

When scientists publish Darwin Core Archives, they often need to include an eMoF extension with all the 'other' measurements they take related to their occurrences. With more and more scientists now using Darwin Core Archive, this configuration is becoming more important.

In Norway, the national research council now says that any projects it funds must publish their data following the FAIR data management principles.

I am often asked how to get data out of a Darwin Core Archive. For events and occurrences, this is quite easy. The eMoF extension, however, is not particularly easy to work with. Ideally, one would be able to scan through the measurementType column, find all unique 'parameters', and create one column per parameter that can be included in the 'SIMPLE' CSV download.

I don't think this should be too hard to implement either.
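As a rough illustration of the reshaping described above, here is a minimal sketch using only the Python standard library. It assumes the eMoF rows carry an `occurrenceID` column alongside `measurementType` and `measurementValue`; the function name `pivot_emof` and the sample rows are made up for this example and are not part of any GBIF tool.

```python
import csv
import io


def pivot_emof(emof_rows, id_field="occurrenceID"):
    """Reshape long-format eMoF rows into one record per occurrence,
    with one column per distinct measurementType."""
    wide = {}      # id -> record dict
    columns = []   # distinct measurementType values, in first-seen order
    for row in emof_rows:
        record = wide.setdefault(row[id_field], {id_field: row[id_field]})
        mtype = row["measurementType"]
        if mtype not in columns:
            columns.append(mtype)
        # Caveat: if the same measurementType occurs twice for one record,
        # the later value silently overwrites the earlier one.
        record[mtype] = row["measurementValue"]
    return [id_field] + columns, list(wide.values())


# Hypothetical sample data, for illustration only.
sample = io.StringIO(
    "occurrenceID,measurementType,measurementValue,measurementUnit\n"
    "occ-1,length,12.5,mm\n"
    "occ-1,weight,3.1,g\n"
    "occ-2,length,9.8,mm\n"
)
header, records = pivot_emof(csv.DictReader(sample))

out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=header, restval="")
writer.writeheader()
writer.writerows(records)
print(out.getvalue())
```

Note that this sketch drops measurementUnit and the other eMoF columns, and it keeps only one value per parameter per occurrence.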


Currently, one does not get all the data when hitting download. And I don't think one is warned about the data that is missing either, which is potentially problematic.

This might expand to other extensions. Are data from all GBIF registered extensions included in the 'SIMPLE' download?

MattBlissett commented 4 months ago

Thanks,

Extension data can now be downloaded through the API. For adding this to the website, see https://github.com/gbif/portal16/issues/1910

API documentation: https://techdocs.gbif.org/en/data-use/api-downloads#verbatimextensions
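For readers who want to try this, a hedged sketch of what a download request body might look like is shown below. It only builds and prints the JSON; it does not send anything. The `verbatimExtensions` field name is taken from the documentation anchor above, while the other field names, the `DATASET_KEY` predicate, and the eMoF row type URI are assumptions based on the usual shape of GBIF occurrence download requests and should be checked against the linked documentation. The helper name and all argument values are placeholders.

```python
import json

# Assumed row type URI for the eMoF extension; verify against the GBIF extension registry.
EMOF_ROW_TYPE = "http://rs.iobis.org/obis/terms/ExtendedMeasurementOrFact"


def build_download_request(creator, email, dataset_key):
    """Build a request body asking for a Darwin Core Archive download
    that includes verbatim eMoF extension data (hypothetical helper)."""
    return {
        "creator": creator,
        "notificationAddresses": [email],
        "sendNotification": True,
        "format": "DWCA",  # extension data is assumed to need the archive format
        "predicate": {
            "type": "equals",
            "key": "DATASET_KEY",
            "value": dataset_key,
        },
        "verbatimExtensions": [EMOF_ROW_TYPE],
    }


# Placeholder values, not real credentials or keys.
body = build_download_request("someUser", "someone@example.org", "dataset-key-goes-here")
print(json.dumps(body, indent=2))
```

The body would then be sent as an authenticated POST to the occurrence download endpoint described in the API documentation.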

"Flattening" the data as you describe isn't really an option — one occurrence might have many measurements, and we'd lose that structure if we tried to add this to the simple-format downloads.
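To make the structural problem concrete, here is a small sketch that counts how often the same measurementType appears for one occurrence. Any pair with a count above one cannot fit into a single cell of a one-row-per-occurrence table without dropping or merging values. The column name `occurrenceID`, the function name, and the sample rows are assumptions made for this illustration.

```python
from collections import Counter


def repeated_measurements(emof_rows, id_field="occurrenceID"):
    """Return {(occurrence id, measurementType): count} for every pair
    that appears more than once in the eMoF rows."""
    counts = Counter((row[id_field], row["measurementType"]) for row in emof_rows)
    return {pair: n for pair, n in counts.items() if n > 1}


# Hypothetical rows: occ-1 has two separate length measurements.
rows = [
    {"occurrenceID": "occ-1", "measurementType": "length", "measurementValue": "12.5"},
    {"occurrenceID": "occ-1", "measurementType": "length", "measurementValue": "12.9"},
    {"occurrenceID": "occ-1", "measurementType": "weight", "measurementValue": "3.1"},
    {"occurrenceID": "occ-2", "measurementType": "length", "measurementValue": "9.8"},
]
collisions = repeated_measurements(rows)
print(collisions)
```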