tdwg / dwc-for-biologging

Darwin Core recommendations for biologging data
Creative Commons Attribution 4.0 International

informationWithheld / dataGeneralizations #33

Closed peterdesmet closed 2 years ago

peterdesmet commented 3 years ago

In a lossy transformation to Darwin Core we will lose information that is available in the source dataset. Should this be indicated for every record in informationWithheld and/or dataGeneralizations (inflating the unzipped size), at the dataset level (in the metadata), or both?

For the movebank-gps data case, I've currently opted for

Note that I do indicate datasetName for every record, as that is available in the source records.

Any feedback on the best approach here?

peggynewman commented 3 years ago

That sounds like a reasonable approach. Is there a GBIF/OBIS approach to dataGeneralizations? ALA uses it specifically for location precision, so informationWithheld might be the better place for the subsample info (i.e. "we withheld 50 records").

peterdesmet commented 2 years ago

The movepub R package now implements dataGeneralizations as described above (e.g. "subsampled by hour: first of 13 record(s)"). For informationWithheld we opted for the static value "see metadata" for all records. The metadata will describe that the Darwin Core dataset is derived from a source dataset that might contain more information.

Closing this issue.
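
For readers who want to see the outcome in concrete terms, here is a minimal Python sketch of hourly subsampling that fills both terms per record. It is not movepub's actual code (movepub is an R package). The function name subsample_by_hour and the input keys animal_id and timestamp are hypothetical, chosen only for illustration. The two output strings follow the values quoted in the comment above.

```python
from datetime import datetime


def subsample_by_hour(records):
    """Keep the first record per (animal, hour) and note how many it represents.

    `records` is a list of dicts with the hypothetical keys "animal_id" and
    "timestamp" (a datetime). This is an illustrative sketch, not movepub's code.
    """
    groups = {}  # dicts preserve insertion order in Python 3.7+
    for rec in sorted(records, key=lambda r: (r["animal_id"], r["timestamp"])):
        # Truncate the timestamp to the start of its hour to form the group key.
        hour = rec["timestamp"].replace(minute=0, second=0, microsecond=0)
        groups.setdefault((rec["animal_id"], hour), []).append(rec)

    occurrences = []
    for (animal_id, _hour), group in groups.items():
        first = group[0]  # earliest record in this hour, because input was sorted
        occurrences.append({
            "animal_id": animal_id,
            "eventDate": first["timestamp"].isoformat(),
            # Per-record note on how much the hourly subsample dropped.
            "dataGeneralizations": (
                f"subsampled by hour: first of {len(group)} record(s)"
            ),
            # Static pointer to the dataset-level metadata.
            "informationWithheld": "see metadata",
        })
    return occurrences


if __name__ == "__main__":
    demo = [
        {"animal_id": "A", "timestamp": datetime(2021, 5, 1, 10, 5)},
        {"animal_id": "A", "timestamp": datetime(2021, 5, 1, 10, 35)},
        {"animal_id": "A", "timestamp": datetime(2021, 5, 1, 11, 0)},
    ]
    for row in subsample_by_hour(demo):
        print(row["eventDate"], "|", row["dataGeneralizations"])
```

Keeping informationWithheld static means it compresses well in the zipped archive, while dataGeneralizations carries the only value that varies per record.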