LTER-LIFE / FDFDT

FAIR Data for Digital Twins

Define events and DwC-A structure #4

Closed CherineJ closed 3 months ago

CherineJ commented 5 months ago

A recurring issue when fitting data to the Darwin Core Archive (DwC-A) structure (star schema) is how to define what an event is, and how the eventIDs and the links between the core and the extensions should be organised.
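To make the star-schema constraint concrete, here is a minimal sketch of an event core with one extension table linked through `eventID`. The rows, ID values, and the helper `orphaned_rows` are hypothetical and not taken from the cricket dataset; only the column names are standard Darwin Core terms.

```python
# Hypothetical event core: one row per sampling event (all values invented).
event_core = [
    {"eventID": "site1:2020-06-01", "eventDate": "2020-06-01"},
    {"eventID": "site1:2020-06-08", "eventDate": "2020-06-08"},
]

# Hypothetical measurement extension: every row must point back to a core row.
measurement_ext = [
    {"measurementID": "m1", "eventID": "site1:2020-06-01", "measurementType": "body mass"},
    {"measurementID": "m2", "eventID": "site1:2020-06-08", "measurementType": "body mass"},
    {"measurementID": "m3", "eventID": "site1:2020-06-15", "measurementType": "body mass"},
]


def orphaned_rows(core, extension, key="eventID"):
    """Return extension rows whose key value has no matching row in the core."""
    core_ids = {row[key] for row in core}
    return [row for row in extension if row[key] not in core_ids]


orphans = orphaned_rows(event_core, measurement_ext)
print([row["measurementID"] for row in orphans])  # ['m3']
```

Row `m3` is reported because its `eventID` has no counterpart in the core. In a star schema every extension row has to resolve to exactly one core row, which is why the event definition has to be settled before the extensions can be filled.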

The cricket dataset contains many derived variables (cf. #1) and no information about when the measurements were taken. This makes it difficult to decide which measurements belong to the same event, and what defines an event in general.

Additionally, it is debatable whether an event core is the most suitable option for the DwC-A or whether an occurrence core would be better. This seems to be a general discussion among users of DwC-A, as can be seen here.

The difficulty of linking all extensions to one fixed core file again underlines the need for a more flexible model (which will hopefully be launched soon with the new data model of GBIF).

CherineJ commented 3 months ago

Conclusion

The crucial difference between the event core and the occurrence core file lies in how the data was collected. If the data was collected following a certain protocol with documented or known sampling effort, and the how, where, and when of the collection are known, it is sampling event data, resulting in an event core. This is the case for most research data. Occurrence data has an unknown sampling effort and does not follow sampling protocols.
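The rule of thumb above can be written down as a small decision helper. This is only a sketch of the reasoning in this comment, not an official GBIF rule; the function name `suggest_core` and its three boolean parameters are assumptions made for illustration.

```python
def suggest_core(follows_protocol: bool, effort_known: bool, context_known: bool) -> str:
    """Suggest a DwC-A core type from how the data was collected.

    follows_protocol: data was collected following a sampling protocol
    effort_known:     sampling effort is documented or known
    context_known:    the how, where, and when of the collection are known
    """
    if follows_protocol and effort_known and context_known:
        return "event"
    return "occurrence"


# Protocol-based research data with documented effort and context.
print(suggest_core(True, True, True))     # event
# Observations without a protocol or known effort.
print(suggest_core(False, False, True))   # occurrence
```

A dataset like the cricket data, where the time of measurement is missing, is a borderline case for such a rule, since the "when" of the collection is only partly known.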