gbif-norway / helpdesk

Please submit your helpdesk request here (or send an email to helpdesk@gbif.no). We will also use this repo for documentation of node helpdesk cases.

Event ID column lacking for some field note datasets #168

Closed samaperrin closed 6 months ago

samaperrin commented 8 months ago

I'm working on a project (a recent version of which was submitted for the Ebbe Nielsen Challenge) that integrates disparate data sources for species across Norway (for the moment). The downloadable DWC versions of several valuable data sources, most notably field notes datasets from scientific institutes (Agder naturmuseum, for example), have occurrence IDs but currently no column dedicated to an event ID.

The occurrence IDs seem to imply that a number of species were found in one place, but an explicit eventID would confirm this, and it would let us identify which species were NOT found during a particular sampling event.
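To make the use case concrete, here is a minimal sketch of how an eventID column would allow absences to be inferred. It assumes that each event was a complete survey of a known species pool, which would need to be confirmed with the dataset owner. The function name `inferred_absences`, the event identifiers, and the sample records are all hypothetical and not taken from any real dataset.

```python
from collections import defaultdict


def inferred_absences(records, species_pool):
    """Return {eventID: set of species from the pool not recorded at that event}.

    records: iterable of (event_id, species) pairs.
    species_pool: the species the survey was assumed to be looking for.
    """
    found = defaultdict(set)
    for event_id, species in records:
        found[event_id].add(species)
    pool = set(species_pool)
    return {event_id: pool - seen for event_id, seen in found.items()}


# Hypothetical occurrence records, already tagged with an eventID.
records = [
    ("event-1", "Parus major"),
    ("event-1", "Sitta europaea"),
    ("event-2", "Parus major"),
]
absences = inferred_absences(records, {"Parus major", "Sitta europaea"})
```

Without an eventID there is no reliable way to tell which occurrences belong to the same visit, so the grouping step above is not possible.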

rukayaj commented 8 months ago

Thanks for contacting us @samaperrin! I haven't had a chance to look at this yet, but I (or perhaps one of my colleagues) will take a look in the next few days.

dagendresen commented 8 months ago

Unfortunately, eventIDs are not mandatory for species occurrence type datasets, and they are not yet common in datasets based on museum collections.

I tend to think that we should treat ALL occurrences as events, and thus treat every occurrenceID as a type of eventID (the "dwc:Occurrence" is the intersection of a "dwc:Event" and the presence of a "dwc:Organism" at the event). It follows that multiple occurrenceIDs could map to the same eventID.

Museum collection datasets are often an opportunistic collection of "species occurrences" (the material collected at the "Occurrence"). Reconstructing the (collecting) events would require somebody to do the work of identifying the events and linking them to the specimens (and thus to the species occurrences (occurrenceIDs) they originate from).

rukayaj commented 6 months ago

@samaperrin Sorry for only getting back to you now. I think you're talking about https://www.gbif.org/dataset/0efbb3b3-6473-414a-af91-cdbcbcd8aba1, right? Have you tried contacting per.asen@kristiansand.kommune.no to confirm if the dataset is actually a sampling event dataset and that it's possible to infer "absences" (as in, this species was not found) in the way you would like? This is quite an old dataset so I'm not sure if we will get a response, but I would be reluctant to convert it to a sampling event without checking with the dataset owner first. And as @dagendresen says, it might take some detective work to reconstruct sampling events.

We do try to publish sampling event type datasets correctly as sampling events, and we will certainly make sure we do so in the future. Older datasets are a bit of a problem, though.

samaperrin commented 6 months ago

Hi @dagendresen and @rukayaj, my own apologies for missing Dag's response initially. I have indeed contacted the dataset owners and found that the occurrence IDs in the original DWC version already contain an eventID, so deriving an eventID column is simply a matter of a bit of text editing.
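For anyone facing the same problem, the text editing could look roughly like the sketch below. The actual layout of the occurrence IDs is not described in this thread, so the format used here (`<eventID>_<record number>`, e.g. `FN-0042_1`), the separator, and the helper name `derive_event_id` are assumptions for illustration only and would need adjusting to the real identifiers.

```python
def derive_event_id(occurrence_id, sep="_"):
    """Strip the trailing record number from an occurrenceID.

    Assumes the hypothetical layout '<eventID><sep><record number>'.
    If the separator is absent, the occurrenceID is returned unchanged,
    so a lone occurrence is treated as its own event.
    """
    head, found, _tail = occurrence_id.rpartition(sep)
    return head if found else occurrence_id


# Hypothetical rows, as they might be read with csv.DictReader.
rows = [
    {"occurrenceID": "FN-0042_1", "scientificName": "Parus major"},
    {"occurrenceID": "FN-0042_2", "scientificName": "Sitta europaea"},
    {"occurrenceID": "FN-0099", "scientificName": "Parus major"},
]
for row in rows:
    row["eventID"] = derive_event_id(row["occurrenceID"])
```

The first two rows end up sharing an eventID, so they can be grouped as one sampling event.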