Closed peterdesmet closed 3 years ago
Feedback welcome! I'll merge this if no one responds within a week, as discussion can always happen later too.
Thanks @peterdesmet
As advised by @timrobertson100 I have now mapped the data as an Occurrence core (observations) + Multimedia extension (images), which makes it possible to include the evidence for the occurrence.
My rationale was to focus on the richest common data GBIF could support, while knowing the original dataset always provides the full view. It is also the format used in Publishing Camera Trap Data: A Best Practice Guide
I realize this is already resolved, but why can't you use associatedMedia with Event Core with Occurrence extension?
@albenson-usgs yes, using Event Core + Occurrence extension (with an associatedMedia field) would also be a valid solution. That field would then be a concatenated list of image URLs. That is a bit harder to parse and does not allow adding more metadata for the images. @timrobertson100 any feedback on this?
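To make the parsing cost concrete, here is a minimal sketch of what a consumer would have to do with a concatenated associatedMedia value. It assumes the values are separated by " | " (the separator Darwin Core recommends for lists); the function name and the example.org URLs are hypothetical placeholders, not part of the dataset.

```python
def split_associated_media(value: str) -> list[str]:
    """Split a pipe-delimited associatedMedia string into clean URLs."""
    # Tolerate missing or extra whitespace around the separator and
    # drop empty fragments (e.g. from a trailing "|").
    return [part.strip() for part in value.split("|") if part.strip()]


# Hypothetical value, shaped like a concatenated list of image URLs.
associated_media = (
    "https://example.org/media/img-0052.jpg | "
    "https://example.org/media/img-0103.jpg"
)

urls = split_associated_media(associated_media)
print(urls)
```

Note that the result is only a list of URLs: there is no place per image for a license, creator or timestamp, which is the limitation mentioned above.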
@peterdesmet why not EventCore + Multimedia extension? And keep all events and include absences.
The Resource Relationship Extension could be a solution too?
@wardappeltans: why not EventCore + Multimedia extension?
Because I would like to link multimedia to the occurrences - which breaks the star schema if there is an event core - without making it too complicated. If anyone has easy solutions to solve this (that GBIF can parse), we can take that approach.
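A rough sketch of why the star schema gets in the way, assuming an Event core with Occurrence and Multimedia extensions. In a star schema each extension row points only at the core record, so an image row knows its deployment but not its occurrence. All IDs, names and URLs below are hypothetical.

```python
# Extensions in a star schema carry only the core ID (here: eventID).
occurrences = [
    {"eventID": "dep1", "occurrenceID": "occ1", "scientificName": "Vulpes vulpes"},
    {"eventID": "dep1", "occurrenceID": "occ2", "scientificName": "Martes foina"},
]
multimedia = [
    {"eventID": "dep1", "identifier": "https://example.org/media/img-0052.jpg"},
    {"eventID": "dep1", "identifier": "https://example.org/media/img-0103.jpg"},
]


def candidates_for_image(image: dict, occurrence_rows: list[dict]) -> list[str]:
    """Return every occurrence an image could belong to, given only the core ID."""
    return [
        row["occurrenceID"]
        for row in occurrence_rows
        if row["eventID"] == image["eventID"]
    ]


# The first image matches both occurrences of the deployment,
# so the image-to-occurrence link cannot be recovered.
ambiguous = candidates_for_image(multimedia[0], occurrences)
print(ambiguous)
```

With an Occurrence core the same multimedia rows would carry the occurrence's ID instead, which removes the ambiguity at the cost of dropping the event table.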
I think there are two main approaches:
I think it is fine to take lossy approach 2, since approach 1 is already available in the source dataset.
I like approach 2 as you describe it, Peter. That is, after all, what we're here for. The only reason I suggested Event Core with Occurrence Extension and associatedMedia is that I've seen Occurrence Core with associatedMedia used well in the NOAA Deep Sea Corals database, so I thought it might work well here too.
@timrobertson100 how does GBIF process associatedMedia?
Can it be used to link to media URLs, e.g.
https://www.agouti.eu/api/uploads/deployment-images/20191001132016-untitled/05c05f7b-6c38-4a06-a3b0-145de226c8ad/20200211215750-RCNX0052.JPG | https://www.agouti.eu/api/uploads/deployment-images/20191001132016-untitled/05c05f7b-6c38-4a06-a3b0-145de226c8ad/20200211215757-RCNX0103.JPG | etc.
Can it be used to link to IDs in a multimedia extension?
04cf6dfe-d954-4687-89a2-0155abd3f694 | 6c636a01-f14f-47eb-8e47-57e8cdf12a5a | etc.
Can it be used to link to media URLs
Yes, so you could have event core and do this in occurrence extensions. The downside is you miss image metadata (including the ability to convey a license which is usually the blocker)
Can it be used to link to IDs in a multimedia extension?
No
Any feedback from others on what approach to take? What do we value more for camtrap data at GBIF/OBIS: deployments (including empty ones) or image metadata (including license)?
If that's the choice we have to make, then I think the image metadata wins.
It's disappointing to exclude the deployment info. I'm assuming that a frictionless data package would allow us to break the star schema and allow more complex relationships, but we're a while off GBIF/OBIS support for that. So in making this recommendation we'd be explicit about the original dataset providing the full view.
Maybe this is a stretch, but what about using dcterms:rights in the occurrence extension if the image license is the main blocker?
@pieterprovoost dcterms:rights is generally interpreted as applying to the data, not the images.
Overall, I think the general consensus is that although we don't like losing deployment information, it is still better to express the images (a very important part of camera trap data) in an extension, rather than trying to cram these into associatedMedia.
I have now updated the README with this summary and referenced the camera trap best practices guide.
It sounds to me like the only way we will be able to utilise the multimedia extension with an event core is if it undergoes a similar process as the eMoF to associate to both an occurrence and an event.
Another question that comes up is whether the star schema is capable of having more than two extension files relate back to a core. Ideally, I would like to represent my data with a combination of Event Core and Occurrence, eMoF, and Multimedia extensions. Can the multimedia fields be contained within the eMoF extension to circumvent this blocker?
@ssmosstee on your first point: you are right, an eMoF-like multimedia extension would allow us to have an event core. Having now published camtrap data to GBIF (https://doi.org/10.15468/5tb6ze), I'm not sure we need an Event core. By adding an eventID to the occurrence core, GBIF will group those, so the deployments are retained. The only information we can't express is the deployment start/end, but that information would be overwritten by the more precise occurrence eventDate anyway when flattening event + occurrences.
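As an illustration of the grouping idea (a sketch only, not how GBIF implements it), occurrences that share an eventID can be collected back into their deployments on the consumer side. The records and IDs below are hypothetical.

```python
from collections import defaultdict

# Hypothetical Occurrence core rows, each carrying the deployment's eventID.
occurrences = [
    {"occurrenceID": "occ1", "eventID": "dep1", "eventDate": "2020-02-11T21:57:50Z"},
    {"occurrenceID": "occ2", "eventID": "dep1", "eventDate": "2020-02-11T21:57:57Z"},
    {"occurrenceID": "occ3", "eventID": "dep2", "eventDate": "2020-02-12T03:10:00Z"},
]

# Group occurrence IDs by the deployment (eventID) they belong to.
by_deployment = defaultdict(list)
for occ in occurrences:
    by_deployment[occ["eventID"]].append(occ["occurrenceID"])

for event_id, occurrence_ids in by_deployment.items():
    print(event_id, occurrence_ids)
```

A deployment only shows up here if at least one occurrence refers to it, so deployments without observations are not recoverable from the Occurrence core alone.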
On two: yes, you can relate many extension files to your core. Checklists are a good example of that: a taxon core + vernacular name extension, distribution extension, ... all linking back to your taxon core. For your use case: if you only want to relate the MoF and multimedia to occurrences, you're better off with an Occurrence core. If you want to relate to events and occurrences, it does get complicated.
I first mapped camtrap data to Darwin Core as an Event core (deployment) + Occurrence extension (observations): see here for the output.
But because of the star schema, that doesn't allow mapping images to occurrences, except maybe with the extended schema, which is not supported by GBIF. As advised by @timrobertson100 I have now mapped the data as an Occurrence core (observations) + Multimedia extension (images), which makes it possible to include the evidence for the occurrence. The only information we lose is the deployment start/end + deployments that did not generate (animal) observations. But that can always be found in the original data.