gbif / doc-publishing-dna-derived-data

This guide shows how to publish DNA-derived spatiotemporal biodiversity data and make it discoverable through national and global biodiversity data discovery platforms. Based on experiences from Australia, Norway, Sweden, UNITE, and GBIF.
https://doi.org/10.35035/doc-vf1a-nr22

DNA Derived Data extension and EMoF extension are incompatible #171

Closed albenson-usgs closed 1 year ago

albenson-usgs commented 1 year ago

[Screenshot: Capture]

The highlighted sentence is not accurate. It is not possible to use the DNA Derived Data extension and the Extended Measurement or Fact extension together. The DNA Derived Data extension requires Occurrence Core whereas the Extended Measurement or Fact extension requires the Event Core. They cannot both be used at the same time.

timrobertson100 commented 1 year ago

Thanks @albenson-usgs

There are a few things here to consider.

The Extended Measurement or Fact extension is most commonly, perhaps even exclusively, used with an Event core. Its definition does state the following, though:

This extension (eMoF) was developed to be used in combination with the Event Core, but is also compatible with other cores

~(it makes little sense to use with the Occurrence core though)~ (@pieterprovoost corrects this below)

The DNA Derived Data extension is not restricted to Occurrence core, defined as:

An extension to Occurrence and Event cores to capture information relating to DNA.

Although the document advises:

the current recommendation is to publish data as Occurrence core (Category I or II) with the DNA derived data extension

With that, I think the document is technically accurate, but since it advises using the Occurrence core, we might consider whether it should be phrased differently, i.e. explaining that the extension could be used with Event core and eMoF even if that is not recommended.

I presume you've come across this because of a real-world scenario. Could you suggest new phrasing, please, or perhaps elaborate on the issue you're encountering?

Thanks!

thomasstjerne commented 1 year ago

If the actual sequence is omitted from the DNA extension, the extension is well suited to Event Core, also in conjunction with eMoF. However, this makes it impossible to share sequences and is therefore not the current recommendation. All attributes in the extension other than DNA_sequence relate to the sample/event.
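
A minimal sketch of the idea above, assuming the extension rows are held as plain Python dictionaries: dropping DNA_sequence leaves only the attributes that describe the sample/event. The helper name `strip_sequence` and the example values (including the `eventID` and `target_gene` keys) are illustrative assumptions, not part of the guide or of any GBIF tooling.

```python
def strip_sequence(rows, sequence_field="DNA_sequence"):
    """Return copies of the extension rows without the sequence field.

    Hypothetical helper: rows are plain dicts keyed by term name.
    """
    return [
        {key: value for key, value in row.items() if key != sequence_field}
        for row in rows
    ]


# Illustrative rows only; the values are made up for the example.
dna_rows = [
    {"eventID": "ev-1", "target_gene": "16S rRNA", "DNA_sequence": "ACGTACGT"},
    {"eventID": "ev-2", "target_gene": "16S rRNA", "DNA_sequence": "TTGACCAA"},
]

event_level_rows = strip_sequence(dna_rows)
```

The original rows are left untouched, so the same input could still be published with sequences against an Occurrence core.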

timrobertson100 commented 1 year ago

Perhaps we should add something in the doc like this to help clarify things?

The recommendation to use the Occurrence core stems from the strong desire to share the sequence to help qualify the determination. If the sequence is not to be shared, the extension may be used with the Event core.

pieterprovoost commented 1 year ago

@albenson-usgs Just to clarify, ExtendedMeasurementOrFact can be used with Occurrence core, for example to make use of the added identifier fields.
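
The compatibility discussed in this thread can be summarised as a small lookup, sketched here under stated assumptions. The table lists only the two cores mentioned in the discussion (the eMoF definition says it is "also compatible with other cores", which this sketch does not enumerate), and the names `COMPATIBLE_CORES` and `shared_cores` are hypothetical, not part of any GBIF library.

```python
# Core compatibility as stated in this thread (simplified to two cores).
COMPATIBLE_CORES = {
    "DNA Derived Data": {"Occurrence", "Event"},
    "Extended Measurement or Fact": {"Occurrence", "Event"},
}


def shared_cores(extensions):
    """Return the cores that every listed extension can attach to."""
    cores = None
    for name in extensions:
        allowed = COMPATIBLE_CORES[name]
        # Start from the first extension's cores, then intersect.
        cores = set(allowed) if cores is None else cores & allowed
    return cores or set()


both = shared_cores(["DNA Derived Data", "Extended Measurement or Fact"])
```

Under these assumptions, `both` contains both cores, which matches the conclusion that the two extensions are not mutually exclusive; the Occurrence core remains the recommended choice when sequences are shared.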

albenson-usgs commented 1 year ago

Apologies for my confusion! I will close this ticket as I didn't understand this properly.

timrobertson100 commented 1 year ago

I think you've identified something that could be clearer in the doc though so I'll open it again so it stays on the list of things to improve in the next edition.

Thanks @albenson-usgs

Just to clarify, ExtendedMeasurementOrFact can be used with Occurrence core, for example to make use of the added identifier fields.

Thanks @pieterprovoost - I wasn't aware of that use.

tobiasgf commented 1 year ago

I added some explanation based on the above to the revised version.