tdwg / camtrap-dp

Camera Trap Data Package (Camtrap DP)
https://camtrap-dp.tdwg.org
MIT License

Is camtrap-dp:taxonID a scientificNameID or taxonID? #340

Closed (peterdesmet closed this issue 1 year ago)

peterdesmet commented 1 year ago

The definition of Camtrap DP's taxonID is:

Identifier of the scientificName as defined in package.taxonomic.taxonID for that scientific name.

In package.taxonomic.taxonID the identifier has the following definition:

Unique identifier of the taxon according to the taxonomic reference list defined by taxonIDReference.

The term acts as a unique identifier for the following terms in taxonomy:

What is the appropriate Darwin Core equivalent of this term to include in occurrence data?

  1. https://dwc.tdwg.org/terms/#taxonID
  2. https://dwc.tdwg.org/terms/#dwc:scientificNameID
  3. https://dwc.tdwg.org/terms/#dwc:taxonConceptID

See also this discussion.

peterdesmet commented 1 year ago

This also affects the camera trap publication guide and camtraptor.

peterdesmet commented 1 year ago

@mdoering what would you advise here? If I can distill your comments in this discussion, you would advise to use:

Correct?

mdoering commented 1 year ago

As cpt:taxonID appears to define taxonomic values such as the classification, it is by definition a taxon identifier, not one for the name alone. I would therefore think https://dwc.tdwg.org/terms/#taxonID is the corresponding term in DwC.

The question then is which taxon identifiers exist that you can reuse. There are hardly any true taxon identifiers; most are name IDs, as I mentioned in the discussion above.

The identifier COL QLXL, when not used with a specific release, really is a name-based identifier. The classification of the name can change between versions of the COL checklist, but the name itself cannot. To reference the taxonomy of that name in COL unambiguously, you currently have to use a specific release, such as https://www.checklistbank.org/dataset/9923/taxon/QLXL, or COL:9923:QLXL as a simpler identifier that does not rely on a URL and resolution. COL is also working towards true stable taxon identifiers, but that will still take a while.
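
As a rough illustration of the two release-specific forms mentioned above, the following Python sketch converts between the compact `COL:9923:QLXL` form and the ChecklistBank URL. The class and function names (`ReleaseTaxonRef`, `parse_compact`) are hypothetical, and the sketch assumes the compact form is always `COL:<dataset key>:<taxon ID>`, which is inferred from the single example in this comment rather than from any published specification.

```python
from typing import NamedTuple

# Base URL taken from the example in this thread.
CHECKLISTBANK = "https://www.checklistbank.org"


class ReleaseTaxonRef(NamedTuple):
    """A taxon ID pinned to one specific COL release (hypothetical helper)."""

    dataset_key: str  # e.g. "9923", the key of a fixed release
    taxon_id: str     # e.g. "QLXL"

    def compact(self) -> str:
        # Compact form that does not rely on a URL and resolution.
        return f"COL:{self.dataset_key}:{self.taxon_id}"

    def url(self) -> str:
        # Resolvable form pointing at the taxon within that release.
        return f"{CHECKLISTBANK}/dataset/{self.dataset_key}/taxon/{self.taxon_id}"


def parse_compact(identifier: str) -> ReleaseTaxonRef:
    """Parse 'COL:<dataset key>:<taxon ID>' (format assumed from the example)."""
    parts = identifier.split(":")
    if len(parts) != 3 or parts[0] != "COL":
        raise ValueError(f"not a release-specific COL identifier: {identifier!r}")
    return ReleaseTaxonRef(dataset_key=parts[1], taxon_id=parts[2])


ref = parse_compact("COL:9923:QLXL")
print(ref.url())
```

A bare `QLXL` is rejected by `parse_compact` on purpose: without a release key it only identifies the name, which is the distinction made in this comment.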

mdoering commented 1 year ago

camtrap-dp NAME information:

camtrap-dp TAXON information:

peterdesmet commented 1 year ago

Thanks for the reply @mdoering.

  1. I follow your thinking that what we intend in Camtrap DP is a taxon identifier (and thus a dwc:taxonID). It acts as a link between the observation and the full taxonomic information in package.taxonomic, so we don't have to repeat the full taxonomic information for every observation record.
  2. I think we should update the definition of taxonID slightly from:

    Identifier of the scientificName as defined in package.taxonomic.taxonID for that scientific name.

    To:

    Identifier of the taxon of the scientific name. Foreign key to package.taxonomic.taxonID.

  3. While we use QLXL as the identifier, it is meant to be interpreted in combination with the required term taxonIDReference, which in the example dataset is https://www.checklistbank.org/dataset/3LR. @mdoering is that fine as a reference? Or should we use your suggested https://www.checklistbank.org/dataset/9923 instead? Note that users may populate this with any URL, so it won't always point to a specific version of the taxonomy.

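The foreign-key role described in the points above can be sketched in Python as follows. This is only an illustration, not how Camtrap DP or camtraptor implement it: the helper names (`index_taxa`, `resolve_taxon`), the `observationID` values, and the record contents are assumptions, and the scientific name is left as a placeholder.

```python
# Hypothetical, abbreviated records. Only the field names taxonID,
# taxonIDReference and scientificName are taken from this discussion.
taxonomic = [
    {
        "taxonID": "QLXL",
        "taxonIDReference": "https://www.checklistbank.org/dataset/COL2023",
        "scientificName": "<scientific name for QLXL>",  # placeholder
    },
]

observations = [
    {"observationID": "obs1", "taxonID": "QLXL"},
    {"observationID": "obs2", "taxonID": None},  # observation without a taxon
]


def index_taxa(taxa):
    """Build a taxonID -> taxon lookup, rejecting duplicate identifiers."""
    index = {}
    for taxon in taxa:
        taxon_id = taxon["taxonID"]
        if taxon_id in index:
            raise ValueError(f"duplicate taxonID in taxonomic list: {taxon_id}")
        index[taxon_id] = taxon
    return index


def resolve_taxon(observation, taxa_by_id):
    """Follow the observation's taxonID into the taxonomic list.

    Returns None when the observation has no taxonID, and raises KeyError
    when the taxonID is not present in the taxonomic list.
    """
    taxon_id = observation.get("taxonID")
    if taxon_id is None:
        return None
    if taxon_id not in taxa_by_id:
        raise KeyError(f"taxonID {taxon_id} not found in taxonomic list")
    return taxa_by_id[taxon_id]


taxa_by_id = index_taxa(taxonomic)
for obs in observations:
    taxon = resolve_taxon(obs, taxa_by_id)
    print(obs["observationID"], taxon["scientificName"] if taxon else None)
```

The identifier is only meaningful together with `taxonIDReference`, so the sketch keeps both on the taxon record and never looks at `taxonID` alone.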

mdoering commented 1 year ago

Agree with all of that, Peter. Just note that the URL to 3LR is effectively a redirect to the latest version, so the content behind it changes over time (monthly here). The one with a fixed dataset key does not. I guess it depends on your intention which one to use.

For long-term stable annual releases of COL we also provide URLs such as: https://www.checklistbank.org/dataset/COL2022

peterdesmet commented 1 year ago

Thanks Markus! I've made the necessary changes in #352. I will use https://www.checklistbank.org/dataset/COL2023 in the example dataset.