tdwg / dwc-qa

Public question and answer site for discussions about Darwin Core
Apache License 2.0

Question about the logical structure of DarwinCore classes and properties #189

Open herb52 opened 2 years ago

herb52 commented 2 years ago

Hello,

I am new to DarwinCore (DwC). I want to store DwC-specific data inside photos using the structures and tags (elements of a structure) of xmp-dwc supported by ExifTool. For xmp-dwc please see https://exiftool.org/TagNames/DarwinCore.html

I started with the "Darwin Core quick reference guide" at https://dwc.tdwg.org/terms/ and there I see two kinds of classes.

First, classes with properties listed for them, e.g. Record-level with the properties datasetID, collectionCode and many others. Here we have a one-to-one relationship between
.. these classes and the ExifTool structures, and
.. the properties of the classes and the elements/tags of the structures.
Looking at stored data, we always have the following structure tree:
DwC_class / ExifTool_structure
!
!--- property / tag: stored DwC-data

Second, classes without (explicitly given) properties. Here I miss the one-to-one relationship.

(A) Reading the "Darwin Core quick reference guide", my impression is that data for e.g. "PreservedSpecimen" or "MaterialCitation" is stored like
DwC_class: stored DwC-data
e.g.:
PreservedSpecimen: "A plant on an herbarium sheet. A cataloged lot of fish in a jar"
MaterialCitation: "A citation of a physical specimen from a scientific collection in ..."

(B) Looking at the ExifTool side, I see the following tree:
ExifTool_structure
!
!--- tag
e.g.:
PreservedSpecimen
!
!---MaterialSampleID: xxx-data
(MaterialCitation is missing at the moment).

Now my question for the "logical structure" is: Is (A) or (B) to be used in order to store DwC-data properly?

Maybe an example that shows one of the "classes without properties" would be helpful.
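To make the two readings concrete, here is a minimal Python sketch that models (A) and (B) as plain dictionaries. It is only an illustration of the question, not of how ExifTool or Darwin Core actually serialize data; the helper `is_structure` and the variable names are hypothetical, and the values are the placeholder strings used above.

```python
# Reading (A): the class name is itself a key that directly holds a data value.
reading_a = {
    "PreservedSpecimen": "A plant on an herbarium sheet. A cataloged lot of fish in a jar",
}

# Reading (B): the class is a container (structure) whose tags hold the data.
reading_b = {
    "PreservedSpecimen": {
        "MaterialSampleID": "xxx-data",  # placeholder value from the post above
    },
}


def is_structure(record, class_name):
    """Return True if class_name is stored as a container of tags (reading B)."""
    return isinstance(record.get(class_name), dict)


if __name__ == "__main__":
    print(is_structure(reading_a, "PreservedSpecimen"))
    print(is_structure(reading_b, "PreservedSpecimen"))
```

The question is then which of these two shapes a "class without properties" is meant to take.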

Remark: I asked this question also on tdwg-dwc https://github.com/tdwg/dwc/issues/414 , but without success. So I hope to get an answer/comment here.

Thanks for your comments/answers in advance.
Best regards,
herb

tucotuco commented 1 year ago

@herb52 Sorry for the delay. I would like to help you resolve this issue, but I do not understand it very well and this might benefit from a call. You can contact me by email to set that up.

herb52 commented 1 year ago

Hello Tucotuco,

Thanks for your reply. Yes, you are right: my question is too general in some points and too detailed in others, so it is confusing. Sorry. Please allow me to start from scratch again.

My question is about: How to store DarwinCore information/data inside still-images in XMP-format (using ExifTool to write it).

(1) My knowledge about DarwinCore is based on your internet page https://exiftool.org/TagNames/DarwinCore.html

(2) ExifTool is a Perl application to read and write metadata from/into images (or videos). ExifTool is created and maintained by Phil Harvey; it is able to manage Exif, IPTC-IIM and also XMP metadata. As part of XMP it can read/write DarwinCore metadata (with namespace xmp-dwc). These xmp-dwc metadata were defined many years ago, based on an example image file sent by a user to Phil Harvey (as far as I know). The currently supported tags are described at https://exiftool.org/TagNames/DarwinCore.html

(3) Let me explain my problem with the following example. On page "Darwin Core XML guide" https://dwc.tdwg.org/xml/ chapter 2.7.1 we find an example of a DarwinCore file (+) in XML

For DarwinCore class Event I created an XMP file with ExifTool:

<?xpacket begin='' id='W5M0MpCehiHzreSzNTczkc9d'?>
1949-09-02 525813 2 Sep 1949

(This XMP block will be written into an image file.)

Looking at the "Darwin Core quick reference guide" https://dwc.tdwg.org/terms/ "Event" is a DarwinCore class with many properties. I used 3 of these properties in my example above. In ExifTool this class "Event" is assigned to the structure "Event", and each property (e.g. "eventDate") is assigned to an element (here also "eventDate"). The XMP content is similar/identical to the content of the XML file. Therefore I think that ExifTool uses a proper format to store the data.

(4) But on the internet page we also find e.g. the class "PreservedSpecimen", and for this class no property is listed. In the XML file (+) the string "PreservedSpecimen" appears only as a "value", not as a "class": PreservedSpecimen

Now my question is: how do I represent this class inside the XMP file given above?
a) Is it also a structure? But then which element(s) does it have? example-value for...
b) Or is it a single tag (something like a structure without any element)? example-value for...
c) Or is it to be used only as a "value", as a word of a managed vocabulary?
d) Or what is it? (I don't know.)

Sorry that the explanation has become so long, but I hope it is clearer now.

Thanks for your comments and help in advance.
Best regards,
herb

bart-v commented 1 year ago

Currently it's mostly used as in answer c): "PreservedSpecimen" is a vocabulary term for the property "dwc:basisOfRecord", i.e. used in the "dwc:Event" Class. See https://rs.gbif.org/vocabulary/dwc/basis_of_record_2022-02-02.xml
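A minimal Python sketch of reading c), under stated assumptions: the class name appears only as the value of `basisOfRecord` and is checked against a controlled vocabulary, while a class with properties such as Event stays a structure. The term set below is only an illustrative subset built from names mentioned in this thread, not the full GBIF vocabulary; the record layout, the placement of `basisOfRecord` at the top level, and the helper `has_valid_basis_of_record` are hypothetical.

```python
# Illustrative subset only: names mentioned in this thread, NOT the full vocabulary.
BASIS_OF_RECORD_TERMS = {
    "PreservedSpecimen",
    "MaterialCitation",
    "MaterialSample",
    "Occurrence",
}

# Hypothetical record layout (an assumption, not an ExifTool or DwC format).
record = {
    # Reading c): the class name is used as a plain vocabulary value.
    "basisOfRecord": "PreservedSpecimen",
    # A class with properties is kept as a structure with elements.
    "Event": {
        "eventDate": "1949-09-02",  # assumed to be the eventDate value from the example
    },
}


def has_valid_basis_of_record(rec, vocabulary=BASIS_OF_RECORD_TERMS):
    """Return True if rec carries a basisOfRecord value found in the vocabulary."""
    return rec.get("basisOfRecord") in vocabulary


if __name__ == "__main__":
    print(has_valid_basis_of_record(record))
```

In this shape "PreservedSpecimen" never needs elements of its own, which matches its appearance as a bare value in the XML example.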

herb52 commented 1 year ago

Hello bart-v

Thanks for your reply. The vocabulary you mentioned contains the names of "classes" listed in the "Darwin Core quick reference guide": all the "classes" without properties, but also 2 classes that have properties, MaterialSample and Occurrence.

So I think it is not the class, but only names with identical strings.

You said: "Currently it's mostly used as answer c)". I agree, because I did not find any example XML file that uses "PreservedSpecimen" as a "class".

So what are "PreservedSpecimen" (and the others) designed for? To be used in the future?

Best regards herb

bart-v commented 1 year ago

Good question, I have no idea :)