HakaiInstitute / hakai-datasets

Hakai Datasets that are going into https://catalogue.hakai.org/erddap/

Add DOI attribute to CTD Profile and Chlorophyll Datasets #72

Closed JessyBarrette closed 2 years ago

JessyBarrette commented 2 years ago

Description

Objective: This PR defines and discusses the approach Hakai uses to represent the DOI associated with a dataset in ERDDAP. The present exercise covers two Hakai research datasets (https://github.com/HakaiInstitute/hakai-datasets/issues/32 and https://github.com/HakaiInstitute/hakai-datasets/issues/8).

DOI landing page: As agreed, each individual DOI will link to the CKAN Metadata record associated with the ERDDAP datasets.

This all relates to the discussion in issue #70.

Conventions

Within CIOOS - ONC

As mentioned by @Br-Johnson, ONC started using the DOI attribute with the full URL (e.g. https://doi.org/10.21966/wsvt-ew96), based on some discussion within the ERDDAP group (see here).

    <att name="DOI">https://doi.org/10.21966/wsvt-ew96</att>

The capitalized term DOI may not be appropriate (most conventions use all-lowercase attribute names), and although having the full link is handy in ERDDAP, I haven't seen any other examples that use it. Most uses give just the identifier part (e.g. 10.21966/wsvt-ew96). This is a small detail, but it can become annoying once it starts being used widely.

CF Convention

An issue suggesting the doi attribute without the proxy part (e.g. 10.21966/wsvt-ew96) was posted a few years ago (2019) on the CF Conventions GitHub repository (see here). Unfortunately, the long discussion never reached an agreement and the issue remains open to this date. A few ERDDAP groups are nevertheless following this suggested convention:

    <att name="doi">10.21966/wsvt-ew96</att>
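
Since both forms (full link and bare identifier) are in circulation, a small normalization step can convert one into the other. The following is only a minimal sketch: the helper names `doi_identifier` and `doi_url` and the list of recognized prefixes are my own assumptions, not part of ERDDAP, CF, or ACDD.

```python
# Hypothetical helpers for illustration; not part of ERDDAP, CF, or ACDD.
# Assumed list of prefixes that may precede the bare identifier.
DOI_PREFIXES = (
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
    "doi:",
)


def doi_identifier(value: str) -> str:
    """Return the bare DOI (e.g. 10.21966/wsvt-ew96) from a link or bare DOI."""
    text = value.strip()
    lowered = text.lower()
    for prefix in DOI_PREFIXES:
        if lowered.startswith(prefix):
            # Drop the proxy part and keep the identifier unchanged.
            return text[len(prefix):]
    return text


def doi_url(value: str) -> str:
    """Return the full https://doi.org/ link from a link or bare DOI."""
    return "https://doi.org/" + doi_identifier(value)
```

With this, `doi_identifier("https://doi.org/10.21966/wsvt-ew96")` gives the value used in the `doi` attribute, and `doi_url("10.21966/wsvt-ew96")` gives the resolvable link.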

ACDD Convention 1.3

ACDD has two RECOMMENDED attributes related to this topic:

| Attribute | Description |
| --- | --- |
| id | An identifier for the data set, provided by and unique within its naming authority. The combination of the "naming authority" and the "id" should be globally unique, but the id can be globally unique by itself also. IDs can be URLs, URNs, DOIs, meaningful text strings, a local key, or any other unique string of characters. The id should not include white space characters. |
| naming_authority | The organization that provides the initial id (see above) for the dataset. The naming authority should be uniquely specified by this attribute. We recommend using reverse-DNS naming for the naming authority; URIs are also acceptable. Example: 'edu.ucar.unidata'. |

I haven't yet found any example of these attributes being used for a DOI, but I would suggest the following:

    <att name="id">10.21966/wsvt-ew96</att>
    <att name="naming_authority">org.doi</att>
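
Because ACDD asks that the id contain no white space, a quick check before writing the attribute may help catch a pasted link or a stray space. This is a rough sketch only: the function name `is_bare_doi` is hypothetical and the pattern is a deliberately loose assumption, not the official DOI syntax.

```python
import re

# Loose, assumed pattern: "10.", a 4-9 digit registrant code, "/", then a
# non-empty suffix without whitespace. It is not the official DOI grammar.
DOI_PATTERN = re.compile(r"10\.\d{4,9}/\S+")


def is_bare_doi(value: str) -> bool:
    """Return True if value looks like a bare DOI with no whitespace."""
    return DOI_PATTERN.fullmatch(value) is not None
```

A bare identifier such as `10.21966/wsvt-ew96` passes, while a full `https://doi.org/...` link or a value containing a space does not.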

Suggestion Used by Hakai

Based on everything presented above, I would suggest using the following standards:

Finally, in the ERDDAP datasets.xml, these would be represented as:

    <att name="infoUrl">https://doi.org/10.21966/wsvt-ew96</att>
    <att name="doi">10.21966/wsvt-ew96</att>
    <att name="id">10.21966/wsvt-ew96</att>
    <att name="naming_authority">org.doi</att>
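
To avoid typing the same DOI four times per dataset, these attributes could be generated from a single value. The sketch below uses only Python's standard library; the function names `doi_attributes` and `render_doi_attributes` are hypothetical and this is not how the conversion tool is known to work. Note that ERDDAP spells the link attribute `infoUrl`, which is the spelling assumed here.

```python
import xml.etree.ElementTree as ET
from typing import List


def doi_attributes(doi: str) -> List[ET.Element]:
    """Build the four <att> elements for a bare DOI such as 10.21966/wsvt-ew96."""
    values = {
        "infoUrl": f"https://doi.org/{doi}",  # assumed ERDDAP spelling: infoUrl
        "doi": doi,
        "id": doi,
        "naming_authority": "org.doi",
    }
    elements = []
    for name, text in values.items():
        att = ET.Element("att", {"name": name})
        att.text = text
        elements.append(att)
    return elements


def render_doi_attributes(doi: str) -> str:
    """Serialize the <att> elements, one per line, for pasting into datasets.xml."""
    return "\n".join(
        ET.tostring(att, encoding="unicode") for att in doi_attributes(doi)
    )
```

Calling `render_doi_attributes("10.21966/wsvt-ew96")` returns the four lines as a single string, so changing the DOI in one place keeps all attributes consistent.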

@raytula @pramod-thupaki @timvdstap @Br-Johnson Let us know your thoughts on the subject!

JessyBarrette commented 2 years ago

I will merge this, and we can review the standard method for presenting DOIs within an ERDDAP dataset as part of the CIOOS metadata form YAML-to-ERDDAP XML conversion tool here: https://github.com/cioos-siooc/cioos-yaml-to-erddap/pull/3