HakaiInstitute / hakai-datasets

Hakai Datasets that are going into https://catalogue.hakai.org/erddap/

Generate DOI for some Hakai Research Datasets #70

Closed JessyBarrette closed 2 years ago

JessyBarrette commented 2 years ago

DOI Generation for some Hakai Research Datasets

Issue description

The following Hakai datasets will need to have a DOI generated. Some of them are not quite yet available within the Hakai CKAN system and would also need to be made available there:

| Dataset | Hakai Metadata Record | CKAN | DOI |
| --- | --- | --- | --- |
| Hakai Nutrient Research | Metadata | CKAN | |
| Hakai Chlorophyll Research | Metadata | CKAN | https://doi.org/10.21966/wsvt-ew96 |
| Hakai Water Properties Profiles Research | Metadata | CKAN | https://doi.org/10.21966/6cz5-6d70 |
| Dosser et al. (2021) Hakai Nutrients | Metadata | CKAN | https://doi.org/10.21966/j3j5-wt70 |

** This table will be updated as the different components become available.

JessyBarrette commented 2 years ago

@timvdstap Once we have the different datasets available within the Hakai CKAN website, it would be good to generate a DOI for each individual dataset.

JessyBarrette commented 2 years ago

Hi @timvdstap, I just added the Dosser et al. dataset to the table above, which now lists all the Hakai datasets. Each of them would need a DOI attached to its Hakai CKAN record. Let me know when this is available!

timvdstap commented 2 years ago

Hey @JessyBarrette -- I'll pass this request on to @Br-Johnson if that's OK. Brett's more experienced with minting DOIs!

@Br-Johnson any chance you can help Jessy out with this?

Br-Johnson commented 2 years ago

No problem. The CKAN record for the nutrient dataset says it's deleted...

Just wondering why the titles of the CKAN datasets end in ',Research'? I noticed there are some ',Provisional' records as well. What is the strategy there?

I will mint DOIs for Chlorophyll, Water Properties, and Dosser et al. now.

Br-Johnson commented 2 years ago

OK, I've updated the DOI table above with the appropriate DOIs

JessyBarrette commented 2 years ago

Awesome thanks Brett!

On Mon, Dec 13, 2021 at 7:31 PM Brett Johnson wrote:

OK, I've updated the DOI table above with the appropriate DOIs


JessyBarrette commented 2 years ago

Thanks, the DOIs have now been added to the CKAN metadata records. I will also update the ERDDAP datasets to include the DOIs by following the ACDD convention (the "id" and "naming_authority" global attributes).
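For illustration, here is a minimal Python sketch of how a DOI URL from the table above could be split into the two ACDD global attributes. The function name is hypothetical, and using "org.doi" as the `naming_authority` value is an assumption (ACDD recommends reverse-DNS naming but does not prescribe a value for DOIs); it is not necessarily what ends up in the Hakai ERDDAP datasets.

```python
from urllib.parse import urlparse

# Assumption: "org.doi" is used as the reverse-DNS naming authority for DOIs.
DOI_NAMING_AUTHORITY = "org.doi"


def acdd_identifier_attributes(doi_url: str) -> dict[str, str]:
    """Split a DOI URL into ACDD `id` and `naming_authority` attributes."""
    parsed = urlparse(doi_url)
    if parsed.netloc not in ("doi.org", "dx.doi.org"):
        raise ValueError(f"not a DOI URL: {doi_url!r}")
    # Drop the leading slash so only the bare DOI (prefix/suffix) remains.
    bare_doi = parsed.path.lstrip("/")
    if not bare_doi.startswith("10."):
        raise ValueError(f"unexpected DOI prefix in {doi_url!r}")
    return {"id": bare_doi, "naming_authority": DOI_NAMING_AUTHORITY}


# Example with the Hakai Chlorophyll Research DOI from the table above.
print(acdd_identifier_attributes("https://doi.org/10.21966/wsvt-ew96"))
```

Together, `naming_authority` and `id` are meant to be globally unique, which a DOI already guarantees.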

JessyBarrette commented 2 years ago

Let's track the DOI update to the ERDDAP datasets here:

Br-Johnson commented 2 years ago

We should consider how we implement DOIs in ERDDAP. Ocean Networks Canada and Scripps use a non-standard DOI global attribute, partly because the datasets already had a separate id that wasn't a DOI. See ONC example here and Scripps example here.

Here is an ERDDAP google groups discussion on the matter.

I've done what ACDD recommends (as you point out) with IYS datasets on ERDDAP as well, but I suppose there is no harm in taking both approaches and including a DOI global attribute as well as 'id' and 'naming_authority'.
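As a rough sketch of what "both approaches" could look like in practice, the Python below renders an `<addAttributes>` fragment of the kind used in an ERDDAP datasets.xml, carrying the DOI under 'id' and 'naming_authority' and again under a separate attribute. The function name, the lowercase attribute name "doi", and the "org.doi" value are assumptions for illustration only; the exact attribute name ONC and Scripps use should be checked against their examples.

```python
import xml.etree.ElementTree as ET


def doi_add_attributes(doi: str, naming_authority: str = "org.doi") -> str:
    """Render an <addAttributes> fragment holding the DOI in three attributes."""
    add_attributes = ET.Element("addAttributes")
    attributes = (
        ("id", doi),                             # ACDD identifier
        ("naming_authority", naming_authority),  # ACDD authority (assumed value)
        ("doi", doi),                            # non-standard attribute (assumed name)
    )
    for name, value in attributes:
        att = ET.SubElement(add_attributes, "att", {"name": name})
        att.text = value
    ET.indent(add_attributes)  # pretty-print; requires Python 3.9+
    return ET.tostring(add_attributes, encoding="unicode")


# Example with the Dosser et al. (2021) DOI from the table above.
print(doi_add_attributes("10.21966/j3j5-wt70"))
```

The fragment would still need to be merged by hand into the matching `<dataset>` entry.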

Thoughts @Pramod? CIOOS Data Management Task Team is discussing how to implement DOIs for CIOOS generally and we'll have to decide on a standard for how to include DOIs...

raytula commented 2 years ago

Thanks for adding this info, @Br-Johnson. Interesting to see where ONC was actively asking about this. If helpful, we could follow up with Reyna and/or Chantel Ridsdale from ONC, as they have also thought about this quite a bit. It makes sense to me to have the DOI link to the metadata record, as we are doing; ONC may be taking a different approach, though. Note that Reyna/ONC may be feeling a bit over-extended these days, so we should be careful and reduce parallel threads of activity with ONC. @pramod-thupaki has most recently interacted with Reyna/ONC, so he can perhaps comment on when/if/how to best engage them about this particular topic.

Br-Johnson commented 2 years ago

OK, sounds good @raytula. Just to be clear, I'm only talking about the minor detail of which global attribute the DOI is located under in the ERDDAP .XML file. The only use case I can think of that this would affect is searching an ERDDAP server for a dataset by a specific DOI: you might have to look in multiple attributes because of the discrepancy I noted above. But I'm not sure when or if anyone would even need to make such a query.
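To make the "look in multiple attributes" point concrete, here is a small Python sketch that pulls the global attributes out of an ERDDAP info response (the JSON table served at `/erddap/info/<datasetID>/index.json`) and checks more than one attribute for a DOI. The helper names, the candidate attribute names ("doi", then "id"), and the sample rows are assumptions for illustration; the sample is hand-written, not a real server response.

```python
import re
from typing import Optional

# Strip a resolver URL or "doi:" prefix so values can be compared as bare DOIs.
DOI_PREFIX = re.compile(r"^(?:https?://(?:dx\.)?doi\.org/|doi:)", re.IGNORECASE)


def global_attributes(info_table: dict) -> dict:
    """Collect NC_GLOBAL attributes from an ERDDAP info JSON table."""
    table = info_table["table"]
    cols = table["columnNames"]
    i_type, i_var, i_name, i_value = (
        cols.index(c)
        for c in ("Row Type", "Variable Name", "Attribute Name", "Value")
    )
    return {
        row[i_name]: row[i_value]
        for row in table["rows"]
        if row[i_type] == "attribute" and row[i_var] == "NC_GLOBAL"
    }


def find_doi(attrs: dict, candidates=("doi", "id")) -> Optional[str]:
    """Return the first candidate attribute value that looks like a DOI."""
    for name in candidates:
        value = DOI_PREFIX.sub("", str(attrs.get(name, "")).strip())
        if value.startswith("10."):
            return value
    return None


# Hand-written sample shaped like an info response (hypothetical values).
sample = {
    "table": {
        "columnNames": ["Row Type", "Variable Name", "Attribute Name", "Data Type", "Value"],
        "rows": [
            ["attribute", "NC_GLOBAL", "id", "String", "10.21966/wsvt-ew96"],
            ["attribute", "NC_GLOBAL", "naming_authority", "String", "org.doi"],
            ["attribute", "time", "units", "String", "seconds since 1970-01-01T00:00:00Z"],
        ],
    }
}
print(find_doi(global_attributes(sample)))
```

A client written this way would not care which of the two conventions a given server follows, which is one argument for it being a minor detail.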

Where the DOI lands is indeed something that ONC does differently, though, which I think you know. ONC has a 'Dataset Landing Page' with basic metadata about the DOI, whereas we, as you know, point to CKAN as our landing page.

JessyBarrette commented 2 years ago

I think having a global attribute DOI is alright; it doesn't follow any conventions but also doesn't break any.

If we need to follow a convention like ACDD, then populating the id and naming_authority attributes would be the standard way to follow the convention guidelines. We can certainly have the DOI in multiple fields. ACDD also doesn't specify which naming_authority to use. Hakai would just have one, so it's easy; other groups may want to prioritize.

Linking to the CKAN record makes sense, I think, if we interpret the CKAN record as the master hub and long-term repository for each dataset through its different versions and access methods (if ERDDAP is dropped in the future, that DOI should point to the newest and greatest ERDDAP 2.0 ...).

raytula commented 2 years ago

@Br-Johnson @JessyBarrette thanks for the additional info/feedback. I withdraw my suggestion to sync up with ONC about this. Happy to proceed in whatever way makes most sense to you guys. Yes, I view the CKAN record as the 'master hub', so it makes sense for users to land there.

pramod-thupaki commented 2 years ago

Apologies for the slow response, I have summarized my opinions/findings below ...

Re: DOI as global attribute on ERDDAP - This appears to follow best practices. As there is no conflict with standards, this seems the way to go. We need to see how this would work with versioned datasets (is this being discussed at the DMPTT?).

Re: DOI landing page - Using the CKAN record for this is consistent with our practices. Question: what about CKAN records that have multiple ERDDAP datasets (collections?)

Re: syncing with ONC - Doesn't seem necessary at this point. We should continue working with ONC and the rest of CIOOS on DMPTT ... connecting to individuals in ONC might be more productive if we have something specific we want them to comment on, in an offline manner (shared docs and such). The ONC wiki (here) has details on their DOI practices and future plans.

pramod-thupaki commented 2 years ago

Here is another resource we can use to help guide our plans - RDA Guidelines for Making Data Citable

Note - not light reading!