ckan / ideas

[DEPRECATED] Use the main CKAN repo Discussions instead:
https://github.com/ckan/ckan/discussions

DOI Support - Ability to assign DOIs to datasets #6

Open rufuspollock opened 10 years ago

rufuspollock commented 10 years ago

Support for assigning DOIs to datasets in a CKAN instance and displaying this appropriately to encourage use in citation.

Original item (with lots of discussion) at: https://trello.com/c/dlzGQGdH/98-discovery-ability-to-assign-dois-datacite-to-datasets

nigelbabu commented 10 years ago

Sorry but what is a DOI? I skimmed through the discussion on the original item and that didn't help much.

rossjones commented 10 years ago

@nigelbabu tldr: Digital Object Identifier - used by academics in citations - want each dataset to have a DOI to make it easier to cite sources. see https://en.wikipedia.org/wiki/Digital_object_identifier

rufuspollock commented 10 years ago

@nigelbabu this is a common need for any research-oriented datahub / data portal as people want a standard way to cite the datasets.

Aaron-M commented 10 years ago

Note, Landcare Research NZ is funding some OKFN support time to build this functionality.

rufuspollock commented 10 years ago

@Aaron-M Are there any user stories for this? Would be nice to have them set out in some detail ...

Aaron-M commented 10 years ago

Below are some user stories off the top of my head. There are issues where I'm not 100% clear on what I want or how to deal with them, e.g. granularity (how and whether to issue a DOI at the resource level and/or at the dataset level).

At the moment I ask the researcher to complete some metadata. A few fields are mandatory and basically align with CKAN's, with a couple of tweaks/additions; others are optional but good to record if the researcher is willing to provide them. I gather this in a Word document, copy and paste it into the EZID web interface to issue the DOI, and then enter the DOI into CKAN in an 'extra' field. Sometimes the DOI will be 'reserved' (not publicly available) until the dataset is ready to go live (be made public), in which case I go back in and activate the DOI (either using their API, or now with the revamped web interface).
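The reserve-then-activate workflow above could be scripted rather than done by copy and paste. The following is only a rough sketch of building the two request bodies, not a tested client. It assumes EZID accepts metadata as plain-text `name: value` lines (ANVL) and that the element names `_status`, `_target`, `datacite.title` and `datacite.creator` are right; those come from my reading of the EZID API docs and should be checked there. The function names and the `title`/`author` dataset keys are made up for illustration, and no network call is made.

```python
def anvl_escape(value: str) -> str:
    """Percent-encode characters that would break a one-line entry."""
    # '%' must be escaped first so the escapes added below are not re-escaped.
    return (
        value.replace("%", "%25")
        .replace("\n", "%0A")
        .replace("\r", "%0D")
    )


def build_anvl(metadata: dict) -> str:
    """Serialise a metadata dict as 'name: value' lines."""
    lines = []
    for name, value in metadata.items():
        # ':' separates name from value, so it is also escaped in names.
        safe_name = anvl_escape(name).replace(":", "%3A")
        lines.append(f"{safe_name}: {anvl_escape(value)}")
    return "\n".join(lines)


def reserved_request(dataset: dict, landing_url: str) -> str:
    """Body for creating a DOI that is reserved (not yet public)."""
    return build_anvl({
        "_status": "reserved",                  # assumed EZID status value
        "_target": landing_url,                 # where the DOI should resolve
        "datacite.title": dataset["title"],     # hypothetical dataset keys
        "datacite.creator": dataset["author"],
    })


def activation_request() -> str:
    """Body for making a previously reserved DOI public."""
    return build_anvl({"_status": "public"})    # assumed EZID status value


if __name__ == "__main__":
    dataset = {"title": "Soil cores: 2014", "author": "A. Researcher"}
    print(reserved_request(dataset, "https://example.org/dataset/soil-cores"))
    print("---")
    print(activation_request())
```

Keeping the body construction separate from the HTTP call means the metadata mapping can be reviewed (or approved by an admin) before anything is sent to the minting service.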

Example user stories:

As a researcher I want a DOI for my data so I can cite it in my paper, and get credit/acknowledgement for it (when used/cited by others).

As a Data Repository Administrator I want DOI requests to come to me (or designated Admin(s)) for approval so I can ensure the metadata required by the DOI issuing agency is provided and of acceptable quality before allocating a DOI.

As a Data Repository Administrator I want an easy way to submit the metadata provided with a dataset (or resource) to the DOI minting 'engine' so I don't need to retype or copy/paste.

As a Data Repository Administrator I want to be able to assign a DOI at the Dataset or the Resource level, depending on the nature of the data and the DOI request. [This needs some thinking]

As a Data Repository Administrator I want to be notified if a dataset or resource is changed after a DOI has been assigned. AND/OR I want to be able to lock the dataset/resource so it cannot be altered (or deleted) without my permission (or that of a nominated Admin) after a DOI has been assigned.

makkus commented 9 years ago

Is anybody currently working on this? I've seen the ckanext-doi project created by NHM, but that seems to only support the Datacite API. We are using EZID. Might not be too hard to add EZID support for that, so I'll give it a go. Just wondering whether somebody already tried...

florianm commented 9 years ago

I've spoken to the Australian National Data Service (ANDS) who are willing to offer DOI minting. I can't promise it'll be free of charge but for our use case (WA govt data) it might be.

The discussion point re ANDS would be policy of minting DOIs (per dataset, per resource, per resource revision?) and cost if any.

The remaining task is to extend ckanext-doi so that it can use ANDS as well as DataCite (service and credentials via config). That's on my wish list for when I have some play time; happy to collaborate and communicate though.

Getting a citation metric back shown on datasets with a DOI would be a nice to have. Surely there's an API we could ping from the package_read template.

Aaron-M commented 9 years ago

We are working with the CKAN developers (contact Jo Barratt) at present on a DOI module that works with the EZID service (which has a good API, and which we are signed up to issue DOIs through). We are setting it up and will test and install it on our system in due course, but the idea would be that other users would need to have their own EZID account in order to issue their own DOIs. ANDS providing that service for the whole of Australian Research Orgs is a slightly different use case, but the underlying functionality we are working on could be shared?

makkus commented 9 years ago

Yes, that'd be good. I was thinking along the lines of having a generic '(persistent)Id' base-plugin, which could be extended to connect to different 'pid' providers (EZID, datacite, a random handle server, maybe even a 'normal' url or bit.ly or similar -- even though of course a 'normal' url would not be really persistent, but it could be useful anyway).
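To make the base-plugin idea concrete, here is a minimal sketch in plain Python. It is not tied to CKAN's plugin API, and every name in it (`IdProvider`, `ExternalIdRegistry`, `PlaceholderDoiProvider`, the `external_id` fields, the `10.5072` prefix) is hypothetical. The DOI provider only fabricates a DOI-shaped string locally; a real one would call EZID, DataCite or a handle server.

```python
from abc import ABC, abstractmethod
import hashlib


class IdProvider(ABC):
    """Interface each id-type-specific plugin would implement."""

    name: str  # short type label, e.g. "doi" or "url"

    @abstractmethod
    def mint(self, resource_url: str) -> str:
        """Return an identifier for the given resource URL."""


class PlainUrlProvider(IdProvider):
    """'Normal' URL as identifier: not persistent, but sometimes useful."""

    name = "url"

    def mint(self, resource_url: str) -> str:
        return resource_url


class PlaceholderDoiProvider(IdProvider):
    """Stand-in for a real DOI service; builds a DOI-like string locally."""

    name = "doi"

    def __init__(self, prefix: str):
        self.prefix = prefix

    def mint(self, resource_url: str) -> str:
        # Deterministic suffix so the example is reproducible.
        suffix = hashlib.sha1(resource_url.encode("utf-8")).hexdigest()[:8]
        return f"{self.prefix}/{suffix}"


class ExternalIdRegistry:
    """Base component: knows the providers, not how any of them mints."""

    def __init__(self):
        self._providers = {}

    def register(self, provider: IdProvider) -> None:
        self._providers[provider.name] = provider

    def assign(self, resource: dict, provider_name: str) -> dict:
        provider = self._providers[provider_name]
        updated = dict(resource)  # leave the caller's dict untouched
        updated["external_id"] = provider.mint(resource["url"])
        updated["external_id_type"] = provider.name
        return updated


if __name__ == "__main__":
    registry = ExternalIdRegistry()
    registry.register(PlainUrlProvider())
    registry.register(PlaceholderDoiProvider(prefix="10.5072"))
    resource = {"url": "https://example.org/dataset/soil/resource/1"}
    print(registry.assign(resource, "doi"))
```

The point of the split is that adding another provider means writing one small class and registering it, without touching the code that stores the identifier on the resource.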

grhammo commented 8 years ago

Hi all, any updates on this one? My organisation would really like to provide a DOI for each dataset, so that we can ensure research citation is represented accurately.

makkus commented 8 years ago

My colleague @neon-ninja is now working on a plugin to add EZID DOIs to CKAN resources:

https://github.com/UoA-eResearch/ckanext-ezid

Actually, it's more than one plugin. There is a base one ( https://github.com/UoA-eResearch/ckanext-external_id ) that allows for generic ids to be added to a resource, and which needs to be extended by id-type-specific plugins like the one above. There is also an example implementation of such a plugin that can do url-shortening via goo.gl: https://github.com/UoA-eResearch/ckanext-googl

Feel free to contribute, or develop your own doi-provider specific plugin (if you are not using EZID).

brew commented 8 years ago

Great to see some work on this. We (Open Knowledge) have also been working on an EZID based DOI extension for datasets: https://github.com/okfn/ckanext-doi .

It provides an additional form field for automatically generating a DOI for a dataset and submitting it to EZID when the dataset is published. Alternatively, the DOI can be entered manually. There's a setting to define which roles within an organization can publish a DOI, so for example it can be restricted to org admins only. There's also support for selecting from a list of predefined prefixes (shoulders).

Psykar commented 8 years ago

For those in Australia, I've just uploaded an extension to work with ANDS for minting DOIs:

http://pypi.python.org/pypi/ckanext-ands

https://github.com/Psykar/ckanext-ands