Open rufuspollock opened 10 years ago
Sorry but what is a DOI? I skimmed through the discussion on the original item and that didn't help much.
@nigelbabu tldr: Digital Object Identifier - used by academics in citations - want each dataset to have a DOI to make it easier to cite sources. see https://en.wikipedia.org/wiki/Digital_object_identifier
@nigelbabu this is a common need for any research-oriented datahub / data portal as people want a standard way to cite the datasets.
Note, Landcare Research NZ is funding some OKFN support time to build this functionality.
@Aaron-M Are there any user stories for this? Would be nice to have them set out in some detail ...
Below are some user stories off the top of my head. There are issues I'm not 100% clear on, or on how to deal with, e.g. granularity: how and whether to issue a DOI at the resource level and/or at the dataset level.
At the moment I ask the researcher to complete some metadata. A few fields are mandatory and basically align with CKAN (with a couple of tweaks/additions); others are optional but good to record if the researcher is willing to provide them. I gather this in a Word document, copy and paste it into the EZID web interface to issue the DOI, then enter the DOI into CKAN in an 'extra' field. Sometimes the DOI will be 'reserved' (not publicly available) until the dataset is ready to go live (be made public), in which case I go back in and activate the DOI (either using their API, or now with the revamped web interface).
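To illustrate what replacing the copy-and-paste step might look like, here is a minimal sketch that turns a CKAN-style dataset dict into the plain-text "name: value" (ANVL) request body that, as far as I understand it, the EZID API accepts. The function names, the dataset keys used (`title`, `author`, `metadata_created`), and the field mapping are assumptions for illustration only; the block makes no network call and is not the code of any existing extension.

```python
import re


def anvl_escape_name(text):
    # In ANVL, '%', ':' and line breaks in a field name are percent-encoded.
    return re.sub(r"[%:\r\n]", lambda m: "%%%02X" % ord(m.group(0)), text)


def anvl_escape_value(text):
    # In values only '%' and line breaks need encoding.
    return re.sub(r"[%\r\n]", lambda m: "%%%02X" % ord(m.group(0)), text)


def build_ezid_body(dataset, publisher, target_url, status="reserved"):
    """Build a request body for a DOI, 'reserved' by default.

    `dataset` is assumed to look like a CKAN package dict; the keys read
    here are an assumption about how the mandatory fields would map.
    """
    fields = {
        "_status": status,          # "reserved" until the dataset goes live
        "_target": target_url,      # landing page the DOI should resolve to
        "_profile": "datacite",
        "datacite.title": dataset["title"],
        "datacite.creator": dataset["author"],
        "datacite.publisher": publisher,
        # Assumes an ISO-8601 timestamp such as "2014-05-01T10:00:00".
        "datacite.publicationyear": dataset["metadata_created"][:4],
    }
    return "\n".join(
        f"{anvl_escape_name(k)}: {anvl_escape_value(v)}" for k, v in fields.items()
    )
```

Activating a reserved DOI later would then, under the same assumptions, be a much smaller update that sends only `_status: public` for the existing identifier.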
Example user stories: As a researcher I want a DOI for my data so I can cite it in my paper, and get credit/acknowledgement for it (when used/cited by others).
As a Data Repository Administrator I want DOI requests to come to me (or designated Admin(s)) for approval so I can ensure the metadata required by the DOI issuing agency is provided and of acceptable quality before allocating a DOI.
As a Data Repository Administrator I want an easy way to submit the metadata provided with a dataset (or resource) to the DOI minting 'engine' so I don't need to retype or copy/paste.
As a Data Repository Administrator I want to be able to assign a DOI at the Dataset or the Resource level, depending on the nature of the data and the DOI request. [This needs some thinking]
As a Data Repository Administrator I want to be notified if a dataset or resource is changed after a DOI has been assigned. AND/OR I want to be able to lock the dataset/resource so it cannot be altered (or deleted) without my permission (or a nominated Admin) after a DOI has been assigned.
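The last story (noticing changes after a DOI has been assigned) could be approached by storing a fingerprint of the citation-relevant metadata when the DOI is minted and comparing against it later. This is only a sketch: the choice of fields and the function names are hypothetical, and how the notification or lock is then triggered is left open.

```python
import hashlib
import json

# Hypothetical choice of fields whose change should alert the administrator.
CITATION_FIELDS = ("title", "author", "resources")


def fingerprint(dataset, fields=CITATION_FIELDS):
    """Return a stable SHA-256 hash of the selected dataset fields."""
    subset = {key: dataset.get(key) for key in fields}
    # sort_keys gives the same JSON text regardless of dict ordering.
    canonical = json.dumps(subset, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def changed_since_doi(dataset, stored_fingerprint):
    """True if the dataset no longer matches the fingerprint stored at mint time."""
    return fingerprint(dataset) != stored_fingerprint
```

The stored fingerprint could live next to the DOI itself, for example in the same 'extra' field mechanism, so that a check on every update has something to compare against.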
Is anybody currently working on this? I've seen the ckanext-doi project created by NHM, but that seems to only support the Datacite API. We are using EZID. Might not be too hard to add EZID support for that, so I'll give it a go. Just wondering whether somebody already tried...
I've spoken to the Australian National Data Service (ANDS) who are willing to offer DOI minting. I can't promise it'll be free of charge but for our use case (WA govt data) it might be.
The discussion point re ANDS would be the policy for minting DOIs (per dataset, per resource, per resource revision?) and the cost, if any.
The remaining task is to extend ckanext-doi so that it can use ANDS as well as DataCite (service and credentials via config). It's on my wish list for as soon as I have some playtime; happy to collaborate and communicate though.
Getting a citation metric back shown on datasets with a DOI would be a nice to have. Surely there's an API we could ping from the package_read template.
We are working with the CKAN developers (contact Jo Barratt) at present on a DOI module that works with the EZID service (which has a good API, and which we are signed up with to issue DOIs). We are setting it up and will test and install it on our system in due course, but the idea would be that other users would need their own EZID account in order to issue their own DOIs. ANDS providing that service for the whole of the Australian research organisations is a slightly different use case, but perhaps the underlying functionality we are working on could be shared?
Yes, that'd be good. I was thinking along the lines of having a generic '(persistent)Id' base-plugin, which could be extended to connect to different 'pid' providers (EZID, datacite, a random handle server, maybe even a 'normal' url or bit.ly or similar -- even though of course a 'normal' url would not be really persistent, but it could be useful anyway).
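A rough sketch of that idea, purely for illustration: a small base class that each 'pid' provider extends, plus a registry keyed by id type. None of these names come from CKAN or from an existing extension, and a real EZID or DataCite provider would call the remote service inside `mint` instead of building a string.

```python
from abc import ABC, abstractmethod


class IdProvider(ABC):
    """Hypothetical base class for persistent-id providers."""

    id_type = None  # e.g. "doi", "handle", "url"

    @abstractmethod
    def mint(self, resource):
        """Return an identifier string for the given resource dict."""


class UrlProvider(IdProvider):
    """Simplest provider: a 'normal' (not truly persistent) URL."""

    id_type = "url"

    def __init__(self, base_url):
        self.base_url = base_url.rstrip("/")

    def mint(self, resource):
        return f"{self.base_url}/resource/{resource['id']}"


PROVIDERS = {}


def register(provider):
    """Make a provider instance available under its id_type."""
    PROVIDERS[provider.id_type] = provider


def mint_id(id_type, resource):
    """Dispatch to whichever provider is registered for id_type."""
    if id_type not in PROVIDERS:
        raise KeyError(f"no provider registered for {id_type!r}")
    return PROVIDERS[id_type].mint(resource)
```

With this shape, supporting another service means adding one subclass and registering it; the code that asks for an id does not need to know which service answers.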
Hi all, any updates on this one? My organisation would really like to provide a DOI for each dataset, so that we can ensure research citation is represented accurately.
My colleague @neon-ninja is now working on a plugin to add EZID DOIs to CKAN resources:
https://github.com/UoA-eResearch/ckanext-ezid
Actually, it's more than one plugin. There is a base one ( https://github.com/UoA-eResearch/ckanext-external_id ) that allows for generic ids to be added to a resource, and which needs to be extended by id-type-specific plugins like the one above. There is also an example implementation of such a plugin that can do url-shortening via goo.gl: https://github.com/UoA-eResearch/ckanext-googl
Feel free to contribute, or develop your own doi-provider specific plugin (if you are not using EZID).
Great to see some work on this. We (Open Knowledge) have also been working on an EZID based DOI extension for datasets: https://github.com/okfn/ckanext-doi .
It provides an additional form field for automatically generating a DOI for a dataset and submitting it to EZID when the dataset is published. Alternatively, the DOI can be entered manually. There's a setting to define which roles within an organization can publish a DOI, so for example it can be restricted to org admins only. There's also support for selecting from a list of predefined prefixes (shoulders).
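As a sketch of how the two settings described above might behave (this is not the extension's actual code or config format), the check below allows publishing only for configured organization roles and only with a shoulder from the predefined list. The constant and function names are hypothetical; `doi:10.5072/FK2` is used purely as a placeholder value.

```python
# Hypothetical settings; a real deployment would read these from config.
ALLOWED_ROLES = {"admin"}               # org roles allowed to publish a DOI
SHOULDERS = ["doi:10.5072/FK2"]         # predefined prefixes (shoulders)


def can_publish_doi(user_role, allowed_roles=ALLOWED_ROLES):
    """True if the user's role in the organization may publish a DOI."""
    return user_role in allowed_roles


def choose_shoulder(requested, shoulders=SHOULDERS):
    """Return the requested shoulder, rejecting anything not predefined."""
    if requested not in shoulders:
        raise ValueError(f"shoulder {requested!r} is not in the configured list")
    return requested
```

Keeping both checks in one place means the form can hide the DOI field for users who fail `can_publish_doi` and offer only the configured shoulders to the rest.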
For those in Australia, I've just uploaded an extension to work with ANDS for minting DOIs
http://pypi.python.org/pypi/ckanext-ands https://github.com/Psykar/ckanext-ands
Support for assigning DOIs to datasets in a CKAN instance and displaying this appropriately to encourage use in citation.
Colophon
Original item (with lots of discussion) at: https://trello.com/c/dlzGQGdH/98-discovery-ability-to-assign-dois-datacite-to-datasets