NCEAS / metacat

Data repository software that helps researchers preserve, share, and discover data
https://knb.ecoinformatics.org/software/metacat
GNU General Public License v2.0

support multiple DOI shoulders #1514

Closed mbjones closed 3 years ago

mbjones commented 3 years ago

Metacat deployments currently support only a single DOI shoulder. However, because an EZID shoulder can be configured for either mint() or create() operations, but not both, some Metacat deployments may want to use multiple shoulders: one for minting DOIs (e.g., through MetacatUI with generateIdentifier()) and one for creating assigned DOIs (e.g., through an external client such as R). In both cases, we would like Metacat to detect changes (e.g., to metadata and system metadata) and update the external DOI service (EZID or DataCite) with the new information when needed. This typically happens when an object is first created, but it could also happen when a DOI is assigned as a SID and the object is updated, which requires updating the DOI metadata at that time. This feature has been requested by PISCO.

Looking at the code, we only support a single shoulder, although the EZID library itself is not constrained in that way. To fix this, I think it is a matter of 1) configuring multiple shoulders that Metacat should monitor for changes and register updates for through EZID, and 2) ensuring that one of these is the 'primary' shoulder used for minting operations. I propose the following changes to config parameters:

The secondary shoulders, numbered 2 and beyond, would not be used for automated minting through the DataONE generateIdentifier() API call. If that were desired, we could also extend the method to enable minting in multiple shoulders using the fragment identifier from generateIdentifier(), but that is not strictly needed for the PISCO use case, and so should be entered in a separate ticket if desired.
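To make the primary/secondary split concrete, here is a rough sketch of how numbered shoulders might be loaded and consulted. It is not the code in DOIService: the class name `ShoulderRegistry`, the property prefix `doi.shoulder.`, and the method names are all hypothetical, chosen only for illustration. It assumes shoulder 1 is the primary (minting) shoulder and that every configured shoulder is monitored for updates.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Properties;
import java.util.TreeMap;

/**
 * Illustrative sketch only; this is NOT Metacat's DOIService.
 * The property prefix "doi.shoulder." is a hypothetical name.
 */
class ShoulderRegistry {
    static final String PREFIX = "doi.shoulder."; // hypothetical key prefix

    private final String primary;
    private final List<String> shoulders = new ArrayList<>();

    ShoulderRegistry(Properties props) {
        // Collect "<PREFIX><n>" entries, ordered by their numeric suffix.
        // A non-numeric suffix makes Integer.parseInt throw NumberFormatException.
        Map<Integer, String> byIndex = new TreeMap<>();
        for (String key : props.stringPropertyNames()) {
            if (key.startsWith(PREFIX)) {
                int index = Integer.parseInt(key.substring(PREFIX.length()));
                byIndex.put(index, props.getProperty(key).trim());
            }
        }
        // Shoulder 1 is treated as the primary shoulder and must be present.
        primary = byIndex.get(1);
        if (primary == null) {
            throw new IllegalArgumentException("A primary shoulder (index 1) is required");
        }
        shoulders.addAll(byIndex.values());
    }

    /** Shoulder used when minting new DOIs: always the one numbered 1. */
    String primaryShoulder() {
        return primary;
    }

    /**
     * True if the identifier falls under any configured shoulder, primary or
     * secondary. This is a simple case-sensitive prefix match.
     */
    boolean isMonitored(String identifier) {
        if (identifier == null) {
            return false;
        }
        for (String shoulder : shoulders) {
            if (identifier.startsWith(shoulder)) {
                return true;
            }
        }
        return false;
    }
}
```

With placeholder values such as `doi.shoulder.1=doi:10.5072/FK2` and `doi.shoulder.2=doi:10.5072/FK3`, minting would use only the first shoulder, while an identifier created externally under the second would still be recognized as one whose DOI metadata should be kept up to date.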

The code where this is all handled is in edu.ucsb.nceas.metacat.dataone.DOIService.

mbjones commented 3 years ago

@taojing2002 I took a quick pass at this in branch feature_multiple_shoulders, although I was unable to test it. Does this look reasonable to you as an approach? If so, could you consider folding it into the next release once you've been able to test it? The current tests should run, but then we could probably use more tests that exercise the use of multiple shoulders.

taojing2002 commented 3 years ago

After getting multiple shoulders from Rushi, I tested the code and it worked. So the feature branch has been merged into the develop branch.