NASA-PDS / doi-service

Service and tools for generating DOIs for PDS bundles, collections, and data sets
https://nasa-pds.github.io/doi-service

pds-doi-init loads invalid state from DataCite #376

Closed alexdunnjpl closed 1 year ago

alexdunnjpl commented 1 year ago

🐛 Describe the bug

When attempting to bulk-update all findable DOIs, the following two rows are among those loaded into the internal transaction db:

10.17189/0dwh-5k25,urn:nasa:pds:jupiter-csheet-mod-khurana2022::2.0,findable,Khurana Jupiter Current Sheet Structure Model 2022,pds-operator@jpl.nasa.gov,Collection,PDS4 Refereed Data Bundle,ppi,1657549413,1657834313,/media/psf/Home/dev/doi-service/venv/transaction_history/ppi/10.17189/0dwh-5k25/2022-07-14T21:31:53+00:00,1
10.17189/p9sv-hk11,urn:nasa:pds:jupiter-csheet-mod-khurana2022::3.0,findable,Khurana Jupiter Current Sheet Structure Model 2022,pds-operator@jpl.nasa.gov,Collection,PDS4 Refereed Data Bundle,ppi,1661296277,1664563717,/media/psf/Home/dev/doi-service/venv/transaction_history/ppi/10.17189/p9sv-hk11/2022-09-30T18:48:37+00:00,1

The bulk update then fails with:

DuplicatedTitleDOIException : The title 'Khurana Jupiter Current Sheet Structure Model 2022' has already been used for records urn:nasa:pds:jupiter-csheet-mod-khurana2022::2.0, status: review, doi: 10.17189/0dwh-5k25. A different title should be used.
If you want to bypass this check, rerun the command with the --force flag provided.
INFO pds_doi_service.core.entities.exceptions:raise_or_warn_exceptions Since the force flag was used, the previous warning is ignored
Traceback (most recent call last):
  File "/media/psf/Home/dev/doi-service/src/pds_doi_service/core/outputs/web_client.py", line 89, in _submit_content
    response.raise_for_status()
  File "/home/parallels/dev/doi-service/venv/lib/python3.9/site-packages/requests/models.py", line 1021, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 401 Client Error: Unauthorized for url: https://api.datacite.org/dois/10.17189/1519059

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/media/psf/Home/dev/doi-service/src/pds_doi_service/core/actions/release.py", line 286, in run
    output_doi, o_doi_label = self._web_client.submit_content(
  File "/media/psf/Home/dev/doi-service/src/pds_doi_service/core/outputs/datacite/datacite_web_client.py", line 93, in submit_content
    response_text = super()._submit_content(
  File "/media/psf/Home/dev/doi-service/src/pds_doi_service/core/outputs/web_client.py", line 95, in _submit_content
    raise WebRequestException(
pds_doi_service.core.entities.exceptions.WebRequestException: DOI submission request to DataCite service failed, reason: 401 Client Error: Unauthorized for url: https://api.datacite.org/dois/10.17189/1519059
Details: '{"errors":[{"status":"401","title":"Bad credentials."}]}'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/parallels/.config/JetBrains/PyCharm2022.2/scratches/scratch_10.py", line 122, in <module>
    released_record_label = release_action.run(**release_action_kwargs)
  File "/media/psf/Home/dev/doi-service/src/pds_doi_service/core/actions/release.py", line 322, in run
    raise CriticalDOIException(str(err))
pds_doi_service.core.entities.exceptions.CriticalDOIException: DOI submission request to DataCite service failed, reason: 401 Client Error: Unauthorized for url: https://api.datacite.org/dois/10.17189/1519059
Details: '{"errors":[{"status":"401","title":"Bad credentials."}]}'

It's unclear whether this is pds-doi-service producing erroneous records upon initialization, or invalid state existing in our collection of DOIs in DataCite.
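One way to narrow this down is to check which rows in the transaction db share a title. The following is a minimal sketch, not part of pds-doi-service: it assumes the rows are plain CSV whose first four columns are DOI, identifier, status, and title (inferred from the two rows above), and `find_shared_titles` is a hypothetical helper name.

```python
import csv
from collections import defaultdict
from io import StringIO

# The two rows from the report, truncated to their first four columns.
# Assumed column order (inferred, not confirmed): DOI, identifier, status, title.
ROWS = """\
10.17189/0dwh-5k25,urn:nasa:pds:jupiter-csheet-mod-khurana2022::2.0,findable,Khurana Jupiter Current Sheet Structure Model 2022
10.17189/p9sv-hk11,urn:nasa:pds:jupiter-csheet-mod-khurana2022::3.0,findable,Khurana Jupiter Current Sheet Structure Model 2022
"""


def find_shared_titles(csv_text):
    """Return {title: [(doi, identifier), ...]} for titles used by more than one row."""
    by_title = defaultdict(list)
    for row in csv.reader(StringIO(csv_text)):
        if len(row) < 4:
            continue  # skip blank or malformed lines
        doi, identifier, _status, title = row[:4]
        by_title[title].append((doi, identifier))
    return {title: recs for title, recs in by_title.items() if len(recs) > 1}


if __name__ == "__main__":
    for title, records in find_shared_titles(ROWS).items():
        print(title)
        for doi, identifier in records:
            print("   ", doi, identifier)
```

Running the same check against an export of the full table would show whether the shared title is limited to these two versions of the same bundle or is more widespread.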

📜 To Reproduce

Steps to reproduce the behavior:

  1. Truncate internal db
  2. Set config to use user NASAPDS.NASAPDS and doi_prefix 10.17189
  3. Run pds-doi-init
  4. Attempt a bulk update of all DOIs (with no actual mutation of the DOIs pulled from DataCite, since y'know, it's production data)

🕵️ Expected behavior

Unclear. @jordanpadams wanna weigh in?

alexdunnjpl commented 1 year ago

Never mind - the conflicting titles are not the cause of the exception (I thought DataCite might be returning a bad HTTP code again).
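
The `Details` payload in the traceback reports the failure as a credentials problem, and it can be read programmatically. This is a minimal sketch that assumes the payload always has the shape seen in that one example (an `errors` list whose entries carry `status` and `title`); `summarize_errors` and `is_auth_failure` are hypothetical helpers, not part of pds-doi-service.

```python
import json

# Error body copied from the "Details:" line of the traceback in this issue.
DETAILS = '{"errors":[{"status":"401","title":"Bad credentials."}]}'


def summarize_errors(details):
    """Return a list of (status_code, title) tuples from an error body."""
    payload = json.loads(details)
    return [
        (int(err["status"]), err.get("title", ""))
        for err in payload.get("errors", [])
    ]


def is_auth_failure(details):
    """True if any reported error is an HTTP 401 or 403."""
    return any(status in (401, 403) for status, _title in summarize_errors(details))


if __name__ == "__main__":
    print(summarize_errors(DETAILS))
    print("auth failure:", is_auth_failure(DETAILS))
```

A check like this separates an authentication failure from a validation failure before anyone goes looking for bad records.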