NASA-PDS / doi-service

Service and tools for generating DOIs for PDS bundles, collections, and data sets
https://nasa-pds.github.io/doi-service

DOI Service not logging into Datacite to import data #371

Closed jpl-jengelke closed 2 years ago

jpl-jengelke commented 2 years ago

🐛 Describe the bug identified during I&T

Regression tests started failing at build 82 and have continued to fail with subsequent releases of doi-service. The errors point to a failed login to DataCite:

...
Field Map 20220108 Bundle
Traceback (most recent call last):
  File "/data/jenkins/workspace/nsole-regression-tests_develop_2/venv/lib/python3.9/site-packages/pds_doi_service/core/outputs/web_client.py", line 89, in _submit_content
    response.raise_for_status()
  File "/data/jenkins/workspace/nsole-regression-tests_develop_2/venv/lib/python3.9/site-packages/requests/models.py", line 1021, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 404 Client Error: Not Found for url: https://api.test.datacite.org/dois

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/data/jenkins/workspace/nsole-regression-tests_develop_2/venv/lib/python3.9/site-packages/pds_doi_service/core/actions/reserve.py", line 253, in run
    output_doi, o_doi_label = self._web_client.submit_content(
  File "/data/jenkins/workspace/nsole-regression-tests_develop_2/venv/lib/python3.9/site-packages/pds_doi_service/core/outputs/datacite/datacite_web_client.py", line 81, in submit_content
    response_text = super()._submit_content(
  File "/data/jenkins/workspace/nsole-regression-tests_develop_2/venv/lib/python3.9/site-packages/pds_doi_service/core/outputs/web_client.py", line 95, in _submit_content
    raise WebRequestException(
pds_doi_service.core.entities.exceptions.WebRequestException: DOI submission request to DataCite service failed, reason: 404 Client Error: Not Found for url: https://api.test.datacite.org/dois
Details: ('{"errors":[{"status":"404","title":"The resource you are looking for '
 'doesn\'t exist."}]}')

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/data/jenkins/workspace/nsole-regression-tests_develop_2/venv/bin/pds-doi-cmd", line 8, in <module>
    sys.exit(main())
  File "/data/jenkins/workspace/nsole-regression-tests_develop_2/venv/lib/python3.9/site-packages/pds_doi_service/core/cmd/pds_doi_cmd.py", line 42, in main
    output = action.run(**kwargs)
  File "/data/jenkins/workspace/nsole-regression-tests_develop_2/venv/lib/python3.9/site-packages/pds_doi_service/core/actions/reserve.py", line 288, in run
    raise CriticalDOIException(f"{error_message}.  {explanatory_message}")
pds_doi_service.core.entities.exceptions.CriticalDOIException: DOI submission request to DataCite service failed, reason: 404 Client Error: Not Found for url: https://api.test.datacite.org/dois
Details: ('{"errors":[{"status":"404","title":"The resource you are looking for '
 'doesn\'t exist."}]}').  This error may also be indicative of bad credentials when returned by DataCite.
...

🥼 Related Test Case(s)

doi-service #273

Related issues

N/A


➕ Additional Details

See regression tests for details at https://pds-jenkins.jpl.nasa.gov/view/int/job/pds-doi-console-regression-tests/job/develop/

📜 To Reproduce

Steps to reproduce the behavior:

  1. Go to PDS-Jenkins Server and see failed regression tests.
  2. Click "Build Now" and view console logs to see errors.

🕵️ Expected behavior

Login succeeds and data import succeeds, no DataCite errors.

📚 Version of Software Used

doi-service 2.2.1+ (PyPI)

🩺 Test Data / Additional context

The DEV password for DataCite is used in the DOI Service INI file.

🏞 Screenshots

NA, see link above.

🖥 System Info


🦄 Related requirements

⚙️ Engineering Details

alexdunnjpl commented 2 years ago

Last known-working version 2.2.1

Reproducible on fresh install of 2.2.1 with

pds-doi-cmd reserve --input ~/dev/doi-service/test_label.xml --node ENG

Comparing dependencies shows only one change: pandas 1.4.4 -> 1.5.0 (see #367).

Pinning pandas==1.4.4 and reinstalling doi-service 2.2.1 results in the same error.
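
For anyone repeating the dependency comparison, here is a minimal sketch of one way to do it, assuming `pip freeze` output has been saved from a passing and a failing environment. The helper names and the sample freeze text are illustrative; only the pandas and requests versions appear in this thread.

```python
def parse_freeze(text):
    """Parse `pip freeze`-style lines ("name==version") into a dict."""
    pins = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blanks, comments, and entries that are not simple pins.
        if not line or line.startswith("#") or "==" not in line:
            continue
        name, version = line.split("==", 1)
        pins[name.lower()] = version
    return pins


def diff_pins(old, new):
    """Return {name: (old_version, new_version)} for every pin that differs."""
    changes = {}
    for name in sorted(set(old) | set(new)):
        if old.get(name) != new.get(name):
            changes[name] = (old.get(name), new.get(name))
    return changes


# Illustrative freeze outputs, not the real environments.
passing = "pandas==1.4.4\nrequests==2.28.1\n"
failing = "pandas==1.5.0\nrequests==2.28.1\n"

print(diff_pins(parse_freeze(passing), parse_freeze(failing)))
```

A package present in only one environment shows up with `None` on the other side.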

alexdunnjpl commented 2 years ago

Now for the weird bits.

Credentials are confirmed correct via login to https://doi.test.datacite.org

The following request, generated with Postman and using the body copied from the failing reserve action in the previous comment, also succeeds.

curl --location --request POST 'https://api.test.datacite.org/dois' \
--header 'Content-Type: application/vnd.api+json' \
--header 'Authorization: Basic <REDACTED>' \
--data-raw '{
    "data":
        {
            "type": "dois",
            "attributes": {
                "event": "hide",
                "prefix": "10.13143",
                "identifiers": [
                    {
                        "identifier": "urn:nasa:pds:blah_test::1.0",
                        "identifierType": "Site ID"
                    }
                ],
                "creators": [
                    {
                        "nameType": "Personal",
                        "name": "J. R. Johnson",
                        "nameIdentifiers": [
                        ]
                    }
                ],
                "titles": [
                    {
                        "title": "blah test 1.0 Bundle",
                        "lang": "en"
                    }
                ],
                "publisher": "NASA Planetary Data System",
                "publicationYear": "2020",
                "subjects": [
                    { "subject": "PDS" },
                    { "subject": "PDS4" }
                ],
                "contributors": [
                    {
                        "nameType": "Organizational",
                        "name": "Planetary Data System: Engineering Node",
                        "contributorType": "DataCurator"
                    }
                ],
                "types": {
                    "resourceTypeGeneral": "Collection",
                    "resourceType": "PDS4 Collection "
                },
                "relatedIdentifiers": [
                ],
                "created": "2022-09-21T21:21:12.964852Z",
                "updated": "2022-09-21T21:21:12.964852Z",
                "state": "draft",
                "language": "en",
                "schemaVersion": "http://datacite.org/schema/kernel-4"
            }
        }
}'
alexdunnjpl commented 2 years ago

This is a requests_toolbelt dump of the failing doi-service request:

< POST /dois HTTP/1.1
< Host: api.test.datacite.org
< User-Agent: python-requests/2.28.1
< Accept-Encoding: gzip, deflate
< Accept: application/vnd.api+json
< Connection: keep-alive
< Content-Type: application/vnd.api+json
< Content-Length: 1829
< Authorization: Basic <REDACTED>
< 
< {
    "data":
        {
            "type": "dois",
            "attributes": {
                "event": "hide",
                "prefix": "10.13143",
                "identifiers": [
                    {
                        "identifier": "urn:nasa:pds:blah_test::1.0",
                        "identifierType": "Site ID"
                    }
                ],
                "creators": [
                    {
                        "nameType": "Personal",
                        "name": "J. R. Johnson",
                        "nameIdentifiers": [
                        ]
                    }
                ],
                "titles": [
                    {
                        "title": "blah test 1.0 Bundle",
                        "lang": "en"
                    }
                ],
                "publisher": "NASA Planetary Data System",
                "publicationYear": "2020",
                "subjects": [
                    { "subject": "PDS" },
                    { "subject": "PDS4" }
                ],
                "contributors": [
                    {
                        "nameType": "Organizational",
                        "name": "Planetary Data System: Engineering Node",
                        "contributorType": "DataCurator"
                    }
                ],
                "types": {
                    "resourceTypeGeneral": "Collection",
                    "resourceType": "PDS4 Collection "
                },
                "relatedIdentifiers": [
                ],
                "created": "2022-09-21T21:21:12.964852Z",
                "updated": "2022-09-21T21:21:12.964852Z",
                "state": "draft",
                "language": "en",
                "schemaVersion": "http://datacite.org/schema/kernel-4"
            }
        }
}

A diff of the two bodies confirms they're identical.
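
The body comparison can also be scripted. This is a rough sketch, assuming both bodies are available as JSON strings: it re-serializes each with sorted keys so that whitespace and key order do not show up as differences. The two sample bodies are abbreviated stand-ins for the ones quoted above, and the function names are made up for illustration.

```python
import difflib
import json


def canonical(body):
    """Re-serialize a JSON string with sorted keys and fixed indentation."""
    return json.dumps(json.loads(body), indent=2, sort_keys=True)


def body_diff(a, b):
    """Return unified-diff lines between two JSON bodies; empty if equivalent."""
    return list(
        difflib.unified_diff(
            canonical(a).splitlines(),
            canonical(b).splitlines(),
            fromfile="postman",
            tofile="doi-service",
            lineterm="",
        )
    )


# Abbreviated stand-ins for the two request bodies in this thread.
postman_body = '{"data": {"type": "dois", "attributes": {"event": "hide", "prefix": "10.13143"}}}'
service_body = '{"data": {"attributes": {"prefix": "10.13143", "event": "hide"}, "type": "dois"}}'

print(body_diff(postman_body, service_body) or "bodies are equivalent")
```

An empty result only says the payloads match; it says nothing about the headers.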

alexdunnjpl commented 2 years ago

Postman request succeeds even with identical headers.

alexdunnjpl commented 2 years ago

Correction: identical headers except Authorization.

2.2.1 is erroneously pulling the default config (and therefore credential placeholders) in preference to the user-specified config (see https://github.com/NASA-PDS/doi-service/pull/354).

With that fixed, call succeeds from doi-service.
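
Since the Authorization header turned out to be the only difference, a quick way to see which account a request is actually sending is to decode the Basic token. This is a sketch using only the standard library; the usernames and passwords below are hypothetical and are neither real credentials nor doi-service's actual placeholder values.

```python
import base64


def basic_auth_header(username, password):
    """Build a Basic Authorization header value (RFC 7617 encoding)."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return f"Basic {token}"


def basic_auth_username(header_value):
    """Return the username carried by a Basic Authorization header value."""
    scheme, _, token = header_value.partition(" ")
    if scheme.lower() != "basic":
        raise ValueError("not a Basic Authorization header")
    decoded = base64.b64decode(token).decode("utf-8")
    # The username ends at the first colon; the password may contain more colons.
    username, _, _password = decoded.partition(":")
    return username


# Hypothetical header values standing in for the two requests.
from_service = basic_auth_header("username", "secret")
from_postman = basic_auth_header("NODE.ACCOUNT", "real-password")

# Print only the usernames so passwords never reach a log.
print(basic_auth_username(from_service))
print(basic_auth_username(from_postman))
```

If the decoded username from the failing request is a template placeholder rather than the expected account, the user config is not being applied.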

alexdunnjpl commented 2 years ago

Best informed guess:

There was a bug that resulted in the user-defined configs (and ergo, the DataCite credentials) failing to load. I fixed that a month ago in https://github.com/NASA-PDS/doi-service/pull/354.

I'm not sure why it only appeared in early August (https://github.com/NASA-PDS/doi-service/issues/350), but there was never a reason for me to spend time figuring that out.

That PR won't be available until 2.3.1, if I'm understanding correctly. That also seems like it means that doi-service will be fundamentally broken until 2.3.1, at least for anyone supplying creds through the user-defined config file.
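
To illustrate the kind of precedence problem described in this comment, here is a small sketch with Python's `configparser`, where files read later override values from files read earlier. It is not doi-service's actual loading code; the section name, keys, and file names are assumptions chosen for the example.

```python
import configparser
import os
import tempfile


def load_config(paths):
    """Read INI files in order; values from later files override earlier ones."""
    parser = configparser.ConfigParser()
    parser.read(paths)  # missing files are silently skipped
    return parser


with tempfile.TemporaryDirectory() as tmp:
    default_path = os.path.join(tmp, "default.ini")
    user_path = os.path.join(tmp, "user.ini")

    # Hypothetical defaults containing a credential placeholder.
    with open(default_path, "w") as f:
        f.write("[DATACITE]\nuser = username\nurl = https://api.test.datacite.org/dois\n")
    # Hypothetical user config overriding only the account name.
    with open(user_path, "w") as f:
        f.write("[DATACITE]\nuser = real-account\n")

    # Intended order: defaults first, user config last.
    good = load_config([default_path, user_path])
    user_wins = good["DATACITE"]["user"]
    url_kept = good["DATACITE"]["url"]

    # Wrong order: the defaults are read last and clobber the user value.
    bad = load_config([user_path, default_path])
    placeholder_wins = bad["DATACITE"]["user"]

print(user_wins, url_kept, placeholder_wins)
```

In the second case the request would go out with the placeholder account, which matches the symptom in this thread: DataCite's 404, which the doi-service error message notes may indicate bad credentials.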

jpl-jengelke commented 2 years ago

Version 2.3.2 is on PyPI right now. Is that correct?