NASA-PDS / pds-api

PDS web APIs specifications and user's manual
http://nasa-pds.github.io/pds-api

API Client cannot connect to current deployed API #240

Closed jordanpadams closed 1 year ago

jordanpadams commented 1 year ago

🐛 Describe the bug

pds-deep-registry-archive no longer works on the current API due to a disconnect between the PDS API and our currently configured online API. This may actually be an issue with our cloud deployment of the API.

📜 To Reproduce

From trying to run pds-deep-registry-archive (installation instructions here)

pds-deep-registry-archive --url https://pds.nasa.gov/api/search/1.0/ --site PDS_RNG urn:nasa:pds:cassini_iss_cruise::1.0
INFO 👟 PDS Deep Registry-based Archive, version 1.1.2
ERROR 💥 We got an unexpected error; sorry it didn't work out
Traceback (most recent call last):
  File "/Users/webmaster/.virtualenvs/pds-deep-archive/lib/python3.9/site-packages/pds2/aipgen/registry.py", line 454, in main
    generatedeeparchive(args.url, args.bundle, args.site, not args.include_latest_collection_only)
  File "/Users/webmaster/.virtualenvs/pds-deep-archive/lib/python3.9/site-packages/pds2/aipgen/registry.py", line 429, in generatedeeparchive
    prefixlen, bac, title = _comprehendregistry(url, bundlelidvid, allcollections)
  File "/Users/webmaster/.virtualenvs/pds-deep-archive/lib/python3.9/site-packages/pds2/aipgen/registry.py", line 283, in _comprehendregistry
    bundle = _getbundle(apiclient, bundlelidvid)  # There's no class "Bundle" but class Product 🤷‍♀️
  File "/Users/webmaster/.virtualenvs/pds-deep-archive/lib/python3.9/site-packages/pds2/aipgen/registry.py", line 161, in _getbundle
    return bundles.bundle_by_lidvid(lidvid)  # type = ``Product_Bundle``
  File "/Users/webmaster/.virtualenvs/pds-deep-archive/lib/python3.9/site-packages/pds/api_client/api/bundles_api.py", line 433, in bundle_by_lidvid
    return self.bundle_by_lidvid_endpoint.call_with_http_info(**kwargs)
  File "/Users/webmaster/.virtualenvs/pds-deep-archive/lib/python3.9/site-packages/pds/api_client/api_client.py", line 849, in call_with_http_info
    return self.api_client.call_api(
  File "/Users/webmaster/.virtualenvs/pds-deep-archive/lib/python3.9/site-packages/pds/api_client/api_client.py", line 410, in call_api
    return self.__call_api(resource_path, method,
  File "/Users/webmaster/.virtualenvs/pds-deep-archive/lib/python3.9/site-packages/pds/api_client/api_client.py", line 204, in __call_api
    raise e
  File "/Users/webmaster/.virtualenvs/pds-deep-archive/lib/python3.9/site-packages/pds/api_client/api_client.py", line 197, in __call_api
    response_data = self.request(
  File "/Users/webmaster/.virtualenvs/pds-deep-archive/lib/python3.9/site-packages/pds/api_client/api_client.py", line 436, in request
    return self.rest_client.GET(url,
  File "/Users/webmaster/.virtualenvs/pds-deep-archive/lib/python3.9/site-packages/pds/api_client/rest.py", line 234, in GET
    return self.request("GET", url,
  File "/Users/webmaster/.virtualenvs/pds-deep-archive/lib/python3.9/site-packages/pds/api_client/rest.py", line 226, in request
    raise ServiceException(http_resp=r)
pds.api_client.exceptions.ServiceException: (503)
Reason: Service Unavailable
HTTP response headers: HTTPHeaderDict({'Content-Type': 'text/html; charset=iso-8859-1', 'Content-Length': '299', 'Connection': 'keep-alive', 'Date': 'Fri, 09 Dec 2022 20:33:29 GMT', 'Set-Cookie': 'AWSALB=RhBVq317xY2uynfHhgKmzL+edgR4aAqXu68jRie6NNMBJdiRArrIg6tK31uzUDFj5QMprXOlkVJsBi/uMbXACbmhS1QRO7oBFFRW6b2oERGjZOp2l5nFqlvfa9F5; Expires=Fri, 16 Dec 2022 20:33:29 GMT; Path=/, AWSALBCORS=RhBVq317xY2uynfHhgKmzL+edgR4aAqXu68jRie6NNMBJdiRArrIg6tK31uzUDFj5QMprXOlkVJsBi/uMbXACbmhS1QRO7oBFFRW6b2oERGjZOp2l5nFqlvfa9F5; Expires=Fri, 16 Dec 2022 20:33:29 GMT; Path=/; SameSite=None; Secure', 'Server': 'Apache', 'X-Frame-Options': 'SAMEORIGIN', 'X-Cache': 'Error from cloudfront', 'Via': '1.1 cb0b891eddf58d69d157d55977c68bce.cloudfront.net (CloudFront)', 'X-Amz-Cf-Pop': 'SFO53-P2', 'X-Amz-Cf-Id': '5WMqIkUOboUt7s3ciWWebAiPFeZVYz1G5ZVaGmkMOm3LDw08htS_ZQ==', 'X-XSS-Protection': '1; mode=block', 'Strict-Transport-Security': 'max-age=31536000; preload', 'Vary': 'Origin'})
HTTP response body: <!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>503 Service Unavailable</title>
</head><body>
<h1>Service Unavailable</h1>
<p>The server is temporarily unable to service your
request due to maintenance downtime or capacity
problems. Please try again later.</p>
</body></html>

INFO 👋 Thanks for using this program! Bye!

But even running the API Client using the Quickstart guide with https://pds.nasa.gov/api/search/1.0/ as the API endpoint fails:

>>> try:
...     api_response = collections.get_collection(start=0, limit=20)
...     pprint(api_response)
... except ApiException as e:
...     print("Exception when calling CollectionsApi->get_collection: %s\n" % e)
...
Exception when calling CollectionsApi->get_collection: (503)
Reason: Service Unavailable
HTTP response headers: HTTPHeaderDict({'Content-Type': 'text/html; charset=iso-8859-1', 'Content-Length': '299', 'Connection': 'keep-alive', 'Date': 'Sun, 11 Dec 2022 17:09:47 GMT', 'Set-Cookie': 'AWSALB=u3FppiDemWBx1pvR+E7RgHCPCsH2Eywhn8mxJuRFCe2rytqhNFy0l5x7IExdrSAaCn7tyJ6QnQNDMQmIHSO9h8sBHrEl0K0WjTi3o9pr8iJV5nByxqe8jT7XryZS; Expires=Sun, 18 Dec 2022 17:09:47 GMT; Path=/, AWSALBCORS=u3FppiDemWBx1pvR+E7RgHCPCsH2Eywhn8mxJuRFCe2rytqhNFy0l5x7IExdrSAaCn7tyJ6QnQNDMQmIHSO9h8sBHrEl0K0WjTi3o9pr8iJV5nByxqe8jT7XryZS; Expires=Sun, 18 Dec 2022 17:09:47 GMT; Path=/; SameSite=None; Secure', 'Server': 'Apache', 'X-Frame-Options': 'SAMEORIGIN', 'X-Cache': 'Error from cloudfront', 'Via': '1.1 6ae304c394ca48eaeac474c114a24c88.cloudfront.net (CloudFront)', 'X-Amz-Cf-Pop': 'LAX3-C3', 'X-Amz-Cf-Id': '0onFyuo0cCEFj6K2kRHS3OhRW24JBObI76Krl4729QQvHD-rDIOMmg==', 'X-XSS-Protection': '1; mode=block', 'Strict-Transport-Security': 'max-age=31536000; preload', 'Vary': 'Origin'})
HTTP response body: <!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>503 Service Unavailable</title>
</head><body>
<h1>Service Unavailable</h1>
<p>The server is temporarily unable to service your
request due to maintenance downtime or capacity
problems. Please try again later.</p>

🕵️ Expected behavior

📚 Version of Software Used

🩺 Test Data / Additional context

🏞 Screenshots

🖥 System Info


🦄 Related requirements

⚙️ Engineering Details

Task list:

jordanpadams commented 1 year ago

@jimmie @tloubrieu-jpl @nutjob4life see the task list at the bottom of the ticket for what I think needs to be done here. This should go towards the top of your list for Monday.

jimmie commented 1 year ago

API on pds.nasa.gov is responding w/ valid responses. I'm not familiar with what endpoint pds-gamma is currently supporting.

jordanpadams commented 1 year ago

@jimmie have you been able to test out the client code?

tloubrieu-jpl commented 1 year ago

1 - I have been able to reproduce the 503 error with pds.api-client when the root URL of the API ends with a /, as in the example https://pds.nasa.gov/api/search/1.0/.

We can reproduce that with the request: curl -H Accept:application/json 'https://pds.nasa.gov/api/search/1.0//bundles/urn:nasa:pds:cassini_iss_cruise::1.0'

The naked pds.api-client generated by the openapi-generator does not handle this extra '/' by default. I guess that should be done by the client library. I can check for the next version if there is an option to do that, otherwise we will need to wait for the wrapper to be developed.

In the meantime, deep-archive should also check that there is no extra '/'.
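
The trailing-slash failure in point 1 can be avoided on the caller's side by normalizing the base URL before handing it to the client. A minimal sketch, assuming a hypothetical helper named `normalize_base_url` (it is not part of pds.api-client or deep-archive):

```python
def normalize_base_url(url: str) -> str:
    """Strip trailing slashes so that appending '/bundles/...' yields a single '/'."""
    return url.rstrip("/")


# Hypothetical usage: the generated client appends resource paths such as
# '/bundles/{lidvid}' to the configured host, so the host must not end with '/'.
base = normalize_base_url("https://pds.nasa.gov/api/search/1.0/")
print(base + "/bundles/urn:nasa:pds:cassini_iss_cruise::1.0")
```

With the base URL normalized this way, the request path contains /bundles/ rather than //bundles/, which is the form that worked in the curl tests in this comment.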

2 - When there is no extra / and the lidvid exists, pds.api-client works. The equivalent request is:

curl -H Accept:application/json 'https://pds.nasa.gov/api/search/1.0/bundles/urn:nasa:pds:duxbury_pdart14_mariner69::2.0'

3 - But without the extra / and with a lidvid that does not exist, we get an internal server error (500) instead of a 404. The equivalent curl request is: curl -H Accept:application/json 'https://pds.nasa.gov/api/search/1.0/bundles/urn:nasa:pds:cassini_iss_cruise::1.0'

The server-side logs give: java.lang.ArrayIndexOutOfBoundsException: Index 0 out of bounds for length 0

So I assume this is a registry-api bug.

I checked with the version for which I am preparing a point build, and the bug is solved:

% curl -H Accept:application/json 'http://localhost:8080/bundles/urn:nasa:pds:cassini_iss_cruise::1.0' --verbose
*   Trying ::1:8080...
* TCP_NODELAY set
* Connected to localhost (::1) port 8080 (#0)
> GET /bundles/urn:nasa:pds:cassini_iss_cruise::1.0 HTTP/1.1
> Host: localhost:8080
> User-Agent: curl/7.68.0
> Accept:application/json
> 
* Mark bundle as not supporting multiuse
< HTTP/1.1 404 
< Content-Disposition: inline;filename=f.txt
< Content-Type: application/json
< Content-Length: 133
< Date: Tue, 13 Dec 2022 04:08:08 GMT
< 
* Connection #0 to host localhost left intact
{"request":"/bundles/urn:nasa:pds:cassini_iss_cruise::1.0","message":"The lidvid urn:nasa:pds:cassini_iss_cruise::1.0 was not found"}%

The pds.api-client in version 1.1.2 works well on the new server for this case as well.

jordanpadams commented 1 year ago

@tloubrieu-jpl thanks! This is great news. I think this is similar to the issue Pat L. was encountering with handling arrays vs. string values.

tloubrieu-jpl commented 1 year ago

After the breakout today, @jimmie will update the CloudFront script that rewrites the URL sent to the registry API so that it removes any extra / at the beginning of the path. For example, ///bundles/lid::vid becomes /bundles/lid::vid.

The script will be updated in the registry-api repository through a pull request, and the new script will then be deployed manually in CloudFront.
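
The rewrite described in this comment amounts to collapsing a run of leading slashes in the request path into a single one. The actual CloudFront script is not shown in this thread, so the following is only a Python sketch of that logic, with a hypothetical function name:

```python
import re


def collapse_leading_slashes(path: str) -> str:
    """Replace one or more leading '/' characters with a single '/'."""
    return re.sub(r"^/+", "/", path)


print(collapse_leading_slashes("///bundles/lid::vid"))  # /bundles/lid::vid
```

Only the start of the path is touched, so the "::" separator and any slashes later in the path are left as they are.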

tloubrieu-jpl commented 1 year ago

@jordanpadams I thought Pat L.'s issue was related to the structure of the documents inside OpenSearch, so I don't know whether the two are related. In any case, the bug on a not-found lidvid happens on the upgraded version of registry-api in production (1.1) but does not happen on my dev/integration deployment. I will create a separate ticket for it: https://github.com/NASA-PDS/registry-api/issues/207

tloubrieu-jpl commented 1 year ago

@jordanpadams this issue should be solved now that @jimmie has deployed a fix in the CloudFront script that forwards requests to the API server. See https://github.com/NASA-PDS/registry-api/issues/208

Is there a user we should communicate the fix to? We also need to tell them to use the option to set the API URL until we make a point build for deep-archive.

jordanpadams commented 1 year ago

Looks like the connection issue is fixed, but it created a few other bugs that are now blocking https://github.com/NASA-PDS/deep-archive/issues/134