DataBiosphere / azul

Metadata indexer and query service used for AnVIL, HCA, LungMAP, and CGP
Apache License 2.0
6 stars 2 forks source link

Internal Server Error from the /repository/files endpoint #5034

Open dsotirho-ucsc opened 1 year ago

dsotirho-ucsc commented 1 year ago

The 3/2/2023 Invicti scan encountered a HTTP 500 Internal Server Error requesting the URL https://service.dev.singlecell.gi.ucsc.edu/repository/files/swagger.json

Request:

GET /repository/files/swagger.json HTTP/1.1
Host: service.dev.singlecell.gi.ucsc.edu
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8
Accept-Encoding: gzip, deflate
Accept-Language: en-us,en;q=0.5
Cache-Control: no-cache
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.0 Safari/537.36

Response:

HTTP/1.1 500 Internal Server Error
Access-Control-Allow-Headers: Authorization,Content-Type,X-Amz-Date,X-Amz-Security-Token,X-Api-Key
X-Amz-Cf-Id: ps7847gScQ0qSXzgMSikr3dLcZ-YRBlPUzdyTdFcXUWzbCD58tbSvQ==
Connection: keep-alive
X-Amzn-Trace-Id: Root=1-64017693-0b6ffd000f81ef6630e464bf;Sampled=0
x-amz-apigw-id: BL93FFdcoAMFdoQ=
Content-Length: 2511
X-Cache: Error from cloudfront
Access-Control-Allow-Origin: *
Content-Type: text/plain
X-Amz-Cf-Pop: SFO5-P1
Date: Fri, 03 Mar 2023 04:24:52 GMT
Via: 1.1 7813f25b54ee9165a28165976654eb52.cloudfront.net (CloudFront)
x-amzn-RequestId: 3d
…
ext/plain
X-Amz-Cf-Pop: SFO5-P1
Date: Fri, 03 Mar 2023 04:24:52 GMT
Via: 1.1 7813f25b54ee9165a28165976654eb52.cloudfront.net (CloudFront)
x-amzn-RequestId: 3d980ca1-5afd-4a74-9da3-4dcebfea4998

Traceback (most recent call last):
File "/var/task/azul/service/source_service.py", line 70, in list_sources
sources = self._get(cache_key)
File "/var/task/azul/service/source_service.py", line 108, in _get
raise Expired(key)
azul.service.source_service.Expired: Entry for key 'dcp2:' is expired

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/var/task/chalice/app.py", line 1913, in _get_view_function_response
response = view_function(**function_args)
File "/var/task/app.py", line 1634, in repository_files
result = _repository_files(file_uuid, fetch=False)
File "/var/task/app.py", line 1713, in
…

achave11-ucsc commented 1 year ago

@hannes-ucsc: "Spike to gather relevant CloudWatch log entries."

dsotirho-ucsc commented 1 year ago

CloudWatch Logs Insights
region: us-east-1
log-group-names: /aws/lambda/azul-service-dev
query-string:

fields @timestamp, @message
| filter @requestId = '6ed47590-fa0b-48a6-b595-99837d82a172'
| sort @timestamp asc
| limit 200

@timestamp: 2023-03-03 04:24:51.435 @message:

START RequestId: 6ed47590-fa0b-48a6-b595-99837d82a172 Version: $LATEST


@timestamp: 2023-03-03 04:24:51.436 @message:

[INFO] 2023-03-03T04:24:51.436Z 6ed47590-fa0b-48a6-b595-99837d82a172 azul.chalice Received GET request for '/repository/files/swagger.json', with {"query": null, "headers": {"accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,/;q=0.8", "accept-encoding": "gzip, deflate", "accept-language": "en-us,en;q=0.5", "cache-control": "no-cache", "cloudfront-forwarded-proto": "https", "cloudfront-is-desktop-viewer": "true", "cloudfront-is-mobile-viewer": "false", "cloudfront-is-smarttv-viewer": "false", "cloudfront-is-tablet-viewer": "false", "cloudfront-viewer-asn": "5739", "cloudfront-viewer-country": "US", "host": "service.dev.singlecell.gi.ucsc.edu", "user-agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.0 Safari/537.36", "via": "1.1 7813f25b54ee9165a28165976654eb52.cloudfront.net (CloudFront)", "x-amz-cf-id": "ps7847gScQ0qSXzgMSikr3dLcZ-YRBlPUzdyTdFcXUWzbCD58tbSvQ==", "x-amzn-trace-id": "Root=1-64017693-0b6ffd000f81ef6630e464bf", "x-forwarded-for": "128.114.111.222, 15.158.0.23", "x-forwarded-port": "443", "x-forwarded-proto": "https"}}.


@timestamp: 2023-03-03 04:24:51.436 @message:

[INFO] 2023-03-03T04:24:51.436Z 6ed47590-fa0b-48a6-b595-99837d82a172 azul.chalice Did not authenticate request.


@timestamp: 2023-03-03 04:24:51.444 @message:

[DEBUG] 2023-03-03T04:24:51.444Z 6ed47590-fa0b-48a6-b595-99837d82a172 azul.terra _request('GET', https://jade.datarepo-dev.broadinstitute.org/api/repository/v1/snapshots?offset=0&limit=1000&sort=created_date&direction=asc&filter=hca_dev_, headers=None, timeout=5.0, body=None)


@timestamp: 2023-03-03 04:24:51.995 @message:

[DEBUG] 2023-03-03T04:24:51.994Z    6ed47590-fa0b-48a6-b595-99837d82a172    azul.terra  _request(…) -> 0.550s 502 b'<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">\n<html><head>\n<title>502 Proxy Error</title>\n</head><body>\n<h1>Proxy Error</h1>\n<p>The proxy server received an invalid\r\nresponse from an upstream server.<br />\r\nThe proxy server could not handle the req...'

@timestamp: 2023-03-03 04:24:51.996 @message:

[ERROR] 2023-03-03T04:24:51.995Z    6ed47590-fa0b-48a6-b595-99837d82a172    azul-service-dev    Caught exception for path /repository/files/{file_uuid}
Traceback (most recent call last):
  File "/var/task/azul/service/source_service.py", line 70, in list_sources
    sources = self._get(cache_key)
  File "/var/task/azul/service/source_service.py", line 108, in _get
    raise Expired(key)
azul.service.source_service.Expired: Entry for key 'dcp2:' is expired

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/var/task/chalice/app.py", line 1913, in _get_view_function_response
    response = view_function(**function_args)
  File "/var/task/app.py", line 1634, in repository_files
    result = _repository_files(file_uuid, fetch=False)
  File "/var/task/app.py", line 1713, in _repository_files
    return app.repository_controller.download_file(catalog=catalog,
  File "/var/task/azul/service/repository_controller.py", line 193, in download_file
    filters=self.get_filters(catalog, authentication, None))
  File "/var/task/azul/service/source_controller.py", line 69, in get_filters
    source_ids=self._list_source_ids(catalog, authentication))
  File "/var/task/azul/service/source_controller.py", line 60, in _list_source_ids
    sources = self.list_sources(catalog, authentication)
  File "/var/task/azul/service/source_controller.py", line 45, in list_sources
    sources = self._source_service.list_sources(catalog, authentication)
  File "/var/task/azul/service/source_service.py", line 72, in list_sources
    sources = list(plugin.list_sources(authentication))
  File "/var/task/azul/plugins/repository/tdr.py", line 107, in list_sources
    snapshots = tdr.snapshot_names_by_id(filter=filter)
  File "/var/task/azul/terra.py", line 608, in snapshot_names_by_id
    response = self._check_response(endpoint, response)
  File "/var/task/azul/terra.py", line 570, in _check_response
    raise TerraStatusException(endpoint, response)
azul.terra.TerraStatusException: ('Unexpected response from https://jade.datarepo-dev.broadinstitute.org/api/repository/v1/snapshots?offset=0&limit=1000&sort=created_date&direction=asc&filter=hca_dev_', 502, b'<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">\n<html><head>\n<title>502 Proxy Error</title>\n</head><body>\n<h1>Proxy Error</h1>\n<p>The proxy server received an invalid\r\nresponse from an upstream server.<br />\r\nThe proxy server could not handle the request<p>Reason: <strong>Error reading from remote server</strong></p></p>\n<hr>\n<address>Apache Server at jade.datarepo-dev.broadinstitute.org Port 80</address>\n</body></html>\n')

@timestamp: 2023-03-03 04:24:51.998 @message:

[DEBUG] 2023-03-03T04:24:51.998Z    6ed47590-fa0b-48a6-b595-99837d82a172    azul.chalice    Returning 500 response with headers {"Content-Type": "text/plain", "Access-Control-Allow-Origin": "*", "Access-Control-Allow-Headers": "Authorization,Content-Type,X-Amz-Date,X-Amz-Security-Token,X-Api-Key"}. See next line for the first 1024 characters of the body.
Traceback (most recent call last):
  File "/var/task/azul/service/source_service.py", line 70, in list_sources
    sources = self._get(cache_key)
  File "/var/task/azul/service/source_service.py", line 108, in _get
    raise Expired(key)
azul.service.source_service.Expired: Entry for key 'dcp2:' is expired

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/var/task/chalice/app.py", line 1913, in _get_view_function_response
    response = view_function(**function_args)
  File "/var/task/app.py", line 1634, in repository_files
    result = _repository_files(file_uuid, fetch=False)
  File "/var/task/app.py", line 1713, in _repository_files
    return app.repository_controller.download_file(catalog=catalog,
  File "/var/task/azul/service/repository_controller.py", line 193, in download_file
    filters=self.get_filters(catalog, authentication, None))
  File "/var/task/azul/service/source_controller.py", line 69, in get_filters
    source_ids=self._lis

@timestamp: 2023-03-03 04:24:51.999 @message:

END RequestId: 6ed47590-fa0b-48a6-b595-99837d82a172


@timestamp: 2023-03-03 04:24:51.999 @message:

REPORT RequestId: 6ed47590-fa0b-48a6-b595-99837d82a172 Duration: 563.92 ms Billed Duration: 564 ms Memory Size: 2048 MB Max Memory Used: 167 MB

achave11-ucsc commented 1 year ago

@hannes-ucsc: "The cause is the same as in #4684 but the resolution for the latter would only work around the cause during IT, not in the deployed application which is what happened here."