HumanCellAtlas / dcp2

Shared artifacts concerning the Human Cell Atlas (HCA) Data Coordination Platform (DCP)

TDR's enumerateSnapshots takes more than 30s #57

Open achave11-ucsc opened 2 years ago

achave11-ucsc commented 2 years ago

The managed-access implementation in Azul requires that we determine the list of snapshots accessible to the requesting user. Azul uses TDR's enumerateSnapshots endpoint for that and has to make this request frequently.

We noticed that enumerateSnapshots sometimes takes thirty seconds or more. We don't yet know how pervasive this problem is; in manual tests the request usually completed in under one second. If slow responses turn out to be common, they could severely impact the Data Browser user experience by causing timeouts or generally sluggish response times. Azul caches the listing (currently for one minute, which is adjustable), but the cache is empty when a user first visits the Data Browser, and first impressions are lasting.
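To illustrate why the cache does not help a first-time visitor, here is a minimal sketch of a time-limited cache. It is not Azul's actual code: `ExpiringCache`, `fetch`, `ttl` and `clock` are hypothetical names chosen for illustration, and `fetch` stands in for whatever function performs the enumerateSnapshots request.

```python
import time
from typing import Callable, Generic, Optional, TypeVar

T = TypeVar('T')


class ExpiringCache(Generic[T]):
    """
    Holds a single value and refetches it once it is older than `ttl` seconds.
    Hypothetical helper, not part of Azul.
    """

    def __init__(self,
                 fetch: Callable[[], T],
                 ttl: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch
        self._ttl = ttl
        self._clock = clock
        self._value: Optional[T] = None
        self._expires_at: Optional[float] = None

    def get(self) -> T:
        if self._expires_at is None or self._clock() >= self._expires_at:
            # Cold or expired cache: this caller waits for the full duration
            # of fetch(), however long the upstream service takes.
            self._value = self._fetch()
            self._expires_at = self._clock() + self._ttl
        return self._value
```

With a one-minute `ttl`, every caller that arrives after a minute of inactivity still waits for the upstream request, so a slow response is felt directly by that user. Raising `ttl` reduces how often that happens but does not remove the cold-cache case.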

Below is an example of a manifest request serviced by Azul. It timed out after about 31 s while waiting for TDR.

@timestamp @message
2022-02-25 23:46:19.612 START RequestId: 7ed3c46b-dc02-4b1d-9a4b-79d2fc54aac5 Version: $LATEST
2022-02-25 23:46:19.616 [INFO] 2022-02-25T23:46:19.616Z 7ed3c46b-dc02-4b1d-9a4b-79d2fc54aac5 Received GET request for '/manifest/files', with query {"catalog": "dcp13-it"} and headers {"accept-encoding": "identity", "cloudfront-forwarded-proto": "https", "cloudfront-is-desktop-viewer": "true", "cloudfront-is-mobile-viewer": "false", "cloudfront-is-smarttv-viewer": "false", "cloudfront-is-tablet-viewer": "false", "cloudfront-viewer-country": "US", "host": "service.azul2.data.humancellatlas.org", "user-agent": "python-urllib3/1.26.5", "via": "1.1 5c13f6a020624d4a85d1d1ae51108d7a.cloudfront.net (CloudFront)", "x-amz-cf-id": "37Jmtr0VXVj8XdYD4gANecYcshaGMtPRCpdv0XNsE1R7-KipJJzI1A==", "x-amzn-trace-id": "Root=1-62196a4b-59df06d966f57da128e87117", "x-forwarded-for": "76.89.218.122, 130.176.174.135", "x-forwarded-port": "443", "x-forwarded-proto": "https"}.
2022-02-25 23:46:19.616 [INFO] 2022-02-25T23:46:19.616Z 7ed3c46b-dc02-4b1d-9a4b-79d2fc54aac5 Did not authenticate request.
2022-02-25 23:46:19.645 [DEBUG] 2022-02-25T23:46:19.645Z 7ed3c46b-dc02-4b1d-9a4b-79d2fc54aac5 _request('GET', 'https://data.terra.bio/api/repository/v1/snapshots', fields={'offset': 0, 'limit': 200}, headers=None, body=None)
2022-02-25 23:46:50.646 END RequestId: 7ed3c46b-dc02-4b1d-9a4b-79d2fc54aac5
2022-02-25 23:46:50.646 REPORT RequestId: 7ed3c46b-dc02-4b1d-9a4b-79d2fc54aac5 Duration: 31031.56 ms Billed Duration: 31000 ms Memory Size: 2048 MB Max Memory Used: 162 MB
2022-02-25 23:46:50.646 2022-02-25T23:46:50.646Z 7ed3c46b-dc02-4b1d-9a4b-79d2fc54aac5 Task timed out after 31.03 seconds
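
One way to gauge how pervasive the slow responses are would be to issue the request repeatedly and record how many attempts exceed a threshold. The following is only a sketch under stated assumptions: `sample_latencies` and `fraction_slower_than` are hypothetical helpers, and `request` is a placeholder for a callable that performs the authenticated enumerateSnapshots call, which is not shown here.

```python
import time
from typing import Callable, List


def sample_latencies(request: Callable[[], object], attempts: int) -> List[float]:
    """
    Call `request` the given number of times and return the wall-clock
    duration of each call in seconds. Hypothetical helper.
    """
    durations = []
    for _ in range(attempts):
        start = time.monotonic()
        request()
        durations.append(time.monotonic() - start)
    return durations


def fraction_slower_than(durations: List[float], threshold: float) -> float:
    """
    Return the fraction of durations strictly greater than `threshold`.
    """
    if not durations:
        return 0.0
    return sum(1 for d in durations if d > threshold) / len(durations)
```

A threshold just under the caller's own timeout (about 31 s in the log above) would count the attempts that fail outright, while a lower threshold such as one second would count the attempts that are merely sluggish.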