nextstrain / nextstrain.org

The Nextstrain website
https://nextstrain.org

CI: Transient `/charon/getDataset` errors when fetching from S3 #836

Open victorlin opened 2 months ago

victorlin commented 2 months ago

First noticed in https://github.com/nextstrain/nextstrain.org/pull/811#discussion_r1571704216. I didn't get the server log in time on that occasion, but I just saw the same failure again in another CI run. This is the test failure:

  ● smoke testing URLs described in auspice_client_requests.json › Check getDatset API for a (public) nextstrain group dataset

    expect(received).toEqual(expected) // deep equality

    Expected: 200
    Received: 500

      13 |       const res = await fetch(url, {redirect: 'manual'});
      14 |
    > 15 |       expect(res.status).toEqual(testCase.expectStatusCode || 200);
         |                          ^
      16 |
      17 |       if (testCase.responseIsJson) await testResponseIsJson(res);
      18 |

      at Object.<anonymous> (test/auspice_client_requests.test.js:15:26)

and relevant lines from the server log:

Getting (nextstrain) datasets for: prefix=/groups/blab/sars-like-cov
[verbose]   [fetch] GET https://nextstrain-groups.s3.us-east-1.amazonaws.com/blab/datasets/sars-like-cov.json?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Content-Sha256=UNSIGNED-PAYLOAD&X-Amz-Credential=AKIA4BL5UZTAYW2FBTNI%2F20240422%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20240422T190000Z&X-Amz-Expires=7200&X-Amz-Signature=96cc77b4e83761bb524cee5169e88b41f3b31fdf47176016a86e83944af959dd&X-Amz-SignedHeaders=host&x-id=GetObject (cache: no-cache)
[verbose]   [fetch] 200 OK https://nextstrain-data.s3.amazonaws.com/zika_root-sequence.json?versionId=xSaqFeCujRdPmjuYx_MEO8gETcvX9xfC (cache miss, timestamp 2024-04-22T19:28:16.182Z)
[verbose]   [fetch] GET https://nextstrain-data.s3.amazonaws.com/zika_root-sequence.json?versionId=xSaqFeCujRdPmjuYx_MEO8gETcvX9xfC (cache: no-cache)
[warning]   Failed to fetch v2 main JSON: FetchError: request to https://nextstrain-groups.s3.us-east-1.amazonaws.com/blab/datasets/sars-like-cov.json?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Content-Sha256=UNSIGNED-PAYLOAD&X-Amz-Credential=AKIA4BL5UZTAYW2FBTNI%2F20240422%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20240422T190000Z&X-Amz-Expires=7200&X-Amz-Signature=96cc77b4e83761bb524cee5169e88b41f3b31fdf47176016a86e83944af959dd&X-Amz-SignedHeaders=host&x-id=GetObject failed, reason: read ECONNRESET
[verbose]   Sending FetchError: request to https://nextstrain-groups.s3.us-east-1.amazonaws.com/blab/datasets/sars-like-cov.json?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Content-Sha256=UNSIGNED-PAYLOAD&X-Amz-Credential=AKIA4BL5UZTAYW2FBTNI%2F20240422%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20240422T190000Z&X-Amz-Expires=7200&X-Amz-Signature=96cc77b4e83761bb524cee5169e88b41f3b31fdf47176016a86e83944af959dd&X-Amz-SignedHeaders=host&x-id=GetObject failed, reason: read ECONNRESET error as JSON

This looks like a transient networking error (`read ECONNRESET` on the request to S3), possibly on the GitHub-hosted runner. I don't think there's anything we can do about it, but I'm opening this issue to keep track of any future occurrences.
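
If these failures become frequent, one possible mitigation would be to retry the upstream request when the connection is reset before any response arrives. The sketch below is only an illustration and not code from this repository: `fetchWithRetry`, `RETRYABLE_CODES`, and the `fetchImpl`, `retries`, and `delayMs` options are hypothetical names. It assumes the thrown error exposes the system error code as `err.code` (as node-fetch's `FetchError` does) or as `err.cause.code`.

```javascript
// Hypothetical helper, not part of nextstrain.org: retry a fetch when the
// connection fails at the network level (e.g. "read ECONNRESET").
// Assumption: the error carries the system code on `err.code` or `err.cause.code`.
const RETRYABLE_CODES = new Set(["ECONNRESET", "ETIMEDOUT", "EAI_AGAIN"]);

async function fetchWithRetry(
  url,
  options = {},
  {fetchImpl = fetch, retries = 2, delayMs = 250} = {},
) {
  for (let attempt = 0; ; attempt++) {
    try {
      // Any HTTP response, including 4xx/5xx, is returned to the caller as-is.
      return await fetchImpl(url, options);
    } catch (err) {
      const code = err.code ?? err.cause?.code;
      // Rethrow non-network errors, and give up once the retries are used.
      if (!RETRYABLE_CODES.has(code) || attempt >= retries) throw err;
      // Exponential backoff: delayMs, 2 * delayMs, 4 * delayMs, ...
      await new Promise((resolve) => setTimeout(resolve, delayMs * 2 ** attempt));
    }
  }
}
```

With the defaults, a request is attempted at most three times. Only connection-level failures are retried, so a genuine upstream error status would still surface immediately in the smoke tests.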