filecoin-station / spark-evaluate

Evaluate service

Handle web3storage API errors #35

Closed bajtos closed 7 months ago

bajtos commented 10 months ago

Occasionally, we are not able to fetch measurements from web3.storage because of an internal server error:

Error: Response was not ok: 500 Internal Server Error - Check for { "ok": false } on the Response object before calling .files
    at Response.files (file:///app/node_modules/web3.storage/src/lib.js:540:15)
    at fetchMeasurements (file:///app/lib/preprocess.js:56:27)
    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
    at async preprocess (file:///app/lib/preprocess.js:14:24)

We should fix our code to check for res.ok.
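A minimal sketch of such a check, not the actual `fetchMeasurements` code. The helper name `filesOrThrow` and the `statusCode` property on the error are hypothetical; it assumes the response exposes `ok`, `status`, `statusText` and an async `files()` method, as the error message above suggests.

```javascript
// Hypothetical helper: fail early with the HTTP status attached,
// instead of letting `.files()` throw a generic error.
async function filesOrThrow (res) {
  if (!res.ok) {
    const err = new Error(
      `Cannot fetch measurements: ${res.status} ${res.statusText}`
    )
    // Assumption: callers read `statusCode` to decide what to do next.
    err.statusCode = res.status
    throw err
  }
  return await res.files()
}
```

With this in place, the caller can tell a 500 from a 524 by inspecting `err.statusCode` rather than parsing the error message.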

Even better, we can implement retry logic to recover from server errors. (Ideally, we should record telemetry about these retries.)
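One possible shape for the retry logic, as a rough sketch only. The name `withRetries`, its options, the `onRetry` telemetry hook and the convention that errors carry a numeric `statusCode` are all assumptions made for illustration. Only 5xx errors are retried, with a linearly growing delay.

```javascript
// Hypothetical retry wrapper. `fn` is any async function that throws
// an error with a numeric `statusCode` on HTTP failures (assumption).
async function withRetries (fn, { retries = 3, delayMs = 1000, onRetry = () => {} } = {}) {
  for (let attempt = 1; ; attempt++) {
    try {
      return await fn()
    } catch (err) {
      // Retry server errors only; errors without a status code are rethrown.
      const retryable = err.statusCode >= 500
      if (!retryable || attempt > retries) throw err
      // Telemetry hook: report which attempt failed and with what status.
      onRetry({ attempt, statusCode: err.statusCode })
      // Linear backoff: delayMs, 2 * delayMs, 3 * delayMs, ...
      await new Promise(resolve => setTimeout(resolve, delayMs * attempt))
    }
  }
}
```

The `onRetry` callback is where a telemetry point could be written, so that the number of retries per evaluation round becomes visible.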

It would also be great to record telemetry about the status codes returned by web3.storage API calls. For example, a recent log showed 524 Timeout, but many other logs show 500 Internal Server Error.
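As an illustration of what such telemetry could collect, here is a small in-memory counter keyed by status code. The name `createStatusCounter` and its methods are hypothetical; how the counts get exported to the real telemetry backend is left open.

```javascript
// Hypothetical in-memory counter of HTTP status codes.
function createStatusCounter () {
  const counts = new Map()
  return {
    // Call once per API response, e.g. record(res.status).
    record (status) {
      counts.set(status, (counts.get(status) ?? 0) + 1)
    },
    // Plain-object view, convenient for logging or for writing a telemetry point.
    snapshot () {
      return Object.fromEntries(counts)
    }
  }
}
```

Recording every response, not only the failures, would also give the ratio of 500 and 524 responses to successful ones.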

sentry-io[bot] commented 10 months ago

Sentry issue: SPARK-EVALUATE-1BW

bajtos commented 7 months ago

I believe this is no longer a problem after https://github.com/filecoin-station/spark-evaluate/pull/70 reworked CID fetching to use a Trustless HTTP Gateway instead of a w3up client SDK.