daimo-eth / daimo

Real world Ethereum
https://daimo.com
GNU General Public License v3.0

Incident: API down #1160

Closed dcposch closed 3 months ago

dcposch commented 3 months ago

Summary

2h of API downtime, triggered by a null value in the Shovel DB.

Timeline

2024-06-15, Pacific time

Investigation

Watcher ticks failing

https://api.daimo.xyz/chain/8453/health

Shovel

Logs look OK. Latest block looks correct in Shovel DB:

image

^ matches Basescan

image

https://logs.betterstack.com/team/183561/tail?s=796523

Error in requestIndexer:

2024-06-15 13:37:23.150 [daimo_api] DaimoApiCluster-production api-task-production:21 stderr [SHOVEL] tick error TypeError: Cannot read properties of null (reading 'length')
2024-06-15 13:37:23.150 [daimo_api] DaimoApiCluster-production api-task-production:21 stderr     at bytesToHex (/usr/src/app/node_modules/viem/utils/encoding/toHex.ts:133:29)
2024-06-15 13:37:23.150 [daimo_api] DaimoApiCluster-production api-task-production:21 stderr     at rowToRequestCreatedLog (/usr/src/app/packages/daimo-api/src/contract/requestIndexer.ts:372:15)
2024-06-15 13:37:23.150 [daimo_api] DaimoApiCluster-production api-task-production:21 stderr     at Array.map (<anonymous>)
2024-06-15 13:37:23.150 [daimo_api] DaimoApiCluster-production api-task-production:21 stderr     at RequestIndexer.loadCreated (/usr/src/app/packages/daimo-api/src/contract/requestIndexer.ts:127:30)
2024-06-15 13:37:23.150 [daimo_api] DaimoApiCluster-production api-task-production:21 stderr     at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
2024-06-15 13:37:23.150 [daimo_api] DaimoApiCluster-production api-task-production:21 stderr     at RequestIndexer.load (/usr/src/app/packages/daimo-api/src/contract/requestIndexer.ts:70:23)
2024-06-15 13:37:23.150 [daimo_api] DaimoApiCluster-production api-task-production:21 stderr     at async Promise.all (index 1)
2024-06-15 13:37:23.150 [daimo_api] DaimoApiCluster-production api-task-production:21 stderr     at Watcher.index (/usr/src/app/packages/daimo-api/src/shovel/watcher.ts:163:7)
2024-06-15 13:37:23.150 [daimo_api] DaimoApiCluster-production api-task-production:21 stderr     at Timeout._onTimeout (/usr/src/app/packages/daimo-api/src/shovel/watcher.ts:121:29)
2024-06-15 13:37:23.540 [daimo_api] DaimoApiCluster-production api-task-production:21 stderr [SWAPCOIN] SKIPPING 15844324-15847246, already processed thru 15844324

nalinbhardwaj commented 3 months ago

Culprit Shovel row

image (2)

It's unclear why this is null. Current best guesses:

  1. The row was fetched from Alchemy; maybe Alchemy returned bad data?
  2. There were some panics around the fetching, which could be related.

dcposch commented 3 months ago

Fixes

As a follow-up, we could also add typing to our SQL via e.g. Prisma or TypeSQL. In this case, typed rows would have replaced the any in function rowToRequestCreatedLog(r: any), and the compiler would have caught that columns we treated as required, like metadata, were actually nullable.
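To illustrate the typing idea, here is a minimal sketch of what a typed row could look like. It is not the actual fix and not the real requestIndexer code: the row shape, every column name other than metadata, the local toHex helper (a stand-in for viem's bytesToHex), and the choice to skip null rows are all assumptions made for the example.

```typescript
// Hypothetical row shape. Only `metadata` is named in this issue;
// the other columns are illustrative.
interface RequestCreatedRow {
  block_num: number;
  id: string;
  metadata: Uint8Array | null; // nullable in the DB, so nullable in the type
}

interface RequestCreatedLog {
  blockNum: number;
  id: string;
  metadataHex: string;
}

// Local stand-in for a bytes-to-hex helper, so the sketch has no dependencies.
function toHex(bytes: Uint8Array): string {
  let hex = "0x";
  for (let i = 0; i < bytes.length; i++) {
    const b = bytes[i];
    hex += (b < 16 ? "0" : "") + b.toString(16);
  }
  return hex;
}

// With `r` typed instead of `any`, strict null checks reject
// toHex(r.metadata) until the null case is handled explicitly.
function rowToRequestCreatedLog(r: RequestCreatedRow): RequestCreatedLog | null {
  if (r.metadata == null) {
    // Assumption: skip and log the bad row rather than fail the whole tick.
    console.warn("skipping request " + r.id + ": null metadata at block " + r.block_num);
    return null;
  }
  return { blockNum: r.block_num, id: r.id, metadataHex: toHex(r.metadata) };
}

// Converts all rows, dropping the ones that could not be converted.
function loadCreated(rows: RequestCreatedRow[]): RequestCreatedLog[] {
  const logs: RequestCreatedLog[] = [];
  for (const r of rows) {
    const log = rowToRequestCreatedLog(r);
    if (log != null) logs.push(log);
  }
  return logs;
}
```

Whether a null row should be skipped, retried, or treated as a hard error is a design choice for the indexer. The point of the sketch is only that a nullable column type forces that decision at compile time instead of surfacing as a TypeError in production.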

nalinbhardwaj commented 3 months ago

Request transaction that caused this: https://basescan.org/tx/0xe08e8dc1f62566f8f0e0ce0289ba5fd41cfc555ec61afe73eccf54db6540be0c