graphprotocol / mission-control-indexer

Technical indexer documentation and infrastructure templates for the Mission Control testnet

Agent node and service graceful crash. #235

Open yasiryagi opened 3 years ago

yasiryagi commented 3 years ago

Can we make a failure to connect to Ethereum more graceful, with clearer messaging?

When my nodes failed to connect to the Ethereum contract because I hit my provider's rate limit, both nodes crashed (example below). It would be nicer if the service emitted an error message alerting the admin instead of crashing and exiting.

indexer-service_1 | 2020-12-03T11:11:35.935151698Z {"level":30,"time":1606993895934,"pid":1,"hostname":"e74b936d8092","name":"IndexerService","version":"0.4.3-alpha.1","msg":"Starting up..."}
indexer-service_1 | 2020-12-03T11:11:36.002573048Z {"level":30,"time":1606993896002,"pid":1,"hostname":"e74b936d8092","name":"IndexerService","indexer":"0x6c0Fadd48e7e236Bb10f7D69148Be5502a18ca57","operator":"0x6c0Fadd48e7e236Bb10f7D69148Be5502a18ca57","host":"192.168.128.252","port":5432,"database":"indexerservice","msg":"Connect to database"}
indexer-service_1 | 2020-12-03T11:11:36.056756258Z {"level":20,"time":1606993896056,"pid":1,"hostname":"e74b936d8092","name":"IndexerService","component":"MetricsServer","component":"MetricsServer","port":7300,"msg":"Listening on port"}
indexer-service_1 | 2020-12-03T11:11:36.114142957Z {"level":30,"time":1606993896113,"pid":1,"hostname":"e74b936d8092","name":"IndexerService","indexer":"0x6c0Fadd48e7e236Bb10f7D69148Be5502a18ca57","operator":"0x6c0Fadd48e7e236Bb10f7D69148Be5502a18ca57","msg":"Successfully connected to database"}
indexer-service_1 | 2020-12-03T11:11:36.114185853Z {"level":30,"time":1606993896113,"pid":1,"hostname":"e74b936d8092","name":"IndexerService","indexer":"0x6c0Fadd48e7e236Bb10f7D69148Be5502a18ca57","operator":"0x6c0Fadd48e7e236Bb10f7D69148Be5502a18ca57","msg":"Connect to network"}
indexer-service_1 | 2020-12-03T11:11:36.116422141Z {"level":30,"time":1606993896116,"pid":1,"hostname":"e74b936d8092","name":"IndexerService","indexer":"0x6c0Fadd48e7e236Bb10f7D69148Be5502a18ca57","operator":"0x6c0Fadd48e7e236Bb10f7D69148Be5502a18ca57","msg":"Successfully connected to network"}
indexer-service_1 | 2020-12-03T11:11:36.116469848Z {"level":30,"time":1606993896116,"pid":1,"hostname":"e74b936d8092","name":"IndexerService","indexer":"0x6c0Fadd48e7e236Bb10f7D69148Be5502a18ca57","operator":"0x6c0Fadd48e7e236Bb10f7D69148Be5502a18ca57","provider":"https://nd-352-460-571.p2pify.com","msg":"Connecting to Ethereum"}
indexer-service_1 | 2020-12-03T11:11:40.214186240Z indexer-service start
indexer-service_1 | 2020-12-03T11:11:40.214242055Z
indexer-service_1 | 2020-12-03T11:11:40.214255008Z Start the service
indexer-service_1 | 2020-12-03T11:11:40.214267030Z
indexer-service_1 | 2020-12-03T11:11:40.214279161Z Ethereum
indexer-service_1 | 2020-12-03T11:11:40.214291587Z --ethereum Ethereum node or provider URL [string] [required]
indexer-service_1 | 2020-12-03T11:11:40.214303999Z --mnemonic Mnemonic for the operator wallet [string] [required]
indexer-service_1 | 2020-12-03T11:11:40.214316153Z --indexer-address Ethereum address of the indexer [string] [required]
indexer-service_1 | 2020-12-03T11:11:40.214327203Z
indexer-service_1 | 2020-12-03T11:11:40.214337604Z Indexer Infrastructure
indexer-service_1 | 2020-12-03T11:11:40.214348918Z --port Port to serve queries at[number] [default: 7600]
indexer-service_1 | 2020-12-03T11:11:40.214360275Z --metrics-port Port to serve Prometheus metrics at
indexer-service_1 | 2020-12-03T11:11:40.214371325Z [number] [default: 7300]
indexer-service_1 | 2020-12-03T11:11:40.214382663Z --graph-node-query-endpoint Graph Node endpoint to forward queries to
indexer-service_1 | 2020-12-03T11:11:40.214435540Z [string] [required]
indexer-service_1 | 2020-12-03T11:11:40.214448474Z --graph-node-status-endpoint Graph Node endpoint for indexing statuses etc.
indexer-service_1 | 2020-12-03T11:11:40.214481070Z [string] [required]
indexer-service_1 | 2020-12-03T11:11:40.214493060Z
indexer-service_1 | 2020-12-03T11:11:40.214503685Z Postgres
indexer-service_1 | 2020-12-03T11:11:40.214514547Z --postgres-host Postgres host [string] [required]
indexer-service_1 | 2020-12-03T11:11:40.214526031Z --postgres-port Postgres port [number] [default: 5432]
indexer-service_1 | 2020-12-03T11:11:40.214537617Z --postgres-username Postgres username [string] [default: "postgres"]
indexer-service_1 | 2020-12-03T11:11:40.214549387Z --postgres-password Postgres password [string] [default: ""]
indexer-service_1 | 2020-12-03T11:11:40.214561251Z --postgres-database Postgres database name [string] [required]
indexer-service_1 | 2020-12-03T11:11:40.214572706Z
indexer-service_1 | 2020-12-03T11:11:40.214583102Z Network Subgraph
indexer-service_1 | 2020-12-03T11:11:40.214594197Z --network-subgraph-endpoint Endpoint to query the network subgraph from
indexer-service_1 | 2020-12-03T11:11:40.214605827Z [string] [required]
indexer-service_1 | 2020-12-03T11:11:40.214617410Z
indexer-service_1 | 2020-12-03T11:11:40.214627814Z State Channels
indexer-service_1 | 2020-12-03T11:11:40.214638495Z --wallet-worker-threads Number of worker threads for the server wallet
indexer-service_1 | 2020-12-03T11:11:40.214650241Z [number] [default: 8]
indexer-service_1 | 2020-12-03T11:11:40.214661925Z --wallet-skip-evm-validation Whether to skip EVM-based validation of state
indexer-service_1 | 2020-12-03T11:11:40.214673478Z channel transitions [boolean] [default: true]
indexer-service_1 | 2020-12-03T11:11:40.214684558Z
indexer-service_1 | 2020-12-03T11:11:40.214695395Z Options:
indexer-service_1 | 2020-12-03T11:11:40.214706113Z --version Show version number [boolean]
indexer-service_1 | 2020-12-03T11:11:40.214717835Z --help Show help [boolean]
indexer-service_1 | 2020-12-03T11:11:40.214729327Z --free-query-auth-token Auth token that clients can use to query for free
indexer-service_1 | 2020-12-03T11:11:40.214742531Z [array]
indexer-service_1 | 2020-12-03T11:11:40.214754082Z
indexer-service_1 | 2020-12-03T11:11:40.218981911Z Error: could not detect network (event="noNetwork", code=NETWORK_ERROR, version=providers/5.0.17)
indexer-service_1 | 2020-12-03T11:11:40.219031295Z at Logger.makeError (/opt/indexer/node_modules/@ethersproject/logger/lib/index.js:179:21)
indexer-service_1 | 2020-12-03T11:11:40.219041517Z at Logger.throwError (/opt/indexer/node_modules/@ethersproject/logger/lib/index.js:188:20)
indexer-service_1 | 2020-12-03T11:11:40.219049822Z at JsonRpcProvider. (/opt/indexer/node_modules/@ethersproject/providers/lib/json-rpc-provider.js:407:54)
indexer-service_1 | 2020-12-03T11:11:40.219058752Z at step (/opt/indexer/node_modules/@ethersproject/providers/lib/json-rpc-provider.js:46:23)
indexer-service_1 | 2020-12-03T11:11:40.219066623Z at Object.throw (/opt/indexer/node_modules/@ethersproject/providers/lib/json-rpc-provider.js:27:53)
indexer-service_1 | 2020-12-03T11:11:40.219091237Z at rejected (/opt/indexer/node_modules/@ethersproject/providers/lib/json-rpc-provider.js:19:65)
indexer-service_1 | 2020-12-03T11:11:40.219098401Z at processTicksAndRejections (internal/process/task_queues.js:97:5) {
indexer-service_1 | 2020-12-03T11:11:40.219104986Z reason: 'could not detect network',
indexer-service_1 | 2020-12-03T11:11:40.219111058Z code: 'NETWORK_ERROR',
indexer-service_1 | 2020-12-03T11:11:40.219117191Z event: 'noNetwork'
indexer-service_1 | 2020-12-03T11:11:40.219123226Z }

fattox commented 3 years ago

Yes, I also see this on my services extremely frequently. I have a handful sitting behind a load balancer, which I don't think causes any issue but perhaps helps when one or more of them start doing this. It's not an issue with my endpoint (Infura) or its limits, and restarting the service resolves it:

Dec 04 14:28:31 graph-test01-s run_graph_indexer_service_01.sh[191382]: (node:191382) UnhandledPromiseRejectionWarning: Unhandled promise rejection. This error originated either by throwing inside of an async function without a catch block, or by rejecting a promise which was not handled with .catch(). To terminate the node process on unhandled promise rejection, use the CLI flag '--unhandled-rejections=strict' (see https://nodejs.org/api/cli.html#cli_unhandled_rejections_mode). (rejection id: 382)

It seems that at least one service does this roughly every hour. It's extremely common, and having the service detect this condition and restart itself would probably be better than letting it get stuck.
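
One way the "detect and restart" idea could look, as a rough sketch only: register a handler for Node's `unhandledRejection` event that logs the reason and exits with a non-zero code, so that whatever supervises the process (for example a Docker restart policy or systemd) can bring it back up. `makeRejectionHandler` and `installRejectionHandler` are hypothetical names for this example and do not come from the indexer code; the `--unhandled-rejections=strict` flag mentioned in the warning above is a related alternative.

```typescript
// Hypothetical sketch, not taken from indexer-service: turn an unhandled
// promise rejection into a clear log line plus a non-zero exit, so an
// external supervisor can restart the process instead of it hanging.
type Logger = (message: string) => void;
type Exit = (code: number) => void;

function makeRejectionHandler(log: Logger, exit: Exit) {
  return (reason: unknown): void => {
    const detail =
      reason instanceof Error ? reason.stack ?? reason.message : String(reason);
    log(`Fatal: unhandled promise rejection, exiting for restart: ${detail}`);
    exit(1);
  };
}

// Call once at startup. Restarting is left to the supervisor (assumption).
function installRejectionHandler(): void {
  process.on(
    "unhandledRejection",
    makeRejectionHandler(console.error, (code) => process.exit(code)),
  );
}
```

Keeping the logger and the exit function injectable makes the handler easy to exercise in a test without actually terminating the process.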