[Feature] Graceful shutdown

saez0pub commented 1 year ago

Description

I'm running graph-node in Kubernetes with a SQL proxy to connect to PostgreSQL.

When I replace the graph-node pod, Kubernetes sends a SIGTERM and waits 30s before forcing a shutdown. I make the SQL proxy wait 30s to let graph-node finish its queries. But graph-node does not stop. It just logs errors because the database goes away, and then waits for Kubernetes to force-kill the pod.

A nice feature would be to allow a graceful shutdown of graph-node. On SIGTERM it should stop indexing, finish its SQL queries, and disconnect from the database. Moreover, this could prevent some unexpected issues in the PostgreSQL database, such as bad block caching.
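To illustrate the requested behaviour, here is a minimal sketch using only the Rust standard library. It is not graph-node's actual code: `run_indexer` is a hypothetical stand-in for the indexing loop, the sleep stands in for SQL work, and the flag is set by a timer here where a real process would set it from a SIGTERM handler (which needs a signal-handling crate and is left out).

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread;
use std::time::Duration;

/// Hypothetical indexing loop. The flag is only checked between units of
/// work, so a unit that has started always runs to completion.
fn run_indexer(shutdown: Arc<AtomicBool>) -> u64 {
    let mut blocks_done = 0;
    while !shutdown.load(Ordering::SeqCst) {
        // Stand-in for one block's worth of SQL queries.
        thread::sleep(Duration::from_millis(5));
        blocks_done += 1;
    }
    // A real implementation would close its database connections here,
    // before returning, so the proxy sees a clean disconnect.
    blocks_done
}

fn main() {
    let shutdown = Arc::new(AtomicBool::new(false));
    let worker = {
        let flag = Arc::clone(&shutdown);
        thread::spawn(move || run_indexer(flag))
    };

    // Assumption: a SIGTERM handler would set the flag; a timer does it here.
    thread::sleep(Duration::from_millis(50));
    shutdown.store(true, Ordering::SeqCst);

    // Wait for the in-progress unit of work to finish before exiting.
    let blocks = worker.join().expect("indexer thread panicked");
    println!("stopped cleanly after {blocks} blocks");
}
```

The point of the design is that the process exits on its own shortly after the signal, instead of running until the orchestrator's grace period expires.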

An indexer node or a query node actually has the same behaviour.

Are you aware of any blockers that must be resolved before implementing this feature? If so, which? Link to any relevant GitHub issues.

No response

Some information to help us out

[ ] Tick this box if you plan on implementing this feature yourself.
[X] I have searched the issue tracker to make sure this issue is not a duplicate.

aasseman commented 1 year ago

:100: I can confirm this. Very annoying when used with container orchestration.

leoyvens commented 1 year ago

Graph Node should be able to crash at any point without causing issues when restarting. So one solution to avoid waiting and terminate instantly might be to set the Docker STOPSIGNAL to SIGKILL in our Docker image.

aasseman commented 1 year ago

The problem is that Docker Compose as well as k8s only send SIGTERM, then wait for 2 minutes, and then send SIGKILL. That makes any update/restart extremely tedious. I haven't encountered this problem in container orchestration before, apart from the indexer software.

github-actions[bot] commented 9 months ago

Looks like this issue has been open for 6 months with no activity. Is it still relevant? If not, please remember to close it.

saez0pub commented 9 months ago

The issue is still relevant.

paymog commented 7 months ago

@leoyvens when running graph node in query-only mode, will graph node allow existing in-flight queries to complete upon receiving a SIGTERM/SIGINT, or will it exit immediately and abort existing queries?

paymog commented 7 months ago

If this gets implemented, we should wait for all pending queries against all servers to terminate: websocket, admin, indexing and query.
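One possible shape for "wait for all pending queries", sketched with the standard library only. Everything here is hypothetical and not taken from graph-node: `InFlight` is a counter that every server (websocket, admin, indexing, query) would share, `Guard` marks one request as in flight until it is dropped, and `wait_idle` gives up after a timeout so shutdown still completes within the orchestrator's grace period.

```rust
use std::sync::{Arc, Condvar, Mutex};
use std::thread;
use std::time::Duration;

/// Hypothetical counter of in-flight requests, shared by all servers.
#[derive(Default)]
struct InFlight {
    count: Mutex<usize>,
    idle: Condvar,
}

/// RAII guard: dropping it marks one request as finished.
struct Guard(Arc<InFlight>);

impl InFlight {
    /// Registers a new in-flight request.
    fn begin(this: &Arc<Self>) -> Guard {
        *this.count.lock().unwrap() += 1;
        Guard(Arc::clone(this))
    }

    /// Blocks until no requests remain or `timeout` elapses.
    /// Returns true if everything drained in time.
    fn wait_idle(&self, timeout: Duration) -> bool {
        let count = self.count.lock().unwrap();
        let (_count, result) = self
            .idle
            .wait_timeout_while(count, timeout, |n| *n > 0)
            .unwrap();
        !result.timed_out()
    }
}

impl Drop for Guard {
    fn drop(&mut self) {
        let mut n = self.0.count.lock().unwrap();
        *n -= 1;
        if *n == 0 {
            // Wake anyone blocked in `wait_idle`.
            self.0.idle.notify_all();
        }
    }
}

fn main() {
    let tracker = Arc::new(InFlight::default());

    // Simulate one query that is still running when shutdown begins.
    let request = InFlight::begin(&tracker);
    let handle = thread::spawn(move || {
        thread::sleep(Duration::from_millis(20));
        drop(request);
    });

    let drained = tracker.wait_idle(Duration::from_secs(1));
    handle.join().unwrap();
    println!("drained in time: {drained}");
}
```

On shutdown, each server would first stop accepting new requests and then call `wait_idle` with a timeout shorter than the SIGTERM grace period.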
