AntelopeIO / spring

C++ implementation of the Antelope protocol with Savanna consensus

snapshot loading cannot be stopped by CTRL-C #798

Open linh2931 opened 1 month ago

linh2931 commented 1 month ago

I tried to stop a snapshot load with CTRL-C, but it had no effect. Loading continued until it finished, and only then did nodeos exit because of the earlier CTRL-C:

info  2024-09-18T13:50:26.080 nodeos    controller.cpp:2435           operator()           ] Snapshot initialization 78% complete
^Cinfo  2024-09-18T13:50:26.083 nodeos    main.cpp:172                  operator()           ] appbase quit called
^Cinfo  2024-09-18T13:50:27.958 nodeos    main.cpp:172                  operator()           ] appbase quit called
info  2024-09-18T13:50:31.080 nodeos    controller.cpp:2435           operator()           ] Snapshot initialization 79% complete
info  2024-09-18T13:50:36.080 nodeos    controller.cpp:2435           operator()           ] Snapshot initialization 81% complete
...
info  2024-09-18T13:52:51.087 nodeos    controller.cpp:2435           operator()           ] Snapshot initialization 99% complete
info  2024-09-18T13:52:54.992 nodeos    controller.cpp:1861           startup              ] Snapshot loaded, lib: 394861245
info  2024-09-18T13:52:54.992 nodeos    controller.cpp:1589           should_replay_block_ ] no block log found
info  2024-09-18T13:52:54.992 nodeos    controller.cpp:1693           replay               ] quitting from replay because of shutdown
info  2024-09-18T13:52:54.992 nodeos    controller.cpp:1873           startup              ] Finished initialization from snapshot (snapshot load time was 403s)
info  2024-09-18T13:52:54.995 nodeos    chain_plugin.cpp:1138         plugin_startup       ] starting chain in read/write mode
heifner commented 1 month ago

What state would you expect the node to be in if interrupted by ctrl-c while loading the snapshot?

linh2931 commented 1 month ago

Just delete everything and put up a big warning before quitting. Otherwise the user has to wait for a long time if loading a big snapshot.

arhag commented 1 month ago

We cannot shut down in the middle of snapshot loading and leave nodeos in a good state. And we do not want Ctrl-C to leave nodeos in a bad state; there is already kill -9 if you want to kill the process immediately, regardless of what state it leaves its state files in.

So the most we can do for this issue is to change the log message reported in reaction to Ctrl-C.

Right now it prints the message "appbase quit called".

One simple thing we can do is to just change this message to the more helpful "Shutdown pending".

Another thing we can do in addition to that is to detect the context we are in (in the middle of snapshot loading, for example) and, based on that, clarify to the user that nodeos will shut down after, for example, snapshot initialization completes, which may take a long time.
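
To illustrate the idea, here is a minimal standalone sketch of a context-aware shutdown message. It is not taken from the nodeos or appbase source: the names `startup_phase`, `current_phase`, `quit_requested`, `shutdown_message`, and `request_quit` are all hypothetical, and the exact wording of the messages is only a suggestion.

```cpp
#include <atomic>
#include <string>

// Hypothetical phases for illustration; not the real nodeos state machine.
enum class startup_phase { snapshot_load, other };

// Set by the quit request, checked later at a point where exiting is safe.
std::atomic<bool> quit_requested{false};

// Updated by the startup code as it enters and leaves snapshot loading.
std::atomic<startup_phase> current_phase{startup_phase::other};

// Pick the log message based on what the node is doing right now.
std::string shutdown_message(startup_phase phase) {
   if (phase == startup_phase::snapshot_load) {
      return "Shutdown pending: nodeos will exit after snapshot "
             "initialization completes, which may take a long time";
   }
   return "Shutdown pending";
}

// Record the quit request and return the message to log.
// Assumed to run on a normal thread, not inside a raw signal handler,
// because building a std::string is not async-signal-safe.
std::string request_quit() {
   quit_requested.store(true);
   return shutdown_message(current_phase.load());
}
```

The quit request only sets a flag here, so snapshot loading still runs to completion, matching the constraint that nodeos must not be left in a bad state; only the message shown to the user changes.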

wdsse commented 1 month ago

Hi!

How do I start a node from a snapshot when, right after startup, the node begins synchronizing and cannot be terminated cleanly? The database crashes.

I start it like this: nodeos --data-dir ./blockchain --config-dir ./blockchain --snapshot $SNAP_DIR/snapshot.bin

heifner commented 1 month ago

When syncing, you have to be very patient when stopping a node with ctrl-c or SIGTERM/SIGINT.

See https://github.com/AntelopeIO/spring/issues/284 and https://github.com/AntelopeIO/spring/issues/527

When syncing, get_info can take seconds or minutes to respond, and so can ctrl-c. For example, I just sent ctrl-c to a node that was syncing to EOS Mainnet, and it took 9 minutes to shut down.