spesmilo / electrumx

Alternative implementation of spesmilo/electrum-server
MIT License
432 stars 343 forks source link

Crash with "RuntimeError: task group terminated" #170

Closed freerko closed 2 years ago

freerko commented 2 years ago

Hi, I run ElectrumX on Ubuntu 20.04.4 with Python 3.8.10 and aiorpcX 0.22.1 with current electrumx master with IPv4, IPv6 and Tor connectivity. On one crash, I see SessionManager closing down servers, somehow connections still continue to go up and down in the next second and then get a traceback with an "RuntimeError: task group terminated".

Mar 07 17:27:39 hostname electrumx_server[454057]: INFO:ElectrumX:[70483148] SSL xx.xx.xx.xx:xx, 440 total
Mar 07 17:27:39 hostname electrumx_server[454057]: INFO:ElectrumX:[70483149] SSL xx.xx.xx.xx:xx, 441 total
Mar 07 17:27:39 hostname electrumx_server[454057]: INFO:SessionManager:closing down server for tcp://all_interfaces:50001
Mar 07 17:27:39 hostname electrumx_server[454057]: INFO:SessionManager:closing down server for wss://all_interfaces:50004
Mar 07 17:27:39 hostname electrumx_server[454057]: INFO:SessionManager:closing down server for ssl://all_interfaces:50002
Mar 07 17:27:39 hostname electrumx_server[454057]: INFO:SessionManager:closing down server for rpc://localhost:8000
Mar 07 17:27:39 hostname electrumx_server[454057]: INFO:ElectrumX:[70483150] SSL xx.xx.xx.xx:xx, 416 total
Mar 07 17:27:39 hostname electrumx_server[454057]: INFO:ElectrumX:[70483151] SSL xx.xx.xx.xx:xx, 417 total
Mar 07 17:27:39 hostname electrumx_server[454057]: INFO:ElectrumX:[70483152] SSL xx.xx.xx.xx:xx, 418 total
Mar 07 17:27:39 hostname electrumx_server[454057]: INFO:ElectrumX:[70483153] SSL xx.xx.xx.xx:xx, 419 total
Mar 07 17:27:39 hostname electrumx_server[454057]: INFO:ElectrumX:[70483154] SSL xx.xx.xx.xx:xx, 420 total
...
Mar 07 17:27:40 hostname electrumx_server[454057]: INFO:ElectrumX:[70483249] SSL xx.xx.xx.xx:xx, 388 total
Mar 07 17:27:40 hostname electrumx_server[454057]: INFO:ElectrumX:[70483250] SSL xx.xx.xx.xx:xx, 389 total
Mar 07 17:27:40 hostname electrumx_server[454057]: INFO:ElectrumX:[70483251] SSL xx.xx.xx.xx:xx, 390 total
Mar 07 17:27:40 hostname electrumx_server[454057]: INFO:ElectrumX:[70483252] SSL xx.xx.xx.xx:xx, 391 total
Mar 07 17:27:40 hostname electrumx_server[454057]: INFO:ElectrumX:[70483253] SSL xx.xx.xx.xx:xx, 392 total
Mar 07 17:27:40 hostname electrumx_server[454057]: INFO:ElectrumX:[70483254] SSL xx.xx.xx.xx:xx, 393 total
Mar 07 17:27:40 hostname electrumx_server[454057]: INFO:ElectrumX:[70483255] SSL xx.xx.xx.xx:xx, 394 total
Mar 07 17:27:40 hostname electrumx_server[454057]: INFO:ElectrumX:[70483256] SSL xx.xx.xx.xx:xx, 378 total
Mar 07 17:27:40 hostname electrumx_server[454057]: INFO:Prefetcher:cancelled; prefetcher stopping
Mar 07 17:27:40 hostname electrumx_server[454057]: INFO:BlockProcessor:flushing to DB for a clean shutdown...
Mar 07 17:27:40 hostname electrumx_server[454057]: INFO:Controller:shutting down
Mar 07 17:27:40 hostname electrumx_server[454057]: INFO:Controller:shutdown complete
Mar 07 17:36:32 hostname electrumx_server[454057]: ERROR:electrumx:ElectrumX server terminated abnormally
Mar 07 17:36:32 hostname electrumx_server[454057]: Traceback (most recent call last):
Mar 07 17:36:32 hostname electrumx_server[454057]:   File "/usr/local/lib/python3.8/dist-packages/e_x-1.16.0-py3.8.egg/EGG-INFO/scripts/electrumx_server", line 35, in main
Mar 07 17:36:32 hostname electrumx_server[454057]:   File "/usr/lib/python3.8/asyncio/runners.py", line 44, in run
Mar 07 17:36:32 hostname electrumx_server[454057]:     return loop.run_until_complete(main)
Mar 07 17:36:32 hostname electrumx_server[454057]:   File "/usr/lib/python3.8/asyncio/base_events.py", line 616, in run_until_complete
Mar 07 17:36:32 hostname electrumx_server[454057]:     return future.result()
Mar 07 17:36:32 hostname electrumx_server[454057]:   File "/usr/local/lib/python3.8/dist-packages/e_x-1.16.0-py3.8.egg/electrumx/lib/server_base.py", line 129, in run
Mar 07 17:36:32 hostname electrumx_server[454057]:     await server_task
Mar 07 17:36:32 hostname electrumx_server[454057]:   File "/usr/local/lib/python3.8/dist-packages/e_x-1.16.0-py3.8.egg/electrumx/lib/server_base.py", line 102, in serve
Mar 07 17:36:32 hostname electrumx_server[454057]:     await self.serve(shutdown_event)
Mar 07 17:36:32 hostname electrumx_server[454057]:   File "/usr/local/lib/python3.8/dist-packages/e_x-1.16.0-py3.8.egg/electrumx/server/controller.py", line 134, in serve
Mar 07 17:36:32 hostname electrumx_server[454057]:     await group.spawn(wait_for_catchup())
Mar 07 17:36:32 hostname electrumx_server[454057]:   File "/usr/local/lib/python3.8/dist-packages/aiorpcx/curio.py", line 297, in __aexit__
Mar 07 17:36:32 hostname electrumx_server[454057]:     await self.join()
Mar 07 17:36:32 hostname electrumx_server[454057]:   File "/usr/local/lib/python3.8/dist-packages/e_x-1.16.0-py3.8.egg/electrumx/lib/util.py", line 370, in join
Mar 07 17:36:32 hostname electrumx_server[454057]:     task.result()
Mar 07 17:36:32 hostname electrumx_server[454057]:   File "/usr/local/lib/python3.8/dist-packages/e_x-1.16.0-py3.8.egg/electrumx/server/mempool.py", line 366, in keep_synchronized
Mar 07 17:36:32 hostname electrumx_server[454057]:     await group.spawn(self._logging(synchronized_event))
Mar 07 17:36:32 hostname electrumx_server[454057]:   File "/usr/local/lib/python3.8/dist-packages/aiorpcx/curio.py", line 297, in __aexit__
Mar 07 17:36:32 hostname electrumx_server[454057]:     await self.join()
Mar 07 17:36:32 hostname electrumx_server[454057]:   File "/usr/local/lib/python3.8/dist-packages/e_x-1.16.0-py3.8.egg/electrumx/lib/util.py", line 370, in join
Mar 07 17:36:32 hostname electrumx_server[454057]:     task.result()
Mar 07 17:36:32 hostname electrumx_server[454057]:   File "/usr/local/lib/python3.8/dist-packages/e_x-1.16.0-py3.8.egg/electrumx/server/mempool.py", line 263, in _refresh_hashes
Mar 07 17:36:32 hostname electrumx_server[454057]:     await self.api.on_mempool(touched, height)
Mar 07 17:36:32 hostname electrumx_server[454057]:   File "/usr/local/lib/python3.8/dist-packages/e_x-1.16.0-py3.8.egg/electrumx/server/controller.py", line 67, in on_mempool
Mar 07 17:36:32 hostname electrumx_server[454057]:     await self._maybe_notify()
Mar 07 17:36:32 hostname electrumx_server[454057]:   File "/usr/local/lib/python3.8/dist-packages/e_x-1.16.0-py3.8.egg/electrumx/server/controller.py", line 55, in _maybe_notify
Mar 07 17:36:32 hostname electrumx_server[454057]:     await self.notify(height, touched)
Mar 07 17:36:32 hostname electrumx_server[454057]:   File "/usr/local/lib/python3.8/dist-packages/e_x-1.16.0-py3.8.egg/electrumx/server/session.py", line 789, in _notify_sessions
Mar 07 17:36:32 hostname electrumx_server[454057]:     await self._task_group.spawn(session.notify, touched, height_changed)
Mar 07 17:36:32 hostname electrumx_server[454057]:   File "/usr/local/lib/python3.8/dist-packages/aiorpcx/curio.py", line 204, in spawn
Mar 07 17:36:32 hostname electrumx_server[454057]:     self._add_task(task)
Mar 07 17:36:32 hostname electrumx_server[454057]:   File "/usr/local/lib/python3.8/dist-packages/aiorpcx/curio.py", line 150, in _add_task
Mar 07 17:36:32 hostname electrumx_server[454057]:     raise RuntimeError('task group terminated')
Mar 07 17:36:32 hostname electrumx_server[454057]: RuntimeError: task group terminated
Mar 07 17:45:41 hostname systemd[1]: electrumx.service: Succeeded.
SomberNight commented 2 years ago

The traceback is about an error during shutdown. Note however that it's not clear from your log why the server started to shutdown in the first place.

Regardless, re the traceback, the cause is that while the SessionManager is stopping, Notifications are still coming in from e.g. the mempool as mempool.keep_synchronized is not yet cancelled at that point. https://github.com/spesmilo/electrumx/blob/8198ab99603a7e74083b3569c15dad13850c721e/electrumx/server/session.py#L636-L648 https://github.com/spesmilo/electrumx/blob/8198ab99603a7e74083b3569c15dad13850c721e/electrumx/server/controller.py#L126-L134