Closed otizonaizit closed 2 months ago
OK, at least now I can confirm that the exception freezes the server. For proper debugging we will of course need an easier reproducer.
OK, I can reproduce it. The problem occurs because we hang in `def handle_known_client` when sending to the pair socket of the broken player.
There needs to be a timeout on the send in that function (in fact, all recv and send calls need timeouts). Additionally, it might make sense to rewrite the whole server async-style, so that a send that is slow for whatever reason does not block the server.
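To illustrate the timeout idea, here is a minimal sketch using only Python's standard `socket` module. It is not the server's actual code: the helper name `send_with_timeout` and the use of `socket.socketpair()` to stand in for the broken player are assumptions made for the example, and the real pair socket may be a different kind of socket with its own timeout options.

```python
import socket


def send_with_timeout(sock, payload, timeout):
    """Hypothetical helper: try to send payload, give up after `timeout` seconds.

    Returns True if everything was sent, False if the peer did not
    accept the data in time.
    """
    sock.settimeout(timeout)  # applies to later send and recv calls on this socket
    try:
        sock.sendall(payload)
        return True
    except socket.timeout:
        # The peer is not reading. Part of the payload may already have been
        # sent, so the caller should treat this connection as broken.
        return False


if __name__ == "__main__":
    # The second end of the pair plays a player that never reads.
    server_side, broken_player = socket.socketpair()
    big_message = b"x" * (64 * 1024 * 1024)  # large enough to fill the buffers
    if not send_with_timeout(server_side, big_message, timeout=0.5):
        print("player is unresponsive, dropping it")
    server_side.close()
    broken_player.close()
```

Without the `settimeout` call, the `sendall` above would block indefinitely once the buffers fill up, which matches the kind of hang described here. After a timeout the connection should be closed rather than reused, because an unknown part of the message may already be on the wire.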
I am reporting this issue here for future reference, because at the moment I cannot provide a simple reproducer.
While running in server mode:
server$ pelita-server remote-server --address XXXXX --port YYYY --config pelita_server_conf.yaml
if one of the configured bots throws an Exception during a game, the server seems to keep running (systemd reports the service as running), but it stops responding to new remote requests:
The Exception found in the log was the following one:
I am assuming the Exception and the freeze are connected, but a reproducer would be needed to really make sure that the two things are indeed causally connected :-)