Open gahtan-syarif opened 2 months ago
this sounds pretty reasonable to do
nvm i think we basically need to check the global stop in readProcess.
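A minimal sketch of that idea, in Python rather than fastchess's C++: the read loop polls a stop flag between short waits instead of blocking until the engine answers. The names `read_until_bestmove`, `lines`, and `stop_requested` are hypothetical and do not come from the fastchess code base.

```python
import queue
import threading
import time


def read_until_bestmove(lines, stop_requested, timeout_s, poll_s=0.05):
    """Wait for a 'bestmove' line, giving up early once a stop is requested.

    `lines` is a queue.Queue assumed to be fed by a reader thread attached to
    the engine's stdout; `stop_requested` is a threading.Event standing in for
    the global stop flag. Returns the bestmove line, or None on stop/timeout.
    """
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        # Check the stop flag on every iteration, not only after a full read.
        if stop_requested.is_set():
            return None
        try:
            line = lines.get(timeout=poll_s)
        except queue.Empty:
            continue  # nothing yet: loop again and re-check the stop flag
        if line.startswith("bestmove"):
            return line
    return None
```

With a short poll interval the loop reacts to the stop flag within a fraction of a second, even when the move time is something like st=60.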
I think I have a related observation. If one engine crashes / aborts, the other processes appear to remain active (even though fast-chess quits), and I need to kill them by hand.
What I was able to reproduce is that when an engine crashes, the error propagated back to fastchess and wasn't handled gracefully, so fastchess also threw and didn't clean up properly. This should now be fixed by https://github.com/Disservin/fast-chess/commit/9ac8a968a231efcbbb37238d35eb7b9a5109a052. The current implementation will try to continue playing with the remaining available engines until all games are played or all engines have crashed. In either case, all processes should be stopped.
Actually, thinking more about https://github.com/Disservin/fast-chess/commit/9ac8a968a231efcbbb37238d35eb7b9a5109a052, I'm not sure that this is the best solution. Stopping the match might be better, probably even exiting with a non-zero exit code, unless -recover is specified, in which case the engine should be restarted, I think. The current solution might too easily lead to an engine crash going unnoticed.
Well, the patch first fixes something which shouldn't happen; the expected behavior is something else.
I think I'll first reimplement the -recover option and then make the tournament exit when it is not specified. One thing I'm unsure about with -recover: does it restart the engine at the point in the match where it crashed and continue with that position, or is the matchup for the specific opening replayed? Also, what happens when the engine keeps crashing?
it's this, i think: https://github.com/cutechess/cutechess/blob/1071d84cf272bd7deca0964336bf02e367e2b22b/projects/lib/src/tournament.h#L181
so basically we already have -recover on by default, since if an engine disconnects it would just resume the tournament (and keep disconnecting). in cutechess, if -recover is not specified then the whole tournament stops the moment a crash happens: https://github.com/cutechess/cutechess/blob/1071d84cf272bd7deca0964336bf02e367e2b22b/projects/lib/src/tournament.cpp#L781
so basically we already have -recover on by default since if an engine disconnects it would just resume the tournament
Well, currently we are losing engines after a crash, so if you have n crashes and a concurrency of n, the tournament won't continue because there are no more engines.
So with recover the match is just annotated as lost and it continues with the next game? OK, can implement that.
yeah, exactly, and have it so that with no -recover the tournament finishes, saves, and cleanly exits after the game with the disconnect
So with recover the match is just annotated as lost and it continues with the next game OK, can implement that.
The game is annotated as a loss, and a new engine instance starts the next game.
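The behavior discussed so far can be sketched roughly as follows, in Python for brevity. This is an illustration under assumptions, not the fastchess implementation: `TournamentState` and `handle_engine_crash` are hypothetical names, and the non-zero exit code without -recover is only a proposal from this thread.

```python
from dataclasses import dataclass, field


@dataclass
class TournamentState:
    """Hypothetical bookkeeping for a running tournament."""
    results: list = field(default_factory=list)
    stopped: bool = False
    exit_code: int = 0


def handle_engine_crash(state, crashed, opponent, recover):
    """Apply the crash policy and return the next action as a string."""
    # In both modes the crashed engine forfeits the current game.
    state.results.append(
        {"winner": opponent, "loser": crashed, "reason": "disconnect"}
    )
    if recover:
        # With -recover: a new engine instance starts the next game.
        return "restart_engine"
    # Without -recover: finish, save, and exit cleanly after this game.
    state.stopped = True
    state.exit_code = 1  # assumption: signal the crash to callers such as CI
    return "stop_tournament"
```

For example, `handle_engine_crash(state, "engineA", "engineB", recover=False)` records the loss for engineA, marks the tournament as stopped, and leaves a non-zero `exit_code` for the caller to return.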
what's the cutechess exit code after an engine crash and no -recover? if this is a non-zero exit code I could use that... (full disclosure, I'd like to add a CI test to SF where we play a couple of games with an SF compiled with debug=yes so that we can catch things that trigger an assert in game play, but not in our testing).
https://github.com/Disservin/fast-chess/pull/463
this should cover the recover option and its associated behavior. if there is a particular need for a non-zero exit code i could add that as well
you can notice this when you set the movetime to a large amount, for example st=60: the program waits the entire 60 seconds until the engine outputs a bestmove, and only then does it save the results. it should instead send a stop command to the engine to force it to play a bestmove right away, so the tournament can end.
you can see an example of it in the log file output here:
after i press ctrl+c and it outputs finished tournament, it lets the engine still run and only when the engine outputs a bestmove does it save results and quit.
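A rough sketch of the requested behavior, in Python: on ctrl+c a flag is set, and the main loop then writes the UCI `stop` command to each searching engine so that it replies with a bestmove right away. The helper names and the list of stdin pipes are hypothetical; only the `stop` command itself comes from the UCI protocol.

```python
import signal
import threading

interrupted = threading.Event()  # hypothetical flag set when ctrl+c is pressed


def install_sigint_handler():
    """Record ctrl+c instead of letting it kill the process immediately."""
    signal.signal(signal.SIGINT, lambda signum, frame: interrupted.set())


def stop_searching_engines(engine_stdins):
    """Send UCI `stop` to every engine so each one emits its bestmove now."""
    for stdin in engine_stdins:
        stdin.write("stop\n")
        stdin.flush()
```

The main loop would call `stop_searching_engines` once it sees `interrupted` set, then read the resulting bestmove lines, save the results, and quit without waiting for the full move time.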