Closed OhJayGee closed 6 years ago
I've seen similar messages, but only the case where the opponent hastily aborts the game before the first move.
This happens because:
If anyone has a good idea of how to fix this, make a PR or I will get to it once I have some time.
I am rather seeing this in the middle of a game, not at the beginning... I'll try to get some examples with actual timestamps. Is there a way I can turn more verbose logging?
@OhJayGee you can pass the -v flag: python lichess-bot.py -v
for verbose logging
This can happen if game is finishing by any reason while BOT is thinking and then wants to li.make_move()
.
@careless25 maybe I'm wrong but I think 2.b. (Lichess send game ended notification) is a non existing thing. If this statement is correct I can imagine a workaround like this:
If HTTPError raised by li.make_move()
we can look at BOTs ongoing games (using "/api/account/playing" endpoint) and if our current game is not in the ongoing games list then we know it's over.
Otherwise re raise the exception.
What do you think, is this a usable idea or there is something better?
It seems the workaround is necessary in any event, since Lichess is online and messages take time to propagate.
newer observation: in the last week I did not observe any timeout flags caused by this error. Did anything change on the backend?
Thank you @gbtami , for linking bug #128. Here it is a log and a picture that express exactly that:
Here a complete log, since BOOT until error. Very small, because it is in the first move.
# [DEBUG file:xadreco.c line:596]: Starting optarg loop...
# DEBUG MACRO compiled value: 1
# Debug verbose level set at: 1
# play random: no. seed: -r 0
# connection: -c lichess
# wait: -x wait_xboard
# book: -b livro.txt
# Xadreco version 5.84 build 180626.010155 (C) 1994-2018, by Dr. Beco
# Xadreco comes with ABSOLUTELY NO WARRANTY;
# This is free software, and you are welcome to redistribute it
# under certain conditions; Please, visit http://www.fsf.org/licenses/gpl.html
# for details.
# scanf: xboard
#
# feature done=0
# feature myname="Xadreco 5.84"
# feature ping=0 setboard=1 playother=1 san=0 usermove=0 time=1 draw=1 sigint=0 sigterm=1 reuse=0 analyze=1 variants="normal" colors=0 ics=0 name=0 pause=0 nps=0 debug=1 memory=0 smp=0 exclude=0 setscore=0
# feature done=1
# scanf: protover
# xboard: ignoring protover
# scanf: 2
# xboard: ignoring 2
# scanf: post
# xboard: post. Xadreco will show what its thinking. (1)
# scanf: easy
# xboard: easy. Xadreco stop thinking. (1)
# scanf: ping
# scanf: 123
# pong 123
# scanf: force
# scanf: setboard
# xboard: setboard. Xadreco will set a board position.
# scanf: rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR
# xboard: rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR
# scanf: w
# xboard: w
# scanf: KQkq
# xboard: KQkq
# scanf: -
# xboard: -
# scanf: 0
# xboard: 0
# scanf: 1
# xboard: 1
# scanf: force
# xboard: force. Xadreco is in force mode.
# scanf: setboard
# xboard: setboard. Xadreco will set a board position.
# scanf: rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR
# xboard: rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR
# scanf: w
# xboard: w
# scanf: KQkq
# xboard: KQkq
# scanf: -
# xboard: -
# scanf: 0
# xboard: 0
# scanf: 1
# xboard: 1
# scanf: level
# scanf: 0
# scanf: 0:10
# scanf: 0
# xadreco level: 10.0s+0.0s por 60 lances: ajustado para st 0.166667 s por lance
# scanf: go
# xboard: go. Xadreco is now white.
# xadreco : Error. I don't know what to play... Playing a random move (compjoga)!
# move f2f4
ERROR:backoff:Giving up api_post(...) after 1 tries (requests.exceptions.HTTPError: 400 Client Error: Bad Request for url: https://lichess.org/api/bot/game/QzqnWDGC/move/f2f4)
ERROR:backoff:Giving up play_game(...) after 12 tries (requests.exceptions.HTTPError: 400 Client Error: Bad Request for url: https://lichess.org/api/bot/game/QzqnWDGC/move/f2f4)
Traceback (most recent call last):
File "/home/beco/Documents/fontes/python/lichess-bot/logging_pool.py", line 16, in __call__
result = self.__callable(*args, **kwargs)
File "/home/beco/Documents/fontes/python/lichess-bot/.venv/lib/python3.5/site-packages/backoff/_sync.py", line 99, in retry
ret = target(*args, **kwargs)
File "lichess-bot.py", line 126, in play_game
play_first_move(game, engine, board, li)
File "lichess-bot.py", line 173, in play_first_move
li.make_move(game.id, best_move)
File "/home/beco/Documents/fontes/python/lichess-bot/lichess.py", line 71, in make_move
return self.api_post(ENDPOINTS["move"].format(game_id, move))
File "/home/beco/Documents/fontes/python/lichess-bot/.venv/lib/python3.5/site-packages/backoff/_sync.py", line 99, in retry
ret = target(*args, **kwargs)
File "/home/beco/Documents/fontes/python/lichess-bot/lichess.py", line 61, in api_post
response.raise_for_status()
File "/home/beco/Documents/fontes/python/lichess-bot/.venv/lib/python3.5/site-packages/requests/models.py", line 935, in raise_for_status
raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 400 Client Error: Bad Request for url: https://lichess.org/api/bot/game/QzqnWDGC/move/f2f4
You can see, it is the start position. Xadreco sent f2f4
as first move by the time opponent had already aborted the game:
lichess-bot tried to forward this move to the site and got a 400 error
400 Client Error: Bad Request for url: https://lichess.org/api/bot/game/QzqnWDGC/move/f2f4
It seems that the bot needs to check if the game is still ongoing before trying to forward the move to the engine. The engine has no way to know that it don't need to move.
To the engine, just send a quit
command and it will rest again.
Arguably HTTP 409 Conflict
makes more sense than HTTP 400 Bad Request
due to the opponent "editing" the game by aborting it.
Either way, 12 retries in the face of an HTTP 4xx
response seems weird since those indicate Client Errors.
Hmm... maybe something like the following would help: https://github.com/litl/backoff/blob/a2a0132eed09d8e3e8329c09014e6689fd8ef4b1/README.rst#L95
Here another twist:
1. Engine has only 1 second before losing on time
2. Engine spends 3 seconds thinking
3. Web page gives victory and a rematch button is enabled, clicked, and no response
4. Log shows engine finally decided to play, bot gave 400 error and hangs
5. Bot asks engine to quit after error (probably it was on buffer already)
6. Engine quits and say good bye
7. Bot tries something else I don't get, and get a 404 (accept, probably when I clicked rematch)
Here the complete log:
# xadreco : N.80. Pretas. Tempo 3.000000s. Acumulado: 3.000000s. Hora: 2018-06-29 13:33:03
# xadreco : offer draw (1) value: -3.15
# move c1e3
# offer draw
# xadreco : N.81. Brancas. Tempo 178.000000s. Acumulado: 178.000000s. Hora: 2018-06-29 13:36:01
DEBUG:chess.engine:<PopenProcess at 0x75624770 (pid=28212)> >> move c1e3
DEBUG:chess.engine:<PopenProcess at 0x75624770 (pid=28212)> >> offer draw
DEBUG:urllib3.connectionpool:Resetting dropped connection: lichess.org
DEBUG:urllib3.connectionpool:https://lichess.org:443 "POST /api/bot/game/eNJKQTr6/move/c1e3 HTTP/1.1" 400 None
ERROR:backoff:Giving up api_post(...) after 1 tries (requests.exceptions.HTTPError: 400 Client Error: Bad Request for url: https://lichess.org/api/bot/game/eNJKQTr6/move/c1e3)
Traceback (most recent call last):
File "lichess-bot.py", line 151, in play_game
li.make_move(game.id, best_move)
File "/home/beco/Documents/fontes/python/lichess-bot/lichess.py", line 71, in make_move
return self.api_post(ENDPOINTS["move"].format(game_id, move))
File "/home/beco/Documents/fontes/python/lichess-bot/.venv/lib/python3.5/site-packages/backoff/_sync.py", line 99, in retry
ret = target(*args, **kwargs)
File "/home/beco/Documents/fontes/python/lichess-bot/lichess.py", line 61, in api_post
response.raise_for_status()
File "/home/beco/Documents/fontes/python/lichess-bot/.venv/lib/python3.5/site-packages/requests/models.py", line 935, in raise_for_status
raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 400 Client Error: Bad Request for url: https://lichess.org/api/bot/game/eNJKQTr6/move/c1e3
DEBUG:chess.engine:<PopenProcess at 0x75624770 (pid=28212)> << quit
# scanf: quit
# xadreco : # Thanks for playing Xadreco.
# xadreco : sai ( 0 )
DEBUG:urllib3.connectionpool:Resetting dropped connection: lichess.org
DEBUG:urllib3.connectionpool:https://lichess.org:443 "POST /api/challenge/eNJKQTr6/accept HTTP/1.1" 404 None
ERROR:backoff:Giving up api_post(...) after 1 tries (requests.exceptions.HTTPError: 404 Client Error: Not Found for url: https://lichess.org/api/challenge/eNJKQTr6/accept)
DEBUG:urllib3.connectionpool:https://lichess.org:443 "POST /api/challenge/TgesCs9v/accept HTTP/1.1" 404 None
ERROR:backoff:Giving up api_post(...) after 1 tries (requests.exceptions.HTTPError: 404 Client Error: Not Found for url: https://lichess.org/api/challenge/TgesCs9v/accept)
So, it is kind of the same error, but instead of "abort" it was a game that ended due to normal reasons (timeout in this case).
Solution: Bot should see the game is ended, discard any output from engine and ask engine to quit.
@drbeco I agree, but isn't that more related to #131 ?
UCI specifies how much time the engine has remaining on the clock, and Stockfish already uses that clock time (via python-chess, via this library) to protect against a time loss.
Hi @ddugovic , I think it is not related to protocol.
Both Xboard and UCI specifiees how much time the engine has left on the clock. Xboard actually have many ways to deliver time to the engine.
To tell that the engine author should use better the time control and never loses on time is to displace responsibilities from what each program have, mixing things.
Of course I understand engine authors that would not even "imagine" an engine that may lose on time, or that will resign without a fight, or whatever. Most of engine authors simply tries to deliver the most "powerful" move, whatever that means.
Xadreco is much more like a human, it does lose on time, it resigns, it offer draws, if sacrifices without sound continuation; man, it even play random moves every now and them (literally "random").
So, if a engine loses on time, the communication bot should command it (whatever protocol) to stop.
force
(if reuse
is set) or a quit
(if xreuse
or feature reuse=0
) is set.stop
or quit
Also, if the engine reply with a move before it could process the command to stop, lichess-bot should just ignore it, instead of trying to post it and get an 404 error in a finished game.
@drbeco :
HTTP 4xx
-class "client error" response).Got a new incident:
https://lichess.org/ovJEbL91/white#130
ERROR:backoff:Backing off api_post(...) for 0.6s (requests.exceptions.HTTPError: 504 Server Error: Gateway Time-out for url: https://lichess.org/api/bot/game/ovJEbL91/move/d5b4) ERROR:backoff:Giving up api_post(...) after 2 tries (requests.exceptions.HTTPError: 400 Client Error: Bad Request for url: https://lichess.org/api/bot/game/ovJEbL91/move/d5b4)
The question here is: did the bot lose on time and didn't know (like you guys talked about earlier on this thread?) - or did the bot lose on time because of 504 Server Error - wherever that came from?
In a 3'+1 game the bot normally should never lose on time as it can move within 20msec, plus 37ms ping time to the lichess server, and I have configured a 500ms overhead time, so 1 second is plenty. Yet 0.6s penalty for server error could have triggered loss on time.
I too (running in OVH!) have observed a single HTTP 504 response:
$ sudo journalctl -u lichess-bot.service | grep -A 2 -B 10 504
Jul 04 15:36:32 vps178028 python3[27801]: +++ Process Free. Total Queued: 0. Total Used: 0
Jul 04 15:36:46 vps178028 python3[27801]: Accept Bullet casual challenge from Purusa(2242)
Jul 04 15:36:46 vps178028 python3[27801]: --- Process Queue. Total Queued: 1. Total Used: 0
Jul 04 15:36:46 vps178028 python3[27801]: --- Process Used. Total Queued: 0. Total Used: 1
Jul 04 15:36:47 vps178028 python3[27801]: +++ https://lichess.org/LYwbnM4e/white Bullet vs Purusa(2242)
Jul 04 15:37:54 vps178028 python3[27801]: --- https://lichess.org/LYwbnM4e/white Game over
Jul 04 15:37:54 vps178028 python3[27801]: +++ Process Free. Total Queued: 0. Total Used: 0
Jul 04 16:15:00 vps178028 python3[27801]: Something went wrong. Game is starting and we don't have a queued process
Jul 04 16:15:00 vps178028 python3[27801]: --- Process Used. Total Queued: 0. Total Used: 1
Jul 04 16:15:00 vps178028 python3[27801]: +++ https://lichess.org/tfiUOVkq/white UltraBullet vs AI level 8
Jul 04 16:16:35 vps178028 python3[27801]: ERROR:backoff:Backing off api_post(...) for 0.6s (requests.exceptions.HTTPError: 504 Server Error: Gateway Time-out for url: https://lichess.org/api/bot/game/tfiUOVkq/move/g1g5)
Jul 04 16:16:37 vps178028 python3[27801]: ERROR:backoff:Giving up api_post(...) after 2 tries (requests.exceptions.HTTPError: 400 Client Error: Bad Request for url: https://lichess.org/api/bot/game/tfiUOVkq/move/g1g5)
Jul 04 16:16:37 vps178028 python3[27801]: Abandoning game due to connection error
This is very likely a problem of Lichess itself, not of the bot. One of the database servers is slower than usual, leading to timeouts every now and then. We're working on it ...
Edit: The issue I am talking about started on the 23rd of June. Timeouts should be much less frequent before that.
I am seeing a number of errors from lichess-bot following the same pattern (see below). All of them lead to a abort/game over or loss on time.
Does anybody else see that? is that actual connectivity, because the URLs seem correct, and the machines' internet connectivity was fine as well...
best regards, OJG
+++ https://lichess.org/bG43Aj81/black Bullet vs BOT F_Me_Im_Famous(1500?) ERROR:backoff:Backing off api_post(...) for 0.9s (requests.exceptions.HTTPError: 504 Server Error: Gateway Time-out for url: https://lichess.org/api/bot/game/bG43Aj81/move/d1e2) ERROR:backoff:Giving up api_post(...) after 2 tries (requests.exceptions.HTTPError: 400 Client Error: Bad Request for url: https://lichess.org/api/bot/game/bG43Aj81/move/d1e2) Abandoning game due to connection error Traceback (most recent call last): File "lichess-bot.py", line 150, in play_game li.make_move(game.id, best_move) File "/ChessEngines/lichess-bot/lichess.py", line 71, in make_move return self.api_post(ENDPOINTS["move"].format(game_id, move)) File "/ChessEngines/lichess-bot/.venv/lib/python3.6/site-packages/backoff/_sync.py", line 99, in retry ret = target(*args, **kwargs) File "/ChessEngines/lichess-bot/lichess.py", line 61, in api_post response.raise_for_status() File "/ChessEngines/lichess-bot/.venv/lib/python3.6/site-packages/requests/models.py", line 935, in raise_for_status raise HTTPError(http_error_msg, response=self) requests.exceptions.HTTPError: 400 Client Error: Bad Request for url: https://lichess.org/api/bot/game/bG43Aj81/move/d1e2 --- https://lichess.org/bG43Aj81/black Game over