lichess-bot-devs / lichess-bot

A bridge between Lichess bots and chess engines
GNU Affero General Public License v3.0
760 stars 443 forks source link

Bot goes offline regularly after running for a while #151

Closed jlherren closed 5 years ago

jlherren commented 5 years ago

My bot fails with an error after having been running for a while (usually days). I see this error:

2018-11-30 05:33:52,710: Backing off watch_control_stream(...) for 1.0s (requests.exceptions.ChunkedEncodingError: ('Connection broken: IncompleteRead(0 bytes read)', IncompleteRead(0 bytes read)))
2018-11-30 05:34:23,104: Giving up watch_control_stream(...) after 2 tries (json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0))
Process Process-2:
Traceback (most recent call last):
  File "/usr/lib/python3.5/multiprocessing/process.py", line 249, in _bootstrap
    self.run()
  File "/usr/lib/python3.5/multiprocessing/process.py", line 93, in run
    self._target(*self._args, **self._kwargs)
  File "<path-to-my-venv>/lib/python3.5/site-packages/backoff/_sync.py", line 99, in retry
    ret = target(*args, **kwargs)
  File "lichess-bot.py", line 44, in watch_control_stream
    event = json.loads(evnt.decode('utf-8'))
  File "/usr/lib/python3.5/json/__init__.py", line 319, in loads
    return _default_decoder.decode(s)
  File "/usr/lib/python3.5/json/decoder.py", line 339, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/lib/python3.5/json/decoder.py", line 357, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

This is not related to any games, as it happens hours after it played the last game. After this error the bot doesn't show as being online anymore, but lichess-bot doesn't exit either. Hitting Ctrl+C doesn't make it exit, I have to kill it. Any help appreciated.

ddugovic commented 5 years ago

On my bot, I observed right before that: Nov 29 23:33:49 vps178028 python3[4905]: *** https://lichess.org/uUT2lJLo/black [spectator] lichess: b'Lichess has been updated! Sorry for the inconvenience.'

gbtami commented 5 years ago

Maybe https://github.com/careless25/lichess-bot/pull/152 can solve this and other previous ChunkedEncodingError related issues.

jlherren commented 5 years ago

I just deployed #152 and will observe if that helps, thanks!

ddugovic commented 5 years ago

Now that I think more about it, this sort of "Lichess is down" error should probably kill the entire application, although I'm unsure how to do that... https://stackoverflow.com/questions/32053618/how-to-to-terminate-process-using-pythons-multiprocessing

jlherren commented 5 years ago

FWIW, #152 did not help. Instead of the previous error I just get that new one, but the bot still goes offline.

Is terminating the app really a good way to deal with these connection issues? It seems that the lichess servers have such problems on a regular basis and they always seem to be temporary (they also happen once in a while while browsing lichess). It's a pity to have an idle bot terminate because of that. Or to have a running game time out because of those 504 or 400 errors while sending moves to the servers. Is it not possible to just retry requests after a few seconds if they fail?

gbtami commented 5 years ago

Thx for the feedback. I closed #152. Can you paste the new error here you got with it, please? Maybe it can provide some other info abut this issue...

jlherren commented 5 years ago

The new error is the very error message added by #152, e.g.: 2018-12-03 05:45:05,664: Restarting to whatch control stream due to incomplete read error.

After that the bot was offline and the application unresponsive to Ctrl+C as usual.

ddugovic commented 5 years ago

My concern with retrying requests is that such could be perceived as a denial-of-service attack.

I run lichess-bot as a service, which restarts (in a throttled manner) after termination:

[Unit]
Description=Job that runs the lichess-bot daemon
After=network-online.target
Wants=network-online.target

[Service]
User=lichess
Group=lichess
WorkingDirectory=/home/lichess/lichess-bot
ExecStart=/usr/bin/python3 -u /home/lichess/lichess-bot/lichess-bot.py
Restart=always

[Install]
WantedBy=multi-user.target
gbtami commented 5 years ago

Interesting. #152 worked for me today:

2018-12-03T13:31:10.415513+00:00 app[worker.1]: 2018-12-03 13:31:10,415: https://lichess.org:443 "POST /api/bot/game/qtyDiB59/move/e6c5 HTTP/1.1" 200 None

2018-12-03T13:31:10.600309+00:00 app[worker.1]: 2018-12-03 13:31:10,599: --- https://lichess.org/qtyDiB59/white Game over

2018-12-03T13:31:10.606323+00:00 app[worker.1]: 2018-12-03 13:31:10,606: <PopenProcess at 0x7f8f808fcfd0 (pid=420)> << quit

2018-12-03T14:07:55.007538+00:00 app[worker.1]: 2018-12-03 14:07:55,007: Restarting to whatch control stream due to incomplete read error.

2018-12-03T14:07:55.018348+00:00 app[worker.1]: 2018-12-03 14:07:55,010: Starting new HTTPS connection (1): lichess.org:443

2018-12-03T14:08:58.249006+00:00 app[worker.1]: 2018-12-03 14:08:58,248: Backing off watch_control_stream(...) for 0.8s (requests.exceptions.ConnectionError: ('Connection aborted.', OSError(107, 'Transport endpoint is not connected')))

2018-12-03T14:08:59.003505+00:00 app[worker.1]: 2018-12-03 14:08:59,003: Starting new HTTPS connection (1): lichess.org:443

2018-12-03T14:10:04.811847+00:00 app[worker.1]: 2018-12-03 14:10:04,811: https://lichess.org:443 "GET /api/stream/event HTTP/1.1" 200 None

2018-12-03T14:15:03.185993+00:00 app[worker.1]: 2018-12-03 14:15:03,185: Resetting dropped connection: lichess.org

2018-12-03T14:15:03.686410+00:00 app[worker.1]: 2018-12-03 14:15:03,686: https://lichess.org:443 "POST /api/challenge/fwwZaJ0x/accept HTTP/1.1" 200 None

2018-12-03T14:15:03.693225+00:00 app[worker.1]: 2018-12-03 14:15:03,692:     Accept Bullet casual challenge from gbtami(1928)

2018-12-03T14:15:03.693923+00:00 app[worker.1]: 2018-12-03 14:15:03,693: --- Process Queue. Total Queued: 1. Total Used: 0

2018-12-03T14:15:03.698206+00:00 app[worker.1]: 2018-12-03 14:15:03,697: --- Process Used. Total Queued: 0. Total Used: 1

2018-12-03T14:15:03.714285+00:00 app[worker.1]: 2018-12-03 14:15:03,714: Starting new HTTPS connection (1): lichess.org:443

2018-12-03T14:15:04.087554+00:00 app[worker.1]: 2018-12-03 14:15:04,087: https://lichess.org:443 "GET /api/bot/game/stream/fwwZaJ0x HTTP/1.1" 200 None

2018-12-03T14:15:04.111829+00:00 app[worker.1]: 2018-12-03 14:15:04,111: <PopenProcess at 0x7f8f80901c18 (pid=450)> << xboard

2018-12-03T14:15:04.112092+00:00 app[worker.1]: 2018-12-03 14:15:04,111: <PopenProcess at 0x7f8f80901c18 (pid=450)> << protover 2

2018-12-03T14:15:04.117263+00:00 app[worker.1]: 2018-12-03 14:15:04,116: <PopenProcess at 0x7f8f80901c18 (pid=450)> << post

2018-12-03T14:15:04.118977+00:00 app[worker.1]: 2018-12-03 14:15:04,118: <PopenProcess at 0x7f8f80901c18 (pid=450)> << easy

2018-12-03T14:15:04.126386+00:00 app[worker.1]: 2018-12-03 14:15:04,126: <PopenProcess at 0x7f8f80901c18 (pid=450)> << ping 123

2018-12-03T14:15:05.300438+00:00 app[worker.1]: 2018-12-03 14:15:05,300: <PopenProcess at 0x7f8f80901c18 (pid=450)> >> # PyChess 0.99.2 [python3 3.6.4]

2018-12-03T14:15:05.300598+00:00 app[worker.1]: 2018-12-03 14:15:05,300: <PopenProcess at 0x7f8f80901c18 (pid=450)> >> feature ping=1 setboard=1 playother=1 san=1 usermove=1 time=1 draw=1 sigint=0 sigterm=0 reuse=1 analyze=1 myname="PyChess 0.99.2" variants="normal,wildcastle,nocastle,fischerandom,crazyhouse,losers,suicide,giveaway,horde,atomic,racingkings,kingofthehill,3check,placement,asean,cambodian,makruk,sittuyin" colors=0 ics=0 name=0 pause=0 nps=0 debug=1 memory=0 smp=0 egt="gaviota" option="skipPruneChance -slider 0 0 100"

2018-12-03T14:15:05.347662+00:00 app[worker.1]: 2018-12-03 14:15:05,347: <PopenProcess at 0x7f8f80901c18 (pid=450)> >> feature done=1

2018-12-03T14:15:05.349138+00:00 app[worker.1]: 2018-12-03 14:15:05,347: <PopenProcess at 0x7f8f80901c18 (pid=450)> >> pong 123

2018-12-03T14:15:05.349758+00:00 app[worker.1]: 2018-12-03 14:15:05,349: <PopenProcess at 0x7f8f80901c18 (pid=450)> << egtpath gaviota ../Gaviota/Gaviota

2018-12-03T14:15:05.350188+00:00 app[worker.1]: 2018-12-03 14:15:05,350: <PopenProcess at 0x7f8f80901c18 (pid=450)> << force

2018-12-03T14:15:05.351156+00:00 app[worker.1]: 2018-12-03 14:15:05,351: <PopenProcess at 0x7f8f80901c18 (pid=450)> << setboard rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1

2018-12-03T14:15:05.351718+00:00 app[worker.1]: 2018-12-03 14:15:05,351: +++ https://lichess.org/fwwZaJ0x/white Bullet vs gbtami(1928)

2018-12-03T14:15:05.356334+00:00 app[worker.1]: 2018-12-03 14:15:05,356: Got move d2d4 from book engines/Perfect2017.bin
gbtami commented 5 years ago

Seems there are different errors causing ChunkedEncodingError. Some are recoverable, others not.

ddugovic commented 5 years ago

Likewise #153 somehow did not terminate the program (if it had, the OS service manager would have attempted to restart it without contending with other system resources):

Dec 03 07:17:11 vps178028 systemd[1]: Started Job that runs the lichess-bot daemon.
Dec 03 07:17:12 vps178028 python3[14060]: 2018-12-03 07:17:12,877: .   _/|
Dec 03 07:17:12 vps178028 python3[14060]: .  // o\
Dec 03 07:17:12 vps178028 python3[14060]: .  || ._)  lichess-bot 1.1.0
Dec 03 07:17:12 vps178028 python3[14060]: .  //__\
Dec 03 07:17:12 vps178028 python3[14060]: .  )___(   Play on Lichess with a bot
Dec 03 07:17:13 vps178028 python3[14060]: 2018-12-03 07:17:13,417: Welcome GodelEscherBot!
Dec 03 07:17:13 vps178028 python3[14060]: 2018-12-03 07:17:13,418: Engine target path: ./engines/stockfish
Dec 03 07:17:14 vps178028 python3[14060]: 2018-12-03 07:17:14,887: Looking up stockfish-x86_64 ...
Dec 03 07:17:15 vps178028 python3[14060]: 2018-12-03 07:17:15,078: Local stockfish-x86_64 is newer than release
Dec 03 07:17:15 vps178028 python3[14060]: 2018-12-03 07:17:15,079: You're now connected to https://lichess.org/ and awaiting challenges.
Dec 03 08:00:04 vps178028 python3[14060]: 2018-12-03 08:00:04,738:     Accept Bullet rated challenge from BOT Sugarbot-raspi3(2760)
Dec 03 08:00:04 vps178028 python3[14060]: 2018-12-03 08:00:04,742: --- Process Queue. Total Queued: 1. Total Used: 0
Dec 03 08:00:04 vps178028 python3[14060]: 2018-12-03 08:00:04,743: --- Process Used. Total Queued: 0. Total Used: 1
Dec 03 08:00:05 vps178028 python3[14060]: 2018-12-03 08:00:05,457: +++ https://lichess.org/gpPaGmSb/black Bullet vs BOT Sugarbot-raspi3(2760)
Dec 03 08:01:05 vps178028 python3[14060]: 2018-12-03 08:01:05,585:     Aborting https://lichess.org/gpPaGmSb/black by lack of activity
Dec 03 08:01:05 vps178028 python3[14060]: 2018-12-03 08:01:05,990: --- https://lichess.org/gpPaGmSb/black Game over
Dec 03 08:01:06 vps178028 python3[14060]: 2018-12-03 08:01:06,016: +++ Process Free. Total Queued: 0. Total Used: 0
Dec 03 08:01:32 vps178028 python3[14060]: 2018-12-03 08:01:32,853:     Accept Bullet rated challenge from BOT Sugarbot-raspi3(2760)
Dec 03 08:01:32 vps178028 python3[14060]: 2018-12-03 08:01:32,856: --- Process Queue. Total Queued: 1. Total Used: 0
Dec 03 08:01:32 vps178028 python3[14060]: 2018-12-03 08:01:32,857: --- Process Used. Total Queued: 0. Total Used: 1
Dec 03 08:01:33 vps178028 python3[14060]: 2018-12-03 08:01:33,351: +++ https://lichess.org/R5N3AsWp/black Bullet vs BOT Sugarbot-raspi3(2760)
Dec 03 08:01:54 vps178028 python3[14060]: 2018-12-03 08:01:54,402: --- https://lichess.org/R5N3AsWp/black Game over
Dec 03 08:01:54 vps178028 python3[14060]: 2018-12-03 08:01:54,421: +++ Process Free. Total Queued: 0. Total Used: 0
Dec 03 08:03:26 vps178028 python3[14060]: 2018-12-03 08:03:26,596:     Accept Bullet rated challenge from BOT Sugarbot-raspi3(2765)
Dec 03 08:03:26 vps178028 python3[14060]: 2018-12-03 08:03:26,600: --- Process Queue. Total Queued: 1. Total Used: 0
Dec 03 08:03:26 vps178028 python3[14060]: 2018-12-03 08:03:26,622: --- Process Used. Total Queued: 0. Total Used: 1
Dec 03 08:03:27 vps178028 python3[14060]: 2018-12-03 08:03:27,095: +++ https://lichess.org/ivcgbWra/black Bullet vs BOT Sugarbot-raspi3(2765)
Dec 03 08:03:47 vps178028 python3[14060]: 2018-12-03 08:03:47,559: --- https://lichess.org/ivcgbWra/black Game over
Dec 03 08:03:47 vps178028 python3[14060]: 2018-12-03 08:03:47,577: +++ Process Free. Total Queued: 0. Total Used: 0
Dec 03 09:07:53 vps178028 python3[14060]: 2018-12-03 09:07:53,459: Terminating client due to connection error
Dec 03 09:07:53 vps178028 python3[14060]: Traceback (most recent call last):
Dec 03 09:07:53 vps178028 python3[14060]:   File "/home/turing/.local/lib/python3.6/site-packages/urllib3/contrib/pyopenssl.py", line 294, in recv_into
Dec 03 09:07:53 vps178028 python3[14060]:     return self.connection.recv_into(*args, **kwargs)
Dec 03 09:07:53 vps178028 python3[14060]:   File "/usr/lib/python3/dist-packages/OpenSSL/SSL.py", line 1717, in recv_into
Dec 03 09:07:53 vps178028 python3[14060]:     self._raise_ssl_error(self._ssl, result)
Dec 03 09:07:53 vps178028 python3[14060]:   File "/usr/lib/python3/dist-packages/OpenSSL/SSL.py", line 1540, in _raise_ssl_error
Dec 03 09:07:53 vps178028 python3[14060]:     raise SysCallError(errno, errorcode.get(errno))
Dec 03 09:07:53 vps178028 python3[14060]: OpenSSL.SSL.SysCallError: (104, 'ECONNRESET')
Dec 03 09:07:53 vps178028 python3[14060]: During handling of the above exception, another exception occurred:
Dec 03 09:07:53 vps178028 python3[14060]: Traceback (most recent call last):
Dec 03 09:07:53 vps178028 python3[14060]:   File "/home/turing/.local/lib/python3.6/site-packages/urllib3/response.py", line 360, in _error_catcher
Dec 03 09:07:53 vps178028 python3[14060]:     yield
Dec 03 09:07:53 vps178028 python3[14060]:   File "/home/turing/.local/lib/python3.6/site-packages/urllib3/response.py", line 666, in read_chunked
Dec 03 09:07:53 vps178028 python3[14060]:     self._update_chunk_length()
Dec 03 09:07:53 vps178028 python3[14060]:   File "/home/turing/.local/lib/python3.6/site-packages/urllib3/response.py", line 598, in _update_chunk_length
Dec 03 09:07:53 vps178028 python3[14060]:     line = self._fp.fp.readline()
Dec 03 09:07:53 vps178028 python3[14060]:   File "/usr/lib/python3.6/socket.py", line 586, in readinto
Dec 03 09:07:53 vps178028 python3[14060]:     return self._sock.recv_into(b)
Dec 03 09:07:53 vps178028 python3[14060]:   File "/home/turing/.local/lib/python3.6/site-packages/urllib3/contrib/pyopenssl.py", line 299, in recv_into
Dec 03 09:07:53 vps178028 python3[14060]:     raise SocketError(str(e))
Dec 03 09:07:53 vps178028 python3[14060]: OSError: (104, 'ECONNRESET')
Dec 03 09:07:53 vps178028 python3[14060]: During handling of the above exception, another exception occurred:
Dec 03 09:07:53 vps178028 python3[14060]: Traceback (most recent call last):
Dec 03 09:07:53 vps178028 python3[14060]:   File "/home/turing/.local/lib/python3.6/site-packages/requests/models.py", line 750, in generate
Dec 03 09:07:53 vps178028 python3[14060]:     for chunk in self.raw.stream(chunk_size, decode_content=True):
Dec 03 09:07:53 vps178028 python3[14060]:   File "/home/turing/.local/lib/python3.6/site-packages/urllib3/response.py", line 490, in stream
Dec 03 09:07:53 vps178028 python3[14060]:     for line in self.read_chunked(amt, decode_content=decode_content):
Dec 03 09:07:53 vps178028 python3[14060]:   File "/home/turing/.local/lib/python3.6/site-packages/urllib3/response.py", line 694, in read_chunked
Dec 03 09:07:53 vps178028 python3[14060]:     self._original_response.close()
Dec 03 09:07:53 vps178028 python3[14060]:   File "/usr/lib/python3.6/contextlib.py", line 99, in __exit__
Dec 03 09:07:53 vps178028 python3[14060]:     self.gen.throw(type, value, traceback)
Dec 03 09:07:53 vps178028 python3[14060]:   File "/home/turing/.local/lib/python3.6/site-packages/urllib3/response.py", line 378, in _error_catcher
Dec 03 09:07:53 vps178028 python3[14060]:     raise ProtocolError('Connection broken: %r' % e, e)
Dec 03 09:07:53 vps178028 python3[14060]: urllib3.exceptions.ProtocolError: ('Connection broken: OSError("(104, \'ECONNRESET\')",)', OSError("(104, 'ECONNRESET')",))
Dec 03 09:07:53 vps178028 python3[14060]: During handling of the above exception, another exception occurred:
Dec 03 09:07:53 vps178028 python3[14060]: Traceback (most recent call last):
Dec 03 09:07:53 vps178028 python3[14060]:   File "/home/turing/lichess-bot/lichess-bot.py", line 45, in watch_control_stream
Dec 03 09:07:53 vps178028 python3[14060]:     for line in response.iter_lines():
Dec 03 09:07:53 vps178028 python3[14060]:   File "/home/turing/.local/lib/python3.6/site-packages/requests/models.py", line 794, in iter_lines
Dec 03 09:07:53 vps178028 python3[14060]:     for chunk in self.iter_content(chunk_size=chunk_size, decode_unicode=decode_unicode):
Dec 03 09:07:53 vps178028 python3[14060]:   File "/home/turing/.local/lib/python3.6/site-packages/requests/models.py", line 753, in generate
Dec 03 09:07:53 vps178028 python3[14060]:     raise ChunkedEncodingError(e)
Dec 03 09:07:53 vps178028 python3[14060]: requests.exceptions.ChunkedEncodingError: ('Connection broken: OSError("(104, \'ECONNRESET\')",)', OSError("(104, 'ECONNRESET')",))
Dec 03 09:07:53 vps178028 python3[14060]: 2018-12-03 09:07:53,486: Backing off watch_control_stream(...) for 0.9s (SystemExit)
Dec 03 09:08:01 vps178028 python3[14060]: 2018-12-03 09:08:01,085: Terminating client due to connection error
Dec 03 09:08:01 vps178028 python3[14060]: Traceback (most recent call last):
Dec 03 09:08:01 vps178028 python3[14060]:   File "/home/turing/.local/lib/python3.6/site-packages/urllib3/contrib/pyopenssl.py", line 294, in recv_into
Dec 03 09:08:01 vps178028 python3[14060]:     return self.connection.recv_into(*args, **kwargs)
Dec 03 09:08:01 vps178028 python3[14060]:   File "/usr/lib/python3/dist-packages/OpenSSL/SSL.py", line 1717, in recv_into
Dec 03 09:08:01 vps178028 python3[14060]:     self._raise_ssl_error(self._ssl, result)
Dec 03 09:08:01 vps178028 python3[14060]:   File "/usr/lib/python3/dist-packages/OpenSSL/SSL.py", line 1540, in _raise_ssl_error
Dec 03 09:08:01 vps178028 python3[14060]:     raise SysCallError(errno, errorcode.get(errno))
Dec 03 09:08:01 vps178028 python3[14060]: OpenSSL.SSL.SysCallError: (104, 'ECONNRESET')
Dec 03 09:08:01 vps178028 python3[14060]: During handling of the above exception, another exception occurred:
Dec 03 09:08:01 vps178028 python3[14060]: Traceback (most recent call last):
Dec 03 09:08:01 vps178028 python3[14060]:   File "/home/turing/.local/lib/python3.6/site-packages/urllib3/response.py", line 360, in _error_catcher
Dec 03 09:08:01 vps178028 python3[14060]:     yield
Dec 03 09:08:01 vps178028 python3[14060]:   File "/home/turing/.local/lib/python3.6/site-packages/urllib3/response.py", line 666, in read_chunked
Dec 03 09:08:01 vps178028 python3[14060]:     self._update_chunk_length()
Dec 03 09:08:01 vps178028 python3[14060]:   File "/home/turing/.local/lib/python3.6/site-packages/urllib3/response.py", line 598, in _update_chunk_length
Dec 03 09:08:01 vps178028 python3[14060]:     line = self._fp.fp.readline()
Dec 03 09:08:01 vps178028 python3[14060]:   File "/usr/lib/python3.6/socket.py", line 586, in readinto
Dec 03 09:08:01 vps178028 python3[14060]:     return self._sock.recv_into(b)
Dec 03 09:08:01 vps178028 python3[14060]:   File "/home/turing/.local/lib/python3.6/site-packages/urllib3/contrib/pyopenssl.py", line 299, in recv_into
Dec 03 09:08:01 vps178028 python3[14060]:     raise SocketError(str(e))
Dec 03 09:08:01 vps178028 python3[14060]: OSError: (104, 'ECONNRESET')
Dec 03 09:08:01 vps178028 python3[14060]: During handling of the above exception, another exception occurred:
Dec 03 09:08:01 vps178028 python3[14060]: Traceback (most recent call last):
Dec 03 09:08:01 vps178028 python3[14060]:   File "/home/turing/.local/lib/python3.6/site-packages/requests/models.py", line 750, in generate
Dec 03 09:08:01 vps178028 python3[14060]:     for chunk in self.raw.stream(chunk_size, decode_content=True):
Dec 03 09:08:01 vps178028 python3[14060]:   File "/home/turing/.local/lib/python3.6/site-packages/urllib3/response.py", line 490, in stream
Dec 03 09:08:01 vps178028 python3[14060]:     for line in self.read_chunked(amt, decode_content=decode_content):
Dec 03 09:08:01 vps178028 python3[14060]:   File "/home/turing/.local/lib/python3.6/site-packages/urllib3/response.py", line 694, in read_chunked
Dec 03 09:08:01 vps178028 python3[14060]:     self._original_response.close()
Dec 03 09:08:01 vps178028 python3[14060]:   File "/usr/lib/python3.6/contextlib.py", line 99, in __exit__
Dec 03 09:08:01 vps178028 python3[14060]:     self.gen.throw(type, value, traceback)
Dec 03 09:08:01 vps178028 python3[14060]:   File "/home/turing/.local/lib/python3.6/site-packages/urllib3/response.py", line 378, in _error_catcher
Dec 03 09:08:01 vps178028 python3[14060]:     raise ProtocolError('Connection broken: %r' % e, e)
Dec 03 09:08:01 vps178028 python3[14060]: urllib3.exceptions.ProtocolError: ('Connection broken: OSError("(104, \'ECONNRESET\')",)', OSError("(104, 'ECONNRESET')",))
Dec 03 09:08:01 vps178028 python3[14060]: During handling of the above exception, another exception occurred:
Dec 03 09:08:01 vps178028 python3[14060]: Traceback (most recent call last):
Dec 03 09:08:01 vps178028 python3[14060]:   File "/home/turing/lichess-bot/lichess-bot.py", line 45, in watch_control_stream
Dec 03 09:08:01 vps178028 python3[14060]:     for line in response.iter_lines():
Dec 03 09:08:01 vps178028 python3[14060]:   File "/home/turing/.local/lib/python3.6/site-packages/requests/models.py", line 794, in iter_lines
Dec 03 09:08:01 vps178028 python3[14060]:     for chunk in self.iter_content(chunk_size=chunk_size, decode_unicode=decode_unicode):
Dec 03 09:08:01 vps178028 python3[14060]:   File "/home/turing/.local/lib/python3.6/site-packages/requests/models.py", line 753, in generate
Dec 03 09:08:01 vps178028 python3[14060]:     raise ChunkedEncodingError(e)
Dec 03 09:08:01 vps178028 python3[14060]: requests.exceptions.ChunkedEncodingError: ('Connection broken: OSError("(104, \'ECONNRESET\')",)', OSError("(104, 'ECONNRESET')",))
Dec 03 09:08:01 vps178028 python3[14060]: 2018-12-03 09:08:01,089: Giving up watch_control_stream(...) after 2 tries (SystemExit)
gbtami commented 5 years ago

I think sys.exit() terminates the watch control stream subprocess only. If you want to terminate the main program you should set terminated also.

I'm not convinced that we should exit on ChunkedEncodingError exeption. Backoff will retry using exponential growing times, so it will not trigger any rate limit IMO. Maybe we should just try to set max_time to a higher value.

ddugovic commented 5 years ago

Ah, if backoff does in fact throttle the requests (I don't know how we verify that in the case like #152 where you add a recursive call, which I assume executes immediately) then yes, that seems preferable to terminating the application.

gbtami commented 5 years ago

Forget #152 , please! It was a bad idea :)

jlherren commented 5 years ago

Somehow this is still not fixed. I see that it tries to exit but at the end it just doesn't.

2018-12-17 05:47:31 __main__[448] ERROR Terminating client due to connection error
Traceback (most recent call last):
  File "<path-to-venv>/lib/python3.5/site-packages/urllib3/response.py", line 601, in _update_chunk_length
    self.chunk_left = int(line, 16)
ValueError: invalid literal for int() with base 16: b''

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<path-to-venv>/lib/python3.5/site-packages/urllib3/response.py", line 360, in _error_catcher
    yield
  File "<path-to-venv>/lib/python3.5/site-packages/urllib3/response.py", line 666, in read_chunked
    self._update_chunk_length()
  File "<path-to-venv>/lib/python3.5/site-packages/urllib3/response.py", line 605, in _update_chunk_length
    raise httplib.IncompleteRead(line)
http.client.IncompleteRead: IncompleteRead(0 bytes read)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<path-to-venv>/lib/python3.5/site-packages/requests/models.py", line 750, in generate
    for chunk in self.raw.stream(chunk_size, decode_content=True):
  File "<path-to-venv>/lib/python3.5/site-packages/urllib3/response.py", line 490, in stream
    for line in self.read_chunked(amt, decode_content=decode_content):
  File "<path-to-venv>/lib/python3.5/site-packages/urllib3/response.py", line 694, in read_chunked
    self._original_response.close()
  File "/usr/lib/python3.5/contextlib.py", line 77, in __exit__
    self.gen.throw(type, value, traceback)
  File "<path-to-venv>/lib/python3.5/site-packages/urllib3/response.py", line 378, in _error_catcher
    raise ProtocolError('Connection broken: %r' % e, e)
urllib3.exceptions.ProtocolError: ('Connection broken: IncompleteRead(0 bytes read)', IncompleteRead(0 bytes read))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "lichess-bot.py", line 55, in watch_control_stream
    for line in response.iter_lines():
  File "<path-to-venv>/lib/python3.5/site-packages/requests/models.py", line 794, in iter_lines
    for chunk in self.iter_content(chunk_size=chunk_size, decode_unicode=decode_unicode):
  File "<path-to-venv>/lib/python3.5/site-packages/requests/models.py", line 753, in generate
    raise ChunkedEncodingError(e)
requests.exceptions.ChunkedEncodingError: ('Connection broken: IncompleteRead(0 bytes read)', IncompleteRead(0 bytes read))
2018-12-17 05:47:32 backoff[448] INFO Backing off watch_control_stream(...) for 0.5s (SystemExit)
2018-12-17 05:47:54 backoff[448] ERROR Giving up watch_control_stream(...) after 2 tries (json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0))
Process Process-2:
Traceback (most recent call last):
  File "/usr/lib/python3.5/multiprocessing/process.py", line 249, in _bootstrap
    self.run()
  File "/usr/lib/python3.5/multiprocessing/process.py", line 93, in run
    self._target(*self._args, **self._kwargs)
  File "<path-to-venv>/lib/python3.5/site-packages/backoff/_sync.py", line 99, in retry
    ret = target(*args, **kwargs)
  File "lichess-bot.py", line 57, in watch_control_stream
    event = json.loads(line.decode('utf-8'))
  File "/usr/lib/python3.5/json/__init__.py", line 319, in loads
    return _default_decoder.decode(s)
  File "/usr/lib/python3.5/json/decoder.py", line 339, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/lib/python3.5/json/decoder.py", line 357, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
jlherren commented 5 years ago

It seems that it's because of those backoff.on_exception(..., BaseException, ...). Normally BaseException should never be caught, because that's way too broad: It even catches the SystemExit exception which is probably why it doesn't exit.