nmlorg / ntelebot

1 stars 0 forks source link

Support TBA restarts/transient errors more gracefully #9

Open nmlorg opened 3 weeks ago

nmlorg commented 3 weeks ago

Despite what I recently suggested, TBA does go wonky every now and then, flooding the logs with both:

2024-08-13 20:16:49,251 ERROR loop.py:55] Ignoring uncaught error while polling:
Traceback (most recent call last):
  File "lib/python3.10/site-packages/ntelebot/loop.py", line 45, in _poll_bot
    updates = bot.get_updates(offset=offset, timeout=timeout)
  File "lib/python3.10/site-packages/ntelebot/bot.py", line 75, in __call__
    raise ntelebot.errors.Error(data)
ntelebot.errors.Error: {'ok': False, 'error_code': 429, 'description': 'Too Many Requests: retry after 5', 'parameters': {'retry_after': 5}}

and:

2024-08-13 20:16:51,556 ERROR loop.py:55] Ignoring uncaught error while polling:
Traceback (most recent call last):
  File "lib/python3.10/site-packages/ntelebot/loop.py", line 45, in _poll_bot
    updates = bot.get_updates(offset=offset, timeout=timeout)
  File "lib/python3.10/site-packages/ntelebot/bot.py", line 75, in __call__
    raise ntelebot.errors.Error(data)
ntelebot.errors.Error: {'ok': False, 'error_code': 502, 'description': 'Bad Gateway'}

mixed together. (I assume the latter causes rapid retries that trigger the former.)

To date, these only seem to have happened during getUpdates polls (which are designed to restart every 10 seconds no matter what), but I think I'd like to go ahead and retry at the ntelebot (versus metabot) level just in case this ever kills an actual mutation.

I'm not sure if this should be done at the requests level (that applies rate limits to things like api.weather.gov too), or just to ntelebot.bot._Request.__call__.

nmlorg commented 3 weeks ago

My test bot's 6 a.m. announcement didn't come through until 6:10 today.

2024-08-21 06:00:02,396 INFO reminders.py:42] Running periodic.
⋮
2024-08-21 06:00:14,419 INFO reminders.py:166] Sending reminder to -40….
2024-08-21 06:00:14,419 INFO bot.py:57] 3980803 https://api.telegram.org/bot44…/sendphoto {'chat_id': '-40…', 'photo': 'https://ssl.gstatic.com/calendar/…', 'caption': 'There are a bunch of events coming up:\n\n<b>…</a>', 'parse_mode': 'HTML', 'disable_notification': True}
2024-08-21 06:00:14,420 INFO requests.py:15] 3980803 send https://api.telegram.org/bot44…/sendphoto {'timeout': 1202, 'allow_redirects': True, 'proxies': OrderedDict(), 'stream': False, 'verify': True, 'cert': None}
2024-08-21 06:00:14,580 ERROR loop.py:72] Ignoring uncaught error while dispatching:
Traceback (most recent call last):
  File "lib/python3.11/site-packages/urllib3/connectionpool.py", line 789, in urlopen
    response = self._make_request(
               ^^^^^^^^^^^^^^^^^^^
  File "lib/python3.11/site-packages/urllib3/connectionpool.py", line 536, in _make_request
    response = conn.getresponse()
               ^^^^^^^^^^^^^^^^^^
  File "lib/python3.11/site-packages/urllib3/connection.py", line 464, in getresponse
    httplib_response = super().getresponse()
                       ^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/http/client.py", line 1374, in getresponse
    response.begin()
  File "/usr/lib/python3.11/http/client.py", line 318, in begin
    version, status, reason = self._read_status()
                              ^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/http/client.py", line 287, in _read_status
    raise RemoteDisconnected("Remote end closed connection without"
http.client.RemoteDisconnected: Remote end closed connection without response

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "lib/python3.11/site-packages/requests/adapters.py", line 667, in send
    resp = conn.urlopen(
           ^^^^^^^^^^^^^
  File "lib/python3.11/site-packages/urllib3/connectionpool.py", line 843, in urlopen
    retries = retries.increment(
              ^^^^^^^^^^^^^^^^^^
  File "lib/python3.11/site-packages/urllib3/util/retry.py", line 474, in increment
    raise reraise(type(error), error, _stacktrace)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "lib/python3.11/site-packages/urllib3/util/util.py", line 38, in reraise
    raise value.with_traceback(tb)
  File "lib/python3.11/site-packages/urllib3/connectionpool.py", line 789, in urlopen
    response = self._make_request(
               ^^^^^^^^^^^^^^^^^^^
  File "lib/python3.11/site-packages/urllib3/connectionpool.py", line 536, in _make_request
    response = conn.getresponse()
               ^^^^^^^^^^^^^^^^^^
  File "lib/python3.11/site-packages/urllib3/connection.py", line 464, in getresponse
    httplib_response = super().getresponse()
                       ^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/http/client.py", line 1374, in getresponse
    response.begin()
  File "/usr/lib/python3.11/http/client.py", line 318, in begin
    version, status, reason = self._read_status()
                              ^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/http/client.py", line 287, in _read_status
    raise RemoteDisconnected("Remote end closed connection without"
urllib3.exceptions.ProtocolError: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "ntelebot/ntelebot/loop.py", line 70, in run
    callback()
  File "metabot/metabot/modules/reminders.py", line 45, in periodic
    _daily_messages(multibot, records)
  File "metabot/metabot/modules/reminders.py", line 119, in _daily_messages
    message = reminder_send(bot, groupid, text, url)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "metabot/metabot/modules/reminders.py", line 170, in reminder_send
    return bot.send_photo(chat_id=groupid,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "ntelebot/ntelebot/bot.py", line 59, in __call__
    data = ntelebot.requests.post(self.url, timeout=self.timeout, **_prepare(params)).json()
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "ntelebot/ntelebot/requests.py", line 49, in post
    return _LOCAL.session.post(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "lib/python3.11/site-packages/requests/sessions.py", line 637, in post
    return self.request("POST", url, data=data, json=json, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "lib/python3.11/site-packages/requests/sessions.py", line 589, in request
    resp = self.send(prep, **send_kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "ntelebot/ntelebot/requests.py", line 17, in send
    return super().send(request, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "lib/python3.11/site-packages/requests/sessions.py", line 703, in send
    r = adapter.send(request, **kwargs)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "lib/python3.11/site-packages/requests/adapters.py", line 682, in send
    raise ConnectionError(err, request=request)
requests.exceptions.ConnectionError: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))
⋮
2024-08-21 06:10:03,509 INFO reminders.py:42] Running periodic.
2024-08-21 06:10:06,357 INFO bot.py:57] 3980804 https://api.telegram.org/bot44…/getupdates {'offset': 654822478, 'timeout': 1200}
2024-08-21 06:10:06,357 INFO requests.py:15] 3980804 send https://api.telegram.org/bot44…/getupdates {'timeout': 1202, 'allow_redirects': True, 'proxies': OrderedDict(), 'stream': False, 'verify': True, 'cert': None}
2024-08-21 06:10:10,510 INFO requests.py:15] 3980803 send https://api.weather.gov/alerts/active?zone=… {'timeout': 10, 'allow_redirects': True, 'proxies': OrderedDict(), 'stream': False, 'verify': True, 'cert': None}
2024-08-21 06:10:10,510 INFO connectionpool.py:291] Resetting dropped connection: api.weather.gov
2024-08-21 06:10:10,977 INFO reminders.py:166] Sending reminder to -40….
2024-08-21 06:10:10,977 INFO bot.py:57] 3980803 https://api.telegram.org/bot44…/sendphoto {'chat_id': '-40…', 'photo': 'https://ssl.gstatic.com/calendar/…', 'caption': 'There are a bunch of events coming up:\n\n<b>…</a>', 'parse_mode': 'HTML', 'disable_notification': True}
2024-08-21 06:10:10,978 INFO requests.py:15] 3980803 send https://api.telegram.org/bot44…/sendphoto {'timeout': 1202, 'allow_redirects': True, 'proxies': OrderedDict(), 'stream': False, 'verify': True, 'cert': None}
2024-08-21 06:10:10,978 INFO connectionpool.py:1051] Starting new HTTPS connection (8): api.telegram.org:443
2024-08-21 06:10:11,785 INFO reminders.py:192] Editing reminder -40…/8137.
2024-08-21 06:10:11,785 INFO bot.py:57] 3980803 https://api.telegram.org/bot44…/editmessagetext {'chat_id': -40…, 'message_id': 8137, 'text': 'There are a bunch of events coming up:\n\n<b>…</a>', 'parse_mode': 'HTML', 'disable_web_page_preview': True}
2024-08-21 06:10:11,786 INFO requests.py:15] 3980803 send https://api.telegram.org/bot44…/editmessagetext {'timeout': 1202, 'allow_redirects': True, 'proxies': OrderedDict(), 'stream': False, 'verify': True, 'cert': None}

  This might be a side effect of 0045680 causing HTTPS sessions to be kept alive and reused. (Note "Starting new HTTPS connection" versus "Resetting dropped connection".)

nmlorg commented 3 weeks ago

It looks like the production instance got the same error a single time during a poll a little earlier:

2024-08-21 05:46:22,780 ERROR loop.py:55] Ignoring uncaught error while polling:
Traceback (most recent call last):
  File "lib/python3.10/site-packages/urllib3/connectionpool.py", line 789, in urlopen
    response = self._make_request(
  File "lib/python3.10/site-packages/urllib3/connectionpool.py", line 536, in _make_request
    response = conn.getresponse()
  File "lib/python3.10/site-packages/urllib3/connection.py", line 464, in getresponse
    httplib_response = super().getresponse()
  File "/usr/lib/python3.10/http/client.py", line 1375, in getresponse
    response.begin()
  File "/usr/lib/python3.10/http/client.py", line 318, in begin
    version, status, reason = self._read_status()
  File "/usr/lib/python3.10/http/client.py", line 287, in _read_status
    raise RemoteDisconnected("Remote end closed connection without"
http.client.RemoteDisconnected: Remote end closed connection without response

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "lib/python3.10/site-packages/requests/adapters.py", line 667, in send
    resp = conn.urlopen(
  File "lib/python3.10/site-packages/urllib3/connectionpool.py", line 843, in urlopen
    retries = retries.increment(
  File "lib/python3.10/site-packages/urllib3/util/retry.py", line 474, in increment
    raise reraise(type(error), error, _stacktrace)
  File "lib/python3.10/site-packages/urllib3/util/util.py", line 38, in reraise
    raise value.with_traceback(tb)
  File "lib/python3.10/site-packages/urllib3/connectionpool.py", line 789, in urlopen
    response = self._make_request(
  File "lib/python3.10/site-packages/urllib3/connectionpool.py", line 536, in _make_request
    response = conn.getresponse()
  File "lib/python3.10/site-packages/urllib3/connection.py", line 464, in getresponse
    httplib_response = super().getresponse()
  File "/usr/lib/python3.10/http/client.py", line 1375, in getresponse
    response.begin()
  File "/usr/lib/python3.10/http/client.py", line 318, in begin
    version, status, reason = self._read_status()
  File "/usr/lib/python3.10/http/client.py", line 287, in _read_status
    raise RemoteDisconnected("Remote end closed connection without"
urllib3.exceptions.ProtocolError: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "lib/python3.10/site-packages/ntelebot/loop.py", line 45, in _poll_bot
    updates = bot.get_updates(offset=offset, timeout=timeout)
  File "lib/python3.10/site-packages/ntelebot/bot.py", line 56, in __call__
    data = ntelebot.requests.post(self.url, timeout=self.timeout, **_prepare(params)).json()
  File "lib/python3.10/site-packages/ntelebot/requests.py", line 38, in post
    return _LOCAL.session.post(*args, **kwargs)
  File "lib/python3.10/site-packages/requests/sessions.py", line 637, in post
    return self.request("POST", url, data=data, json=json, **kwargs)
  File "lib/python3.10/site-packages/requests/sessions.py", line 589, in request
    resp = self.send(prep, **send_kwargs)
  File "lib/python3.10/site-packages/requests/sessions.py", line 703, in send
    r = adapter.send(request, **kwargs)
  File "lib/python3.10/site-packages/requests/adapters.py", line 682, in send
    raise ConnectionError(err, request=request)
requests.exceptions.ConnectionError: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))
2024-08-21 05:46:24,116 INFO connectionpool.py:1051] Starting new HTTPS connection (4): api.telegram.org:443
nmlorg commented 3 weeks ago

(I'm now taking a detour of possibly switching from requests to httpx, both to switch to HTTP/2 and possibly moving from thread-per-poll loop to asyncio. The former would probably be a drop-in replacement, but the latter would be a major rewrite/API change.)