PythonistaGuild / TwitchIO

An Async Bot/API wrapper for Twitch made in Python.
https://twitchio.dev
MIT License
785 stars 163 forks source link

Unexpected failure with error 'Websocket connection was closed: None' #438

Open returnDanilo opened 8 months ago

returnDanilo commented 8 months ago

Sometimes, with no apparent reason, my server stops working with the following errors:

Websocket connection was closed: None
Websocket connection was closed: None
The channel "foobar" was unable to be joined. Check the channel is valid.
Websocket connection was closed: None
The channel "foobar" was unable to be joined. Check the channel is valid.
Websocket connection was closed: None
Task exception was never retrieved
future: <Task finished name='Task-2560' coro=<WSConnection._join_future_handle() done, defined at /usr/local/lib/python3.9/dist-packages/twitchio/websocket.py:327> exception=KeyError('foobar')>
Traceback (most recent call last):
  File "/usr/lib/python3.9/asyncio/tasks.py", line 492, in wait_for
    fut.result()
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/lib/python3.9/dist-packages/twitchio/websocket.py", line 329, in _join_future_handle
    await asyncio.wait_for(fut, timeout=timeout)
  File "/usr/lib/python3.9/asyncio/tasks.py", line 494, in wait_for
    raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.9/dist-packages/twitchio/websocket.py", line 332, in _join_future_handle
    self._join_pending.pop(channel)
Keyerror: 'foobar'
Websocket connection was closed: None
Websocket connection was closed: None
Websocket connection was closed: None
Websocket connection was closed: None
Websocket connection was closed: None
Websocket connection was closed: None
...

The websocket error line repeats many more times, but not the exception errors.

I've encountered this issue a few times. Usually it doesn't take 24 hours after the server is restarted until I see the error messages again. I'm using twichio 2.8.2 installed from pypi.

likel commented 6 months ago

I'm experiencing the same issue. Exactly the same, same error lines and everything.

Eclip5eLP commented 5 months ago

Having the exact same problem, same errors and lines. Happens every ~48 hours for me.

Websocket connection was closed: None
The channel "foobar" was unable to be joined. Check the channel is valid.
The channel "foobar2" was unable to be joined. Check the channel is valid.
Websocket connection was closed: None
The channel "foobar" was unable to be joined. Check the channel is valid.
The channel "foobar2" was unable to be joined. Check the channel is valid.
Websocket connection was closed: None
Websocket connection was closed: None
Task exception was never retrieved
future: <Task finished name='Task-14983' coro=<WSConnection._join_future_handle() done, defined at /home/eclip5e/.local/lib/python3.11/site-packages/twitchio/websocket.py:327> exception=KeyError('foobar')>
Traceback (most recent call last):
  File "/usr/lib/python3.11/asyncio/tasks.py", line 490, in wait_for
    return fut.result()
           ^^^^^^^^^^^^
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/eclip5e/.local/lib/python3.11/site-packages/twitchio/websocket.py", line 329, in _join_future_handle
    await asyncio.wait_for(fut, timeout=timeout)
  File "/usr/lib/python3.11/asyncio/tasks.py", line 492, in wait_for
    raise exceptions.TimeoutError() from exc
TimeoutError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/eclip5e/.local/lib/python3.11/site-packages/twitchio/websocket.py", line 332, in _join_future_handle
    self._join_pending.pop(channel)
KeyError: 'foobar'
Task exception was never retrieved
future: <Task finished name='Task-14982' coro=<WSConnection._join_future_handle() done, defined at /home/eclip5e/.local/lib/python3.11/site-packages/twitchio/websocket.py:327> exception=KeyError('foobar2')>
Traceback (most recent call last):
  File "/usr/lib/python3.11/asyncio/tasks.py", line 490, in wait_for
    return fut.result()
           ^^^^^^^^^^^^
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/eclip5e/.local/lib/python3.11/site-packages/twitchio/websocket.py", line 329, in _join_future_handle
    await asyncio.wait_for(fut, timeout=timeout)
  File "/usr/lib/python3.11/asyncio/tasks.py", line 492, in wait_for
    raise exceptions.TimeoutError() from exc
TimeoutError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/eclip5e/.local/lib/python3.11/site-packages/twitchio/websocket.py", line 332, in _join_future_handle
    self._join_pending.pop(channel)
KeyError: 'foobar2'
Websocket connection was closed: None
Websocket connection was closed: None
Websocket connection was closed: None
Websocket connection was closed: None
Websocket connection was closed: None
Websocket connection was closed: None
Websocket connection was closed: None
Websocket connection was closed: None
Websocket connection was closed: None
Websocket connection was closed: None
returnDanilo commented 5 months ago

I was able to investigate this further by inspecting the debug logs I enabled with:

import logging
logging.basicConfig(filename='chatbot_debug.log', level=logging.DEBUG)

chatbot_debug.log

My guess is the issue happens when twitch sends us a RECONNECT, which triggers the TwitchIO logic for handling the situation which, in turn, attempts to use an expired token to authenticate. (If you manage to successfully authenticate with the same token you got this error with, please speak up because my guess is likely wrong) And it will attempt to reconnect many more times, hence the Websocket connection was closed: None errors.

The behavior I observed is that twitch will keep you authenticated even if your token has expired, as long as you stay connected to the server.

As TwitchIO 2.x doesn't have the ability to handle refresh tokens (the matter is being worked on in 3.x), the issue here is bad exception handling: TwitchIO should not attempt to reconnect with the same failed credentials; it should raise a descriptive exception instead.

This suggests TwitchIO never had the ability to handle twitch's reconnect irc command correctly: it would always break once the command is received and the token is expired. Maybe this issue is only apparent now because twitch is sending the reconnect commands more often?

Related discussion: https://github.com/PythonistaGuild/TwitchIO/discussions/400

The log analysis was done on TwitchIO 2.9.0.

likel commented 5 months ago

I also confirm that I have updated to TwitchIO 2.9.0 and still receiving the same error

likel commented 5 months ago

@returnDanilo @Eclip5eLP May I ask where you are hosting your bot server?

chillymosh commented 5 months ago

To address what @returnDanilo said, in 3.x we won't even be using IRC so we will not have to handle manual task timeouts on IRC joins.

3.x will also have a complete token handler, oauth server built in for various web adapters, and much more.

This will (almost definitely) not be an issue going forward.

Twitch is severely limiting IRC connections soon, in an attempt to phase it out.

returnDanilo commented 5 months ago

Sure. It's up on GCP on the us-east1-b zone.

On Tue, Mar 19, 2024 at 6:32 AM Liam Kelly @.***> wrote:

@returnDanilo https://github.com/returnDanilo @Eclip5eLP https://github.com/Eclip5eLP May I ask where you are hosting your bot server?

— Reply to this email directly, view it on GitHub https://github.com/PythonistaGuild/TwitchIO/issues/438#issuecomment-2006501868, or unsubscribe https://github.com/notifications/unsubscribe-auth/AMCBEFNEY2I2VDSHI762EWDYZAA2FAVCNFSM6AAAAABBNENOJWVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDAMBWGUYDCOBWHA . You are receiving this because you were mentioned.Message ID: @.***>

DennisMinn commented 3 months ago

Until a stable release of 3.x is out this is my attempt to handle the token refreshing. I call https://id.twitch.tv/oauth2/token at the start of the bot and continously refresh the token after the expiration period, minus some buffer time, passes. As mentioned by https://github.com/PythonistaGuild/TwitchIO/issues/438#issuecomment-2005206515, trying to reconnect with an expired access token is the issue. So I update the access token used in self._reconnect with self._connection._token = <new token> in my refresh_token() method. The code feels a bit out of place, so I wanted to ask how did you guys remedy the issue?

config = {
    'access_token': os.environ['ACCESS_TOKEN'],
    'refresh_token': os.environ['REFRESH_TOKEN'],
    'channel': os.environ['INITIAL_CHANNEL']
}

class Bot(commands.Bot):
    def __init__(self):
        super().__init__(token=config['access_token'], prefix='!', initial_channels=[config['channel']])

    async def event_ready(self):
        while True:
            expiration = await self.refresh_token()
            await asyncio.sleep(expiration - 1800)

    async def refresh_token(self):
        url = 'https://id.twitch.tv/oauth2/token'
        data = {
            'client_id': config['client_id'],
            'client_secret': config['client_secret'],
            'grant_type': 'refresh_token',
            'refresh_token': config['refresh_token']
        }
        headers = {
            'Content-Type': 'application/x-www-form-urlencoded'
        }

        response = requests.post(url, data=data, headers=headers)
        token = response.json()

        config['access_token'] = token['access_token']
        config['refresh_token'] = token['refresh_token']
        self._connection._token = token['access_token']

        return token['expires_in']