Closed deprale closed 4 months ago
Do you happen to have any blocking code that could run at the same time as the query is being sent?
For example:
@bot.on("clientmoved")
async def test_1(bot: TSBot, ctx: TSCtx):
print("senging")
# ignore the direct call to _send(), this makes it easier to reproduce the issue
await bot._send(query("sendtextmessage").params(targetmode=3, msg="asd").compile())
@bot.on("clientmoved")
async def test_2(bot: TSBot, ctx: TSCtx):
print("sleeping")
time.sleep(5) # time.sleep is a blocking call
print("done")
In this example when a client moves, test_1
and test_2
are called concurrently. test_1
sends a query and awaits it.
meanwhile test_2
calls a blocking function that wont allow the event loop to run. When the blocking function returns, the event loop starts running again. But now time has passed and inside the _send()
call await asyncio.wait_for
sees that it has passed its 2.0
seconds timeout period and raises CancelledError.
To avoid this, all blocking functions should run in a separate thread (via asyncio.to_thread()
) or have async alternative version of such calls.
Scratch above. Did some more digging:
DEBUG:tsbot.default_plugins.keepalive:Sending keep-alive
DEBUG:tsbot.bot:Sending query: 'version'
DEBUG:asyncssh:[conn=0] Requesting key exchange
DEBUG:tsbot.events.handler:Got event: TSEvent(event='send', ctx={'query': 'version'})
DEBUG:asyncssh:[conn=0] Received key exchange request
DEBUG:asyncssh:[conn=0] Beginning key exchange
INFO:asyncssh:[conn=0] Connection lost
INFO:asyncssh:[conn=0, chan=0] Closing channel due to connection close
INFO:asyncssh:[conn=0, chan=0] Channel closed: Connection lost
It seems there is a condition where the KeepAlive
plugin sends a keep-alive query to the server while the ssh key exchange is going on. This would sever the connection and since no connection, the bot would close.
Need to do futher investigation on this.
Further investigation:
Inside connection.py
, configuring low key exchange period when creating the ssh connection helps to reproduce this bug.
self._connection = await asyncssh.connect(
self.address,
port=self.port,
username=self.username,
password=self.password,
known_hosts=None,
rekey_seconds=10, # Low key exchange
)
This forces frequent key exchanges by the client. When the bot tries to send a query, the ssh client tries to initiate KEX and fails, killing the connection. I dont think this use to be an issue few years ago, maybe something changed upstream (AsyncSSH)
After testing with other ssh clients, it seems like TeamSpeak SSH implementation doesn't handle rekeying the connection very well.
The problem with AsyncSSH
is that the default rekey_seconds
is set 1 hour.
After an hour, key exchange would happen and kill the connection when sending data.
self._connection = await asyncssh.connect(
self.address,
port=self.port,
username=self.username,
password=self.password,
known_hosts=None,
rekey_seconds=False, # Don't rekey after a given time
rekey_bytes="1000g", # High number, rekey after 1000GB of data sent
)
I dont't think there is a way to disable rekeying after the bytelimit is hit. This solution is not desirable, because of the bytes limit. it might not hit the limit soon but at somepoint.
Subclass AsyncSSH.SSHConnection to patch out rekeying.
Truly seems like a ghetto fix, but nonetheless can also just use systemd to restart the script I guess for now, along with the 1000g rekeying.
Maybe it's worthy to report on TS3 Forums?
This seems like a TeamSpeak bug to me. I will try to ask around in the TeamSpeak community forums once i get to it.
AsyncSSH
(or any other client i've tried) is not the problem here since it's complying with SSH standards. (rfc4253#section-9, rekey after 1h, 1GB data)
Downgrading to AsyncSSH 2.14.0 seems to fix the issue with rekeying.
pip install asyncssh==2.14.0
Going to push a patch pinning asyncssh version.
AsyncSSH versions affected:
I havent yet tried to find why those versions causes issues. Interestingly enough other SSH clients also fail rekeying with the Teamspeak server. (PuTTy, Bitvise SSH client), although OpenSSH client works great.
Randomly happens after a certain time has passed when bot is just idling on the server. I'm still using 0.18.0 but I see no changes in the functions mentioned in the error log.