Open ArthaTi opened 4 years ago
Weird, I've never run into this. It seems like an upstream Vibe.d issue: https://github.com/vibe-d/vibe.d/issues/2290 Is your bot by any chance hitting Discord's API ratelimits? How often are you sending messages?
If possible, could you use gdb
to debug the issue? run gdb ./your_bot_binary
then hit r
and try to trigger the issue.
This is quite weird though... I will try reproducing on my end.
Also note, I'm using a very old Vibe.d version (it was what Dscord originally used); I'll update to the latest in a bit, then try switching to the master branch (set your version as master
in the dub config file) and see if the issue is still reproducible.
I'm not sending messages too often, definitely not an amount to hit ratelimits — the bot responds to some of my occasional commands and otherwise only reads some of the messages on receive, that's all.
I'm now trying to debug this on the server... so far the issue didn't trigger in a surprisingly long time...
alright, thanks.
As @s-ludwig said in the upstream issue, this may be a keep-alive issue, I suppose.
My bot is only on one server (besides a test server) and the bug seems to trigger when there is no activity on it for a period of time. I've also noticed that it stops receiving new messages when this happens, and only starts doing so again when it is pinged. For example, the server went silent yesterday around 23:38 and the next activity was only around 23:50 — and then the bug occured.
I'll implement a workaround where the bot will send a heartbeat message to the test server once in 5 minutes, lets see if it works to confirm this theory.
If the event receiving fails too, then it seems that the keep-alive issue occurs both when sending the http requests and keeping the connection to the websocket alive?
@codic12 yeah, it would seem so. The interesting thing is that pinging the bot wakes up the bot, so the issue must only affect some of the APIs.
Also, since I've implemented the heartbeat message, the bot has been running 6 hours straight without issues. It must be idling what kills it.
Pinging the bot wakes it up? That's... weird... pings don't come from some separate event, they are just encoded in plaintext; eg if you typed @bot
it would translate to <@id>
in the message creation event. Nothing separate should happen on pings, so that's interesting
seemingly the bot lost connection tonight and crashed right after reconnecting with the same TLS write error, stopped reading messages and has switched status to offline, but continued to send heartbeat messages... this is odd.
2020-10-11T01:59:02.414 [info] client.d:145:handleReadyEvent Recieved READY payload, starting heartbeater
Task terminated with uncaught exception: Writing to TLS stream: error:80000001:lib(128):func(0):reason(1) (2147483649)
Task terminated with uncaught exception: Writing to TLS stream: error:1409F07F:SSL routines:ssl3_write_pending:bad write retry (336195711)
Task terminated with uncaught exception: Writing to TLS stream returned an error: non-recoverable socket I/O error: 0 (Success)
Task terminated with uncaught exception: Writing to TLS stream returned an error: non-recoverable socket I/O error: 0 (Success)
Task terminated with uncaught exception: Writing to TLS stream returned an error: non-recoverable socket I/O error: 0 (Success)
Task terminated with uncaught exception: Writing to TLS stream returned an error: non-recoverable socket I/O error: 0 (Success)
Task terminated with uncaught exception: Writing to TLS stream returned an error: non-recoverable socket I/O error: 0 (Success)
I made the source code open if you wanted to take a look https://github.com/Soaku/MoaiBot/
Now I realized the bot stopped reading messages a few hours earlier. I'm lost...
That wasn't supposed to close the issue, but I believe that should fix it, and if not, at the very least it will provide more details about the heartbeating issue. What architecture are you hosting this bot on? The sequence number, which I believe cause the issue if it continues, goes up to roughly 1029 in 12 hours ((3600/42)*12+1, the +1 is because of the identify payload), but a uint should be big enough for a 32-bit or 64-bit architecture; regardless that'll now be logged so we can see if it overflows
Oh, by the way, you'll need to change your version to master
for this; I'll push it up if it works. Also added that bot property on users that you wanted; User#bot
is a boolean now.
@codic12 Thanks! I'll try updating and see if it helps. Architecture is x86_64
.
Nice. I have no idea why github is closing and reopening issues, that's... quite interesting
The bot has been going down rather frequently lately, it looks like the heartbeats can also trigger the error.
2020-10-15T08:42:49.236 [info] client.d:319:heartbeat Trying to heartbeat, seq 1397
2020-10-15T08:42:49.236 [info] client.d:321:heartbeat Successfully sent heartbeat!
2020-10-15T08:42:54.734 [info] client.d:319:heartbeat Trying to heartbeat, seq 1397
Task terminated with uncaught exception: Writing to TLS stream: error:80000001:lib(128):func(0):reason(1) (2147483649)
2020-10-15T08:43:06.156 [info] client.d:319:heartbeat Trying to heartbeat, seq 1397
Task terminated with uncaught exception: Writing to TLS stream: error:1409F07F:SSL routines:ssl3_write_pending:bad write retry (336195711)
2020-10-15T08:43:08.840 [info] client.d:319:heartbeat Trying to heartbeat, seq 1397
Task terminated with uncaught exception: Writing to TLS stream returned an error: non-recoverable socket I/O error: 0 (Success)
2020-10-15T08:43:16.080 [info] client.d:319:heartbeat Trying to heartbeat, seq 1397
Task terminated with uncaught exception: Writing to TLS stream returned an error: non-recoverable socket I/O error: 0 (Success)
2020-10-15T08:43:21.082 [info] client.d:319:heartbeat Trying to heartbeat, seq 1397
Task terminated with uncaught exception: Writing to TLS stream returned an error: non-recoverable socket I/O error: 0 (Success)
2020-10-15T08:43:30.497 [info] client.d:319:heartbeat Trying to heartbeat, seq 1397
Task terminated with uncaught exception: Writing to TLS stream returned an error: non-recoverable socket I/O error: 0 (Success)
2020-10-15T08:43:30.509 [info] client.d:319:heartbeat Trying to heartbeat, seq 1397
Task terminated with uncaught exception: Writing to TLS stream returned an error: non-recoverable socket I/O error: 0 (Success)
Oh, so apparently the this.seq
on the GatewayClient overflows at 1397. I wonder why, but I will look into this further by doing some testing; but it makes sense now
I don't think it's that number in particular. Take a look at this instance of the bug:
2020-10-16T11:22:43.210 [info] client.d:319:heartbeat Trying to heartbeat, seq 1042 2020-10-16T11:22:43.210 [info] client.d:321:heartbeat Successfully sent heartbeat! 2020-10-16T11:22:43.210 [info] client.d:319:heartbeat Trying to heartbeat, seq 1042 2020-10-16T11:22:43.210 [info] client.d:321:heartbeat Successfully sent heartbeat! 2020-10-16T11:22:43.210 [info] client.d:319:heartbeat Trying to heartbeat, seq 1042 2020-10-16T11:22:43.210 [info] client.d:321:heartbeat Successfully sent heartbeat! 2020-10-16T11:22:43.210 [info] client.d:319:heartbeat Trying to heartbeat, seq 1042 2020-10-16T11:22:43.210 [info] client.d:321:heartbeat Successfully sent heartbeat! 2020-10-16T11:22:48.918 [info] client.d:319:heartbeat Trying to heartbeat, seq 1043 2020-10-16T11:22:48.918 [info] client.d:321:heartbeat Successfully sent heartbeat! 2020-10-16T11:22:52.092 [info] client.d:319:heartbeat Trying to heartbeat, seq 1045 2020-10-16T11:22:52.092 [info] client.d:321:heartbeat Successfully sent heartbeat! 2020-10-16T11:23:03.189 [info] client.d:319:heartbeat Trying to heartbeat, seq 1048 2020-10-16T11:23:03.189 [info] client.d:321:heartbeat Successfully sent heartbeat! 2020-10-16T11:23:06.926 [info] client.d:319:heartbeat Trying to heartbeat, seq 1048 2020-10-16T11:23:06.927 [info] client.d:321:heartbeat Successfully sent heartbeat! 2020-10-16T11:23:06.927 [info] client.d:319:heartbeat Trying to heartbeat, seq 1048 2020-10-16T11:23:06.927 [info] client.d:321:heartbeat Successfully sent heartbeat! 2020-10-16T11:23:15.472 [trace] client.d:124:requestJSON [POST] [200] https://discord.com/api/channels/764440222183194624/messages: {"content":"heartbeat","tts":false,"nonce":""} 2020-10-16T11:23:15.583 [info] client.d:319:heartbeat Trying to heartbeat, seq 1054 2020-10-16T11:23:15.583 [info] client.d:321:heartbeat Successfully sent heartbeat! 2020-10-16T11:23:17.919 [info] client.d:319:heartbeat Trying to heartbeat, seq 1055 2020-10-16T11:23:17.919 [info] client.d:321:heartbeat Successfully sent heartbeat! 2020-10-16T11:23:17.919 [info] client.d:319:heartbeat Trying to heartbeat, seq 1055 2020-10-16T11:23:17.919 [info] client.d:321:heartbeat Successfully sent heartbeat! 2020-10-16T11:23:24.467 [info] client.d:319:heartbeat Trying to heartbeat, seq 1055 2020-10-16T11:23:24.467 [info] client.d:321:heartbeat Successfully sent heartbeat! 2020-10-16T11:23:24.467 [info] client.d:319:heartbeat Trying to heartbeat, seq 1055 2020-10-16T11:23:24.468 [info] client.d:321:heartbeat Successfully sent heartbeat! 2020-10-16T11:23:24.468 [info] client.d:319:heartbeat Trying to heartbeat, seq 1055 2020-10-16T11:23:24.468 [info] client.d:321:heartbeat Successfully sent heartbeat! 2020-10-16T11:23:24.468 [info] client.d:319:heartbeat Trying to heartbeat, seq 1055 2020-10-16T11:23:24.468 [info] client.d:321:heartbeat Successfully sent heartbeat! 2020-10-16T11:23:24.468 [info] client.d:319:heartbeat Trying to heartbeat, seq 1055 2020-10-16T11:23:24.468 [info] client.d:321:heartbeat Successfully sent heartbeat! 2020-10-16T11:23:30.174 [info] client.d:319:heartbeat Trying to heartbeat, seq 1055 2020-10-16T11:23:30.174 [info] client.d:321:heartbeat Successfully sent heartbeat! 2020-10-16T11:23:33.345 [info] client.d:319:heartbeat Trying to heartbeat, seq 1057 2020-10-16T11:23:33.345 [info] client.d:321:heartbeat Successfully sent heartbeat! 2020-10-16T11:23:44.441 [info] client.d:319:heartbeat Trying to heartbeat, seq 1059 2020-10-16T11:23:44.441 [info] client.d:321:heartbeat Successfully sent heartbeat! 2020-10-16T11:23:48.181 [info] client.d:319:heartbeat Trying to heartbeat, seq 1059 2020-10-16T11:23:48.182 [info] client.d:321:heartbeat Successfully sent heartbeat! 2020-10-16T11:23:48.182 [info] client.d:319:heartbeat Trying to heartbeat, seq 1059 2020-10-16T11:23:48.182 [info] client.d:321:heartbeat Successfully sent heartbeat! 2020-10-16T11:23:56.842 [info] client.d:319:heartbeat Trying to heartbeat, seq 1059 2020-10-16T11:23:56.843 [info] client.d:321:heartbeat Successfully sent heartbeat! 2020-10-16T11:23:59.173 [info] client.d:319:heartbeat Trying to heartbeat, seq 1059 2020-10-16T11:23:59.173 [info] client.d:321:heartbeat Successfully sent heartbeat! 2020-10-16T11:23:59.173 [info] client.d:319:heartbeat Trying to heartbeat, seq 1059 2020-10-16T11:23:59.173 [info] client.d:321:heartbeat Successfully sent heartbeat! 2020-10-16T11:24:05.724 [info] client.d:319:heartbeat Trying to heartbeat, seq 1059 2020-10-16T11:24:05.725 [info] client.d:321:heartbeat Successfully sent heartbeat! 2020-10-16T11:24:05.725 [info] client.d:319:heartbeat Trying to heartbeat, seq 1059 2020-10-16T11:24:05.725 [info] client.d:321:heartbeat Successfully sent heartbeat! 2020-10-16T11:24:05.725 [info] client.d:319:heartbeat Trying to heartbeat, seq 1059 2020-10-16T11:24:05.725 [info] client.d:321:heartbeat Successfully sent heartbeat! 2020-10-16T11:24:05.725 [info] client.d:319:heartbeat Trying to heartbeat, seq 1059 2020-10-16T11:24:05.725 [info] client.d:321:heartbeat Successfully sent heartbeat! 2020-10-16T11:24:05.725 [info] client.d:319:heartbeat Trying to heartbeat, seq 1059 2020-10-16T11:24:05.725 [info] client.d:321:heartbeat Successfully sent heartbeat! 2020-10-16T11:24:11.425 [info] client.d:319:heartbeat Trying to heartbeat, seq 1059 2020-10-16T11:24:11.426 [info] client.d:321:heartbeat Successfully sent heartbeat! 2020-10-16T11:24:14.599 [info] client.d:319:heartbeat Trying to heartbeat, seq 1059 2020-10-16T11:24:14.600 [info] client.d:321:heartbeat Successfully sent heartbeat! 2020-10-16T11:24:25.703 [info] client.d:319:heartbeat Trying to heartbeat, seq 1059 2020-10-16T11:24:25.703 [info] client.d:321:heartbeat Successfully sent heartbeat! 2020-10-16T11:24:29.436 [info] client.d:319:heartbeat Trying to heartbeat, seq 1059 Task terminated with uncaught exception: Writing to TLS stream: error:80000001:lib(128):func(0):reason(1) (2147483649) 2020-10-16T11:24:29.447 [info] client.d:319:heartbeat Trying to heartbeat, seq 1059 Task terminated with uncaught exception: Writing to TLS stream: error:1409F07F:SSL routines:ssl3_write_pending:bad write retry (336195711) 2020-10-16T11:24:38.102 [info] client.d:319:heartbeat Trying to heartbeat, seq 1059 Task terminated with uncaught exception: Writing to TLS stream returned an error: non-recoverable socket I/O error: 0 (Success) 2020-10-16T11:24:40.424 [info] client.d:319:heartbeat Trying to heartbeat, seq 1059 Task terminated with uncaught exception: Writing to TLS stream returned an error: non-recoverable socket I/O error: 0 (Success) 2020-10-16T11:24:40.435 [info] client.d:319:heartbeat Trying to heartbeat, seq 1059 Task terminated with uncaught exception: Writing to TLS stream returned an error: non-recoverable socket I/O error: 0 (Success)
Writing to TLS stream: error:80000001:lib(128):func(0):reason(1) (2147483649)
Tried debugging this further, and I got: (Error reading from underlying stream: Task interrupted.)
So it seems like Discord issues a disconnect which is why the socket closes. Why? Not sure.
Sometimes when my bot sends a message, I get either of the errors below:
So far, I haven't got the issue when testing locally, but only once deployed to my server. It happens only sometimes and I haven't been able to track when.
The only other place where I have seen that error referenced is https://github.com/dlang/dlang-bot/issues/247. What connects that issue and mine is the VPS provider, Scaleway, so it might be an issue on their side. I could try moving my bot to an orangepi to see if it is true and if so, contact them.
In the meanwhile, I'll try retrying sending the request if such exception is thrown and see if it helps me.