tedztar / mcstatusbot

Discord bot for server status
MIT License
98 stars 93 forks

Partial outage from Apr 11 5PM to Apr 13 12PM EST #209

Closed RahulR100 closed 6 months ago

RahulR100 commented 6 months ago

Hello folks!

We are currently investigating an outage that began on April 11 at around 5 PM Eastern and caused some shards of the bot to become unresponsive over time. Our current belief is that the automatic re-clustering failed, locking the shards at their maximum capacity. The bot has now been restarted and is operating normally. We are going to look at the underlying code to see whether something on our end might have triggered this.

Apologies for any inconvenience and again, thanks for your support!

RahulR100 commented 6 months ago

As a short-term fix, a heartbeat system has been added to make sure that the bot does not go offline for extended periods of time. As for a longer-term fix, the issue currently appears to be that the Discord websocket closed but was not reopened for some reason. The investigation continues, but I will close this issue for now!
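
For anyone curious what a heartbeat check of this kind might look like, here is a minimal sketch in Python. It is not the bot's actual code: the names `HeartbeatMonitor`, `check_and_restart`, and the `restart` callback are hypothetical, and the timeout value is an assumption chosen for illustration. The idea is that each shard reports in periodically, and a supervisor restarts any shard that has been silent for too long.

```python
import time
from typing import Callable, Dict, List


class HeartbeatMonitor:
    """Hypothetical helper: remembers when each shard last reported in."""

    def __init__(self, timeout_s: float,
                 clock: Callable[[], float] = time.monotonic):
        self.timeout_s = timeout_s
        self._clock = clock  # injectable so the logic can be tested
        self._last_beat: Dict[int, float] = {}

    def beat(self, shard_id: int) -> None:
        """Record that a shard is alive right now."""
        self._last_beat[shard_id] = self._clock()

    def stale_shards(self) -> List[int]:
        """Return the shards silent for longer than the timeout."""
        now = self._clock()
        return sorted(
            shard for shard, last in self._last_beat.items()
            if now - last > self.timeout_s
        )


def check_and_restart(monitor: HeartbeatMonitor,
                      restart: Callable[[int], None]) -> List[int]:
    """Restart every stale shard and reset its timer.

    `restart` is an assumed callback supplied by the caller; how a shard
    is actually respawned depends on the bot framework in use.
    """
    stale = monitor.stale_shards()
    for shard_id in stale:
        restart(shard_id)
        monitor.beat(shard_id)  # avoid restarting the same shard every pass
    return stale
```

In this sketch a supervisor would call `check_and_restart` on a fixed interval, while each shard calls `beat` whenever it handles an event or a timer tick. Using a monotonic clock keeps the check unaffected by system clock adjustments.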