theADAMJR / acrd.app

Chat app like old-style Discord, with custom themes and more.
https://acrd.app/
Mozilla Public License 2.0
246 stars, 111 forks

[BUG] Server keeps crashing at random intervals #34

Closed: theADAMJR closed this issue 2 years ago

theADAMJR commented 2 years ago

Describe the bug: Something is causing the server to crash. After I updated the channel settings (permissions) for an accord.app channel, the server stopped responding almost immediately, so the permissions update is the probable cause.

To Reproduce: Steps to reproduce the behavior:

  1. Go to channel settings (text channel)
  2. Click on 'Perms'
  3. Toggle permissions
  4. Save changes
  5. Cya server

Expected behavior: No server crash.

Screenshots: [image] Updating channel perms for the second time.

Desktop (please complete the following information): Google Chrome, Version 95.0.4638.69 (Official Build) (64-bit).

Additional context: I could not reproduce the crash the second time. It may be a coincidence that the server crashed after I changed the channel settings, meaning there may be another cause. I suspect an infinite loop, as there are large CPU spikes that make the AWS VPS unresponsive. A memory leak is also possible, but the intervals between crashes were inconsistent.

[image] Backend VPS CPU usage.
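
One way to tell the infinite-loop and memory-leak hypotheses apart is to log heap and CPU usage from inside the Node.js process and read the last entries before each crash. The following is only a rough sketch, not code from the accord.app codebase: `ResourceSample`, `takeSample`, `heapGrowthPerMinute`, `cpuLoadBetween` and `startSampler` are hypothetical names, and the only real APIs assumed are Node's `process.memoryUsage()` and `process.cpuUsage()`.

```typescript
// Hypothetical diagnostic helper; none of these names come from the accord.app codebase.
interface ResourceSample {
  timestampMs: number;   // wall-clock time of the sample
  heapUsedBytes: number; // from process.memoryUsage().heapUsed
  cpuUserMicros: number; // cumulative user CPU time, from process.cpuUsage().user
}

// Take one snapshot of the current process.
function takeSample(): ResourceSample {
  return {
    timestampMs: Date.now(),
    heapUsedBytes: process.memoryUsage().heapUsed,
    cpuUserMicros: process.cpuUsage().user,
  };
}

// Average heap growth in bytes per minute between the first and last sample.
function heapGrowthPerMinute(samples: ResourceSample[]): number {
  if (samples.length < 2) return 0;
  const first = samples[0];
  const last = samples[samples.length - 1];
  const minutes = (last.timestampMs - first.timestampMs) / 60000;
  return minutes > 0 ? (last.heapUsedBytes - first.heapUsedBytes) / minutes : 0;
}

// Approximate fraction of one CPU core spent in user code between two samples.
function cpuLoadBetween(a: ResourceSample, b: ResourceSample): number {
  const elapsedMicros = (b.timestampMs - a.timestampMs) * 1000;
  return elapsedMicros > 0 ? (b.cpuUserMicros - a.cpuUserMicros) / elapsedMicros : 0;
}

// Log a summary every intervalMs, keeping at most maxSamples in memory.
function startSampler(intervalMs: number, maxSamples = 120): NodeJS.Timeout {
  const samples: ResourceSample[] = [];
  const timer = setInterval(() => {
    samples.push(takeSample());
    if (samples.length > maxSamples) samples.shift();
    if (samples.length >= 2) {
      const prev = samples[samples.length - 2];
      const curr = samples[samples.length - 1];
      console.log(
        `heap ${heapGrowthPerMinute(samples).toFixed(0)} B/min, ` +
        `cpu ${(cpuLoadBetween(prev, curr) * 100).toFixed(1)}%`
      );
    }
  }, intervalMs);
  timer.unref(); // do not keep the process alive just for the sampler
  return timer;
}
```

Calling something like `startSampler(30000)` at server startup would be enough to collect a trail. A synchronous infinite loop blocks the event loop, so the log would simply go silent at that point. If the final entries show steady heap growth, a leak looks more likely; if the heap was flat and the log just stops while the VPS shows a CPU spike, that would be more consistent with a loop.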

theADAMJR commented 2 years ago

It happened again, at around 2AM today. I haven't tested the WS yet, so I will do that today. There is a decent chance that fully unit testing the WS will uncover the bug. If it's a hosting issue, I could consider switching to Docker, but I doubt it is, as I've had this on AWS and DigitalOcean with different apps.

theADAMJR commented 2 years ago

Will switch to Docker for deployment and development. Hopefully this will help address the issue.

theADAMJR commented 2 years ago

#34: Using containerized deployment (Docker) for accord.app. The idea is that Docker will restart the server when something goes wrong, and should provide an overall more secure and stable environment.

theADAMJR commented 2 years ago

It seems to be happening again. The hosting environment was not the problem, and implementing Docker did not fix this. I believe it's an infinite loop that gets executed when an obscure condition is met. The server won't be restarted for a while, as AWS gives 500 errors when I try to log in. The best bet for finding the bug would be to look for changes around Nov 08. I don't currently have dev PC access, so that won't happen for a while. Last July, a similar problem caused the demise of one of my projects (3PG). I was unable to find the bug, despite months of investigation. The good news is that I don't intend to abandon this project anytime soon, so the odds of a silly line of code ending accord.app are slim.

theADAMJR commented 2 years ago

AWS (accord.app host) is currently experiencing difficulties and outages.

theADAMJR commented 2 years ago

My development PC is back, so I will be able to update accord.app properly. I've added features to the accord ion test framework so I can use it mainly in accord.app API tests. I also have some cool features planned. The crashes have been less frequent than before, but this is due to slightly lower user activity. I will continue to restart the server after it goes offline.