theADAMJR closed this issue 2 years ago.
It has happened again, at around 2 AM today. I haven't tested the WS yet, so I will do that today. There is a decent chance that fully unit testing the WS could uncover the bug. If it turns out to be a hosting issue, I could consider switching to Docker, though I doubt it is one, as I've seen this on both AWS and DigitalOcean with different apps.
Will move to Docker for deployment and development. Hopefully this will help address the issue.
It seems to be happening again. The hosting environment was not the problem, and moving to Docker did not fix it. I believe an infinite loop is being executed when some obscure condition is met. The server won't be restarted for a while, as AWS returns 500 errors when I try to log in. The best bet for finding the bug is to look for changes made around Nov 08. I don't currently have access to my dev PC, so that won't happen for a while. Last July, a similar problem caused the demise of one of my projects (3PG); I was unable to find the bug despite months of investigation. The good news is that I don't intend to abandon this project anytime soon, so the odds of a silly line of code ending accord.app are slim.
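If it really is a synchronous infinite loop, it would block Node's event loop, so timer callbacks fire late. A small lag monitor can log just before the hang, and the last logged line narrows down which request triggered it. This is only a sketch under that assumption; the interval and threshold are guesses:

```javascript
// Event-loop lag monitor (hypothetical): a blocked loop makes setInterval
// callbacks fire late, so large lag values flag the moment trouble starts.
const INTERVAL_MS = 100;

// Pure helper: how much later than scheduled did the tick arrive?
function lagMs(prevTick, nowTick, intervalMs) {
  return Math.max(0, nowTick - prevTick - intervalMs);
}

let prev = Date.now();
const monitor = setInterval(() => {
  const now = Date.now();
  const lag = lagMs(prev, now, INTERVAL_MS);
  if (lag > 50) console.warn(`event loop blocked for ~${lag}ms`);
  prev = now;
}, INTERVAL_MS);
monitor.unref(); // don't keep the process alive just for the monitor
```

Correlating these warnings with request logs from around Nov 08 would be one way to find the obscure condition without a reliable repro.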
AWS (accord.app host) is currently experiencing difficulties and outages.
My development PC is back, so I will be able to update accord.app properly. I've added features to the accord ion test framework, which I mainly use in accord.app's API tests. I also have some cool features planned. The crashes have been less frequent than before, but this is likely due to slightly lower user activity. I will continue to restart the server whenever it goes offline.
Describe the bug: Something is causing the server to crash. After I updated channel settings (permissions) for an accord.app channel, the server stopped responding almost immediately, which suggests this action may be the cause.
To Reproduce: no reliable steps yet (see Additional context).
Expected behavior: No server crash.
Screenshots: [screenshot] Updating channel perms for the second time.
Desktop: Google Chrome, Version 95.0.4638.69 (Official Build) (64-bit).
Additional context: Note: I could not reproduce this a second time. It may be a coincidence that the server crashed after I changed channel permissions, meaning there may be another cause. An infinite loop is suspected, as there are large CPU spikes that make the AWS VPS unresponsive. A memory leak is also possible, but the intervals between crashes were inconsistent.
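To separate the infinite-loop theory from the memory-leak theory, the process can sample its own heap and CPU usage periodically: a loop shows pegged CPU with flat heap, while a leak shows heapUsed climbing steadily across samples. A minimal sketch using Node's built-in `process` API (the sampling interval is an arbitrary choice):

```javascript
// Resource sampler (hypothetical): log trends, not single values.
let prevCpu = process.cpuUsage();

function sampleResources() {
  const cpuDelta = process.cpuUsage(prevCpu); // µs of CPU since last sample
  prevCpu = process.cpuUsage();
  const heapMb = process.memoryUsage().heapUsed / (1024 * 1024);
  const cpuMs = (cpuDelta.user + cpuDelta.system) / 1000;
  return { heapMb, cpuMs };
}

// In production, something like:
// setInterval(() => console.log(sampleResources()), 30_000);
```

Persisting these samples to a log file would also explain the inconsistent intervals: a slow leak and a rare loop trigger would leave very different traces.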
[screenshot] Backend VPS CPU usage.