Open Simon-L opened 6 years ago
I have made some more experiments with this package.
Indeed it seems the software is crashing and the container restarts repeatedly, sometimes after running for 3 minutes, sometimes after 20 minutes.
I am also noticing that it doesn't always crash after the errors reported in the previous message; it can happen seemingly at random. The first line is always:
Killed
which makes me think it is hanging at some point and also puts me in an uncomfortable situation where I have absolutely no idea where to look in the code!
I'm using docker-compose but I can confirm these instability issues also happen on bare metal Debian.
I'm experiencing similar issues at ssb.learningsocieties.org, which is also running at DigitalOcean on a standalone droplet.
Every once in a while I check whether the server is still up in practice, with varying results. The statistics paint a good picture:
Sometimes the server restarts and resumes correct behaviour, but sometimes the bandwidth drops to 0 and CPU/memory usage spike for extended periods of time. The logged errors are similar.
I don't really have time to troubleshoot it properly, sadly... For the time being I'll attempt to keep the pub up and running as best I can.
I moved over to dinosaur's image last night, with a separate webserver docker image. I also upgraded to 2GB RAM, just to be sure. The statistics look a lot healthier.
Not sure why that is exactly; perhaps the healer container makes the difference?
For now I've generated an invite that allows 1000 additional people to join, which should practically do the same as the previous auto-generation. Plus, once the server gets that size it'd be time to lock down anyhow...
I've had problems with dinosaur's image as well as easy-ssb-pub's image, but I noticed the cause is probably that my 3-hop social graph is too large for the server. Setting hops to 1 helped (with dinosaur's image); the server is pretty stable now. I could try hops=2 as well.
Just 2 cents if anyone is looking to improve stability.
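For anyone who wants to try the same hops change, here is a minimal sketch of how it could be scripted. It assumes the server reads a JSON config file and that the replication range lives under the `friends.hops` key, as in ssb-config; the `set_hops` helper and the path passed to it are hypothetical, and where the config file actually lives depends on the image you run.

```python
import json
from pathlib import Path


def set_hops(config_path: Path, hops: int) -> dict:
    """Set friends.hops in a JSON config file, keeping all other keys.

    Hypothetical helper: the file location and the friends.hops key are
    assumptions based on ssb-config, not something this repo documents.
    """
    if hops < 0:
        raise ValueError("hops must be zero or greater")

    config = {}
    if config_path.exists():
        text = config_path.read_text().strip()
        # An empty file is treated as an empty config.
        config = json.loads(text) if text else {}

    # Create the "friends" section if it is missing, then set hops.
    config.setdefault("friends", {})["hops"] = hops

    config_path.write_text(json.dumps(config, indent=2) + "\n")
    return config
```

Calling `set_hops(Path("/path/to/ssb/config"), 1)` and then restarting the container should apply the lower replication range, assuming the image mounts that file as its config.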
I've tried to track this and document as best as possible.
As of now, cloning this repo and running the docker-compose commands on a fresh DigitalOcean server results in unwanted behaviour.
This is the log from the first start; the error shown here appears a few minutes after startup.
Various other errors are reported:
Or (with timestamps):
These logs were obtained using `docker logs`. The first time I ran it after running the docker-compose command, it stopped on these lines:
Looks like the process is killed either by the Docker daemon or by npm inside the container; the container then just restarts.
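One way to narrow down who is doing the killing is to look at what Docker recorded about the container. The sketch below parses the JSON printed by `docker inspect` and reports the `State.OOMKilled` flag, the exit code (137 means the process received SIGKILL) and the restart count. The function names are hypothetical, and the fields only describe the most recent stop Docker recorded, so treat the result as a hint rather than proof that memory is the cause.

```python
import json
import subprocess

SIGKILL_EXIT_CODE = 137  # 128 + signal number 9 (SIGKILL)


def summarize_container_state(inspect_json: str) -> dict:
    """Summarize the first container found in `docker inspect` JSON output."""
    containers = json.loads(inspect_json)
    if not containers:
        raise ValueError("no container found in inspect output")

    info = containers[0]
    state = info.get("State", {})
    exit_code = state.get("ExitCode")

    return {
        "oom_killed": bool(state.get("OOMKilled", False)),
        "exit_code": exit_code,
        "killed_by_sigkill": exit_code == SIGKILL_EXIT_CODE,
        "restart_count": info.get("RestartCount", 0),
    }


def inspect_container(name: str) -> dict:
    """Run `docker inspect <name>` and summarize the result."""
    completed = subprocess.run(
        ["docker", "inspect", name],
        check=True,
        capture_output=True,
        text=True,
    )
    return summarize_container_state(completed.stdout)
```

Running `inspect_container("<container name>")` shortly after a restart and seeing `oom_killed` set would point at memory pressure rather than npm or the application code.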
I've made sure to run these tests on a fresh install. I have personally run the same docker image on my own server and, in addition to the issues reported here, had several others, probably related to #27.
I have taken note of the maintenance and status warning in the README; for the time being I think this repo works for many people, so I figured I should share my investigation!