Open obscurerichard opened 5 years ago
The server rebooted again today:
```
core@freezingsaddles ~ $ last | head
core pts/0 108.31.214.14 Wed Jan 9 14:12 still logged in
core ssh 108.31.214.14 Wed Jan 9 14:12 still logged in
reboot system boot 4.14.88-coreos Wed Jan 9 05:58 still running
core pts/0 108.31.214.14 Tue Jan 8 19:33 - 05:58 (10:24)
core ssh 108.31.214.14 Tue Jan 8 19:33 - 05:58 (10:24)
core pts/0 108.31.214.14 Mon Jan 7 04:46 - 05:52 (01:05)
core ssh 108.31.214.14 Mon Jan 7 04:46 - 05:52 (01:05)
core pts/0 108.31.214.14 Sun Jan 6 23:31 - 01:19 (01:47)
core ssh 108.31.214.14 Sun Jan 6 23:31 - 01:19 (01:47)
core pts/0 166.170.29.73 Sun Jan 6 21:57 - 22:27 (00:30)
core@freezingsaddles ~ $ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
1a014da34df7 freezingsaddles/freezing-web:latest "/bin/sh -c 'gunicor…" 19 hours ago Up Less than a second 8000/tcp freezing-web
cf81462b819c freezingsaddles/freezing-sync:latest "/bin/sh -c freezing…" 2 days ago Up Less than a second freezing-sync
4754ed31e3c7 datadog/docker-dd-agent:latest "/entrypoint.sh supe…" 2 days ago Up 1 second (health: starting) 8125/udp, 8126/tcp dd-agent
08378824b8e4 jrcs/letsencrypt-nginx-proxy-companion "/bin/bash /app/entr…" 11 days ago Up Less than a second nginx-letsencrypt
88ac5a904d8d gliderlabs/logspout "/bin/logspout syslo…" 3 weeks ago Up 1 second 80/tcp logspout
757607a67b08 freezingsaddles/freezing-nq:latest "/bin/sh -c 'gunicor…" 9 months ago Up Less than a second 8000/tcp freezing-nq
feed948b4215 compose_beanstalkd "beanstalkd -p 11300…" 10 months ago Up Less than a second 11300/tcp beanstalkd
core@freezingsaddles ~ $ cd /opt/compose/
core@freezingsaddles /opt/compose $ docker-compose up -d
dd-agent is up-to-date
Starting nginx-docker-gen ...
beanstalkd is up-to-date
logspout is up-to-date
Starting nginx ...
nginx-letsencrypt is up-to-date
freezing-nq is up-to-date
freezing-web is up-to-date
Starting nginx ... done
```
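In the transcript above, `nginx` and `nginx-docker-gen` were the only services that `docker-compose up -d` had to start after the reboot. That may point to a missing restart policy on those two containers, though the output alone does not prove it. A minimal sketch for checking, assuming the `docker` CLI is on the path and the current user can talk to the daemon (the function names are hypothetical):

```shell
#!/bin/sh
# Print "<name> <restart-policy>" for every container, running or stopped.
# Assumption: docker is installed and the daemon is reachable.
list_restart_policies() {
    docker ps -aq | while read -r id; do
        docker inspect --format '{{.Name}} {{.HostConfig.RestartPolicy.Name}}' "$id"
    done
}

# Read that listing on stdin and keep only containers whose policy is
# "no" or empty, i.e. ones Docker will not bring back after a reboot.
# The leading "/" that docker inspect puts on names is stripped.
without_restart_policy() {
    awk '$2 == "no" || $2 == "" { sub(/^\//, "", $1); print $1 }'
}
```

Running `list_restart_policies | without_restart_policy` on the host would list the containers that need a `restart:` setting in the compose file or a manual start after each reboot.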
Now that we've moved the site to AWS, and added some swap to the CentOS 7 Lightsail instance, we have not had this problem in practice. Over the competition, we've also tweaked the `docker-compose.yml` file a few times to find the optimal settings for the web server container running `nginx`. I have not had to restart this manually once this season.
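For anyone repeating the swap step, here is a sketch of how a swap file is commonly added on CentOS 7. The path `/swapfile` and the 1024 MiB size are assumptions for illustration; this thread does not record how the swap was actually created. The function only prints the commands, so they can be reviewed before being run as root.

```shell
#!/bin/sh
# Print the commands that create and enable a swap file.
# Arguments: $1 = swap file path (hypothetical, e.g. /swapfile)
#            $2 = size in MiB
# Nothing is executed here; pipe the output to `sudo sh` to apply it.
swap_setup_commands() {
    path=$1
    size_mib=$2
    cat <<EOF
dd if=/dev/zero of=$path bs=1M count=$size_mib
chmod 600 $path
mkswap $path
swapon $path
echo '$path none swap sw 0 0' >> /etc/fstab
EOF
}
```

For example, `swap_setup_commands /swapfile 1024` prints the five commands for a 1 GiB swap file. The sketch uses `dd` rather than `fallocate` because swap files created with `fallocate` are reported to fail on some filesystems.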
I spoke too soon:
Last login: Sat Mar 21 18:14:23 2020 from azathoth.bullington-mcguire.net
[centos@ip-172-26-7-85 ~]$ w
14:35:22 up 4:30, 1 user, load average: 3.19, 3.43, 3.68
USER TTY FROM LOGIN@ IDLE JCPU PCPU WHAT
centos pts/0 azathoth.bulling 14:35 2.00s 0.45s 0.13s w
[centos@ip-172-26-7-85 ~]$ free
total used free shared buff/cache available
Mem: 1013032 465964 177076 33812 369992 360864
Swap: 1048572 142336 906236
[centos@ip-172-26-7-85 ~]$ top
top - 14:35:39 up 4:30, 1 user, load average: 3.12, 3.41, 3.67
Tasks: 109 total, 4 running, 105 sleeping, 0 stopped, 0 zombie
%Cpu(s): 12.6 us, 0.7 sy, 0.0 ni, 0.0 id, 0.0 wa, 0.0 hi, 0.0 si, 86.6 st
KiB Mem : 1013032 total, 181488 free, 461240 used, 370304 buff/cache
KiB Swap: 1048572 total, 906236 free, 142336 used. 365624 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
1085 root 20 0 777676 55216 7384 S 80.2 5.5 211:37.28 dockerd
1481 root 20 0 18424 10032 1432 R 10.5 1.0 31:56.98 logspout
2171 root 20 0 45340 16292 3712 S 5.4 1.6 0:55.70 process-agent
2173 root 20 0 177920 35064 2832 S 1.6 3.5 1:10.43 python
15848 centos 20 0 164108 2272 1576 R 0.3 0.2 0:00.17 top
1 root 20 0 54416 5372 2844 S 0.0 0.5 0:28.97 systemd
2 root 20 0 0 0 0 S 0.0 0.0 0:00.00 kthreadd
4 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 kworker/0:0H
6 root 20 0 0 0 0 S 0.0 0.0 0:00.71 ksoftirqd/0
7 root rt 0 0 0 0 S 0.0 0.0 0:00.00 migration/0
8 root 20 0 0 0 0 S 0.0 0.0 0:00.00 rcu_bh
9 root 20 0 0 0 0 R 0.0 0.0 0:22.77 rcu_sched
10 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 lru-add-drain
11 root rt 0 0 0 0 S 0.0 0.0 0:01.67 watchdog/0
13 root 20 0 0 0 0 S 0.0 0.0 0:00.00 kdevtmpfs
14 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 netns
15 root 20 0 0 0 0 S 0.0 0.0 0:00.02 xenwatch
16 root 20 0 0 0 0 S 0.0 0.0 0:00.00 xenbus
18 root 20 0 0 0 0 S 0.0 0.0 0:00.00 khungtaskd
19 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 writeback
20 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 kintegrityd
21 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 bioset
22 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 bioset
23 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 bioset
24 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 kblockd
25 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 md
26 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 edac-poller
27 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 watchdogd
32 root 20 0 0 0 0 S 0.0 0.0 0:02.11 kswapd0
33 root 25 5 0 0 0 S 0.0 0.0 0:00.00 ksmd
34 root 39 19 0 0 0 S 0.0 0.0 0:00.02 khugepaged
35 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 crypto
43 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 kthrotld
45 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 kmpath_rdacd
46 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 kaluad
47 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 kpsmoused
49 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 ipv6_addrconf
62 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 deferwq
[centos@ip-172-26-7-85 ~]$ cd /opt/compose
[centos@ip-172-26-7-85 compose]$ docer-compose logs > ~/log/hang.txt
-bash: docer-compose: command not found
[centos@ip-172-26-7-85 compose]$ docker-compose logs > ~/log/hang.txt
[centos@ip-172-26-7-85 compose]$ restart
WARNING: The ENVIRONMENT variable is not set. Defaulting to a blank string.
beanstalkd is up-to-date
logspout is up-to-date
dd-agent is up-to-date
nginx-letsencrypt is up-to-date
Starting nginx-docker-gen ...
Starting nginx ...
freezing-web is up-to-date
Starting nginx-docker-gen ... done
Starting nginx ... done
[centos@ip-172-26-7-85 compose]$ uptime
15:58:48 up 5:53, 1 user, load average: 4.55, 4.09, 3.97
[centos@ip-172-26-7-85 compose]$
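One detail worth noting in the `top` output above: the `%Cpu(s)` line shows 86.6 st (steal) and 0.0 id. The hypervisor was withholding most of the CPU time from this instance while the load average sat above 3. On a burstable instance that could mean the CPU allowance was exhausted, but that is a guess, not something this transcript confirms. A sketch for pulling that number out in a script, assuming the procps `top` line format shown above (the function name is hypothetical):

```shell
#!/bin/sh
# Read top output on stdin and print the steal percentage from the
# "%Cpu(s):" summary line. Assumes the procps layout "<value> st".
cpu_steal() {
    awk -F'[:,]' '/^%Cpu/ {
        for (i = 2; i <= NF; i++) {
            # Each field looks like " 86.6 st"; split it into value and label.
            n = split($i, parts, " ")
            if (n == 2 && parts[2] == "st") print parts[1]
        }
    }'
}
```

On the host this could be run as `top -bn1 | cpu_steal`. Logging the value alongside `docker-compose logs` would show whether a hang lines up with high steal.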
This is going back in the icebox until we have this problem again.
Right now we have outstanding uptime:
[centos@ip-172-31-86-46 compose]$ uptime
22:17:26 up 715 days, 20:13, 1 user, load average: 0.00, 0.01, 0.05