cloudsidedev / appside

Multitenant environment automation.
http://cloudside.ch
GNU Affero General Public License v3.0

atlantis service hangs #54

Closed by ivomarino 6 years ago

ivomarino commented 7 years ago

From time to time some services on atlantis freeze or hang. This issue is a starting point for debugging such situations. When atlantis freezes, we should check this way:

At this point we can debug via ssh to see whether any process is no longer running:

ssh atlantis "ps aux | grep apache2"
root      7025  0.0  1.2 485748 26236 ?        Ss   16:46   0:00 /usr/sbin/apache2 -k start
www-data  7036  0.0  2.1 489536 44856 ?        S    16:46   0:00 /usr/sbin/apache2 -k start
www-data  7037  0.1  2.9 572452 59940 ?        S    16:46   0:03 /usr/sbin/apache2 -k start
www-data  7038  0.1  3.1 502332 64296 ?        S    16:46   0:04 /usr/sbin/apache2 -k start
www-data  7039  0.1  3.9 573760 81392 ?        S    16:46   0:04 /usr/sbin/apache2 -k start
www-data  7040  0.2  5.0 597660 103220 ?       S    16:46   0:06 /usr/sbin/apache2 -k start
www-data  7868  0.0  2.5 501964 51756 ?        S    16:50   0:01 /usr/sbin/apache2 -k start
www-data  7937  0.0  1.6 489208 32884 ?        S    16:50   0:00 /usr/sbin/apache2 -k start
www-data  7938  0.0  1.5 489520 32168 ?        S    16:50   0:01 /usr/sbin/apache2 -k start
www-data  9159  0.2  2.5 502324 52408 ?        S    16:55   0:05 /usr/sbin/apache2 -k start
eim      18602  0.0  0.0  12828   992 ?        Ss   17:29   0:00 grep apache2

ssh atlantis "ps aux | grep varnish"
root      2339  0.0  0.2 126900  5288 ?        Ss   16:45   0:00 /usr/sbin/varnishd -P /var/run/varnishd.pid -a :6081 -T localhost:60821 -f /etc/varnish/default.vcl -S /etc/varnish/secret -s malloc,256m
nobody    2340  0.1  0.9 292884 19884 ?        Sl   16:45   0:05 /usr/sbin/varnishd -P /var/run/varnishd.pid -a :6081 -T localhost:60821 -f /etc/varnish/default.vcl -S /etc/varnish/secret -s malloc,256m
eim      18612  0.0  0.0  12828   992 ?        Ss   17:29   0:00 grep varnish

ssh atlantis "ps aux | grep haproxy"
haproxy   1362  0.0  0.3  42196  7068 ?        Ss   16:45   0:00 /usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -D -p /var/run/haproxy.pid
eim      18623  0.0  0.0  12832   992 ?        Ss   17:29   0:00 grep haproxy

The basic idea is to find out which component has crashed, so that we know where to start investigating.
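The three `ps aux | grep` calls above could be folded into one pass. This is only a rough sketch, not part of appflow: the function name `missing_services` is made up, and it assumes the daemons live under `/usr/sbin` as in the output above and that `atlantis` resolves through the local ssh config.

```shell
#!/bin/sh
# Print the name of every expected daemon that is absent from `ps` output on stdin.
# `missing_services` is a hypothetical helper; the service names and the /usr/sbin
# prefix are taken from the process listings above.
missing_services() {
    ps_output=$(cat)
    for svc in apache2 varnishd haproxy; do
        # Matching on the full path keeps a "grep <name>" process from counting as a hit.
        if ! printf '%s\n' "$ps_output" | grep -q "/usr/sbin/$svc"; then
            echo "$svc"
        fi
    done
}
```

Possible usage: `ssh atlantis "ps aux" | missing_services`. Empty output would mean all three daemons are present in the process table, which still does not prove they are answering requests.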

grappler commented 7 years ago

- ping atlantis, does the VM respond? Yes
- ssh atlantis, possible to login via ssh? No
- open https://atlantis/lb, is the LB running? No
- open http://atlantis, are haproxy, varnish and apache2 running? No
- open http://atlantis:8080, is apache2 only running? No
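The five checks above could be scripted so they are run the same way every time. The following is a sketch under assumptions, not an existing appflow command: `report` and `run_checks` are hypothetical names, the 5-second timeouts are arbitrary, and `curl -f` treats any HTTP error status (for example a password-protected `/lb` page) as "no".

```shell
#!/bin/sh
# Run the five reachability checks and print yes/no for each one.
# HOST defaults to "atlantis"; the function names and timeouts are assumptions.
HOST="${HOST:-atlantis}"

# report <label> <command...>: run the command quietly and print the outcome.
report() {
    label=$1
    shift
    if "$@" >/dev/null 2>&1; then
        echo "$label: yes"
    else
        echo "$label: no"
    fi
}

run_checks() {
    report "ping"         ping -c 1 "$HOST"
    report "ssh"          ssh -n -o BatchMode=yes -o ConnectTimeout=5 "$HOST" true
    report "lb"           curl -ksf --max-time 5 "https://$HOST/lb"
    report "http"         curl -sf --max-time 5 "http://$HOST"
    report "apache2-only" curl -sf --max-time 5 "http://$HOST:8080"
}
```

Calling `run_checks` after sourcing the file prints one line per check, which can be pasted straight into a comment here.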

ivomarino commented 7 years ago

@grappler any status update on this? Thanks.

grappler commented 7 years ago

I had to restart appflow a few times today while working on Swisscom.

- ping atlantis, does the VM respond? No
- ssh atlantis, possible to login via ssh? No
- open https://atlantis/lb, is the LB running? No
- open http://atlantis, are haproxy, varnish and apache2 running? No
- open http://atlantis:8080, is apache2 only running? No

ivomarino commented 7 years ago

This looks like a memory limit being hit to me. How much RAM does the OS X machine have?
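One way to answer the RAM question on OS X, as a small sketch: `sysctl -n hw.memsize` prints the physical memory in bytes on macOS, and the hypothetical helper `bytes_to_gib` below only converts that number into whole GiB.

```shell
#!/bin/sh
# Convert a byte count into whole GiB (integer division, rounds down).
# `bytes_to_gib` is a hypothetical helper name, not an existing tool.
bytes_to_gib() {
    echo $(( $1 / 1024 / 1024 / 1024 ))
}
```

Possible usage on the OS X side: `bytes_to_gib "$(sysctl -n hw.memsize)"`.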