Wizcorp / frontrunner

Automated HAProxy reconfiguration for Marathon
https://github.com/Wizcorp/frontrunner
MIT License

When too many ops happen at once, I end up with more than one haproxy daemon process #10

Open stelcheck opened 10 years ago

stelcheck commented 10 years ago
node1.mesos1 | success | rc=0 >>
haproxy   4007  0.0  0.0  18924  1648 ?        Ss   09:05   0:00 /usr/sbin/haproxy -D -f /etc/haproxy/haproxy.cfg -p /var/run/haproxy.pid -sf 3981
haproxy  25615  0.0  0.0  19100  1748 ?        Ss   10:43   0:00 /usr/sbin/haproxy -D -f /etc/haproxy/haproxy.cfg -p /var/run/haproxy.pid -sf 25515

This seems to happen when many operations affecting the configuration occur at almost the same time (within a second or two, sometimes a bit more).

stelcheck commented 10 years ago

Reopening, the PR did not solve the issue:

haproxy  13474  0.0  0.0  19072   824 ?        Ss   Jun08   5:50 /usr/sbin/haproxy -D -f /etc/haproxy/haproxy.cfg -p /var/run/haproxy.pid -sf 13425
haproxy  13502  0.0  0.0  19104   812 ?        Ss   Jun15   2:48 /usr/sbin/haproxy -D -f /etc/haproxy/haproxy.cfg -p /var/run/haproxy.pid -sf 13470
haproxy  14252  0.0  0.0  19068   788 ?        Ss   Jun02  10:41 /usr/sbin/haproxy -D -f /etc/haproxy/haproxy.cfg -p /var/run/haproxy.pid -sf 14221
haproxy  19623  0.0  0.0  19068   784 ?        Ss   Jun02  10:44 /usr/sbin/haproxy -D -f /etc/haproxy/haproxy.cfg -p /var/run/haproxy.pid -sf 19597
haproxy  20199  0.0  0.0  19088   804 ?        Ss   Jun11   5:53 /usr/sbin/haproxy -D -f /etc/haproxy/haproxy.cfg -p /var/run/haproxy.pid -sf 20169
haproxy  20244  0.0  0.0  19088  1844 ?        Ss   Jun20   1:06 /usr/sbin/haproxy -D -f /etc/haproxy/haproxy.cfg -p /var/run/haproxy.pid -sf 20187
haproxy  21511  0.0  0.0  19068   788 ?        Ss   Jun03  10:12 /usr/sbin/haproxy -D -f /etc/haproxy/haproxy.cfg -p /var/run/haproxy.pid -sf 21485
haproxy  23099  0.0  0.0  19088  1852 ?        Ss   Jun22   0:10 /usr/sbin/haproxy -D -f /etc/haproxy/haproxy.cfg -p /var/run/haproxy.pid -sf 23049
haproxy  26049  0.0  0.0  19068   796 ?        Ss   Jun01  11:25 /usr/sbin/haproxy -D -f /etc/haproxy/haproxy.cfg -p /var/run/haproxy.pid -sf 26018
haproxy  27856  0.0  0.0  19068   788 ?        Ss   Jun05   9:04 /usr/sbin/haproxy -D -f /etc/haproxy/haproxy.cfg -p /var/run/haproxy.pid -sf 27830
MiLk commented 10 years ago

I was able to reproduce the issue.

If I open a connection and keep it open, then scale an app, the previous HAProxy instance is not closed immediately: it keeps running until its existing connections end.

By repeating this pattern several times, you can end up with several haproxy instances.

If one of your apps keeps a connection open for a long time, you could end up in this state.

Having multiple haproxy instances doesn't seem to be a problem.

There is only a problem if several instances try to listen on the same port. I don't remember exactly what the original issue was.

We can manually send SIGTTOU and SIGUSR1 to all haproxy processes before reloading, to be sure that every process receives them.

Next time you see this issue, could you check whether each running haproxy process has an established connection?
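
One possible way to run that check, as a rough sketch rather than anything frontrunner provides: list established TCP connections with `ss` and count the ones owned by each haproxy PID. The function names are made up for illustration, and the `pid=<n>,` field format is an assumption about the local `ss` version (older iproute2 releases print the process column differently). Seeing sockets owned by another user normally requires root.

```shell
#!/bin/sh
# Count the connections owned by PID $1, reading `ss -tnp`-style output on stdin.
# Assumes the process column looks like: users:(("haproxy",pid=4007,fd=5))
count_conns_for_pid() {
    # grep -c prints 0 and exits non-zero when nothing matches, hence `|| true`.
    grep -cF -- "pid=$1," || true
}

# Print "<pid> <established connection count>" for every running haproxy process.
# Hypothetical helper; it is defined here but not called.
report_haproxy_conns() {
    for pid in $(pgrep -x haproxy); do
        n=$(ss -H -tnp state established 2>/dev/null | count_conns_for_pid "$pid")
        echo "$pid $n"
    done
}
```

Calling `report_haproxy_conns` as root on the affected node should show whether the old processes are each still holding at least one connection.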

stelcheck commented 10 years ago

It is very likely that there were.

MiLk commented 9 years ago

With haproxy -f /etc/haproxy/haproxy.cfg -p /var/run/haproxy.pid -sf $(ps -C haproxy -o pid | tail -n +2 | tr "\\n" " ") as the reloadCommand, SIGTTOU and SIGUSR1 should be sent to every running HAProxy process, not only the last one to have written its PID.
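
To make the pieces of that one-liner easier to inspect, here is the same idea split into two small functions. This is only a sketch: the function names are hypothetical, and `awk` stands in for the `tail | tr` pipeline so the PID list comes out without stray whitespace. It prints the command instead of running it.

```shell
#!/bin/sh
# Turn `ps -C haproxy -o pid` output (a header line, then one PID per line)
# into a single space-separated list of PIDs.
pids_from_ps() {
    awk 'NR > 1 { printf "%s%s", sep, $1; sep = " " } END { print "" }'
}

# Build the reload command for the old PIDs given in $1 (may be empty).
# The paths are the ones used in this thread.
build_reload_cmd() {
    cmd="haproxy -f /etc/haproxy/haproxy.cfg -p /var/run/haproxy.pid"
    if [ -n "$1" ]; then
        # -sf asks every listed process to finish and exit.
        cmd="$cmd -sf $1"
    fi
    printf '%s\n' "$cmd"
}
```

On a live node, `build_reload_cmd "$(ps -C haproxy -o pid | pids_from_ps)"` prints the command that would be run, which can be compared against the `-sf` arguments seen in the `ps` listings above.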

MiLk commented 9 years ago

See the commit message https://github.com/QubitProducts/bamboo/commit/3bf7fd5864af540fe760a558abfca5f509cd0e80

However, I'm not sure we can have more than one pid in /var/run/haproxy.pid.