Closed sttts closed 10 years ago
I'm marking this as a duplicate of #30; we are working on a fix for this.
Though this has nothing to do with multiple concurrent reloads. It even happens with only a single one.
I used docker-enter (https://github.com/Pithikos/docker-enter) and called the command manually for testing.
I suspect it might actually be the same issue as #30: you struck the root cause, and the multiple concurrent updates simply cause an acute case of what you're describing.
I think so. The
Though the problem in #30 might be connected, because concurrent reloads might create race conditions with the /var/run/haproxy.pid file. One has to wait for all the PIDs in there to actually exit before another reload can be started. This might also create multiple processes, though not zombies but real, running haproxy instances. On the other hand, I guess the processes in the latter situation will probably die immediately because they cannot bind to ports 80/443.
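To make the "wait for all those PIDs to exit" step concrete, here is a rough sketch in Python. It is not what bamboo does. The helper names (`read_pids`, `pid_alive`, `wait_for_exit`) are made up for illustration, and it assumes the pid file holds whitespace-separated PIDs. Note that `kill(pid, 0)` still succeeds for a zombie, so unreaped zombies would keep this wait from ever finishing.

```python
import os
import time


def read_pids(path="/var/run/haproxy.pid"):
    """Parse PIDs from a pid file (assumed format: whitespace-separated integers)."""
    with open(path) as f:
        return [int(token) for token in f.read().split()]


def pid_alive(pid):
    """Return True if the PID still exists in the process table (zombies included)."""
    try:
        os.kill(pid, 0)  # signal 0 only checks for existence, nothing is delivered
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def wait_for_exit(pids, timeout=10.0, interval=0.1):
    """Poll until every PID is gone; return False if the timeout expires first."""
    deadline = time.monotonic() + timeout
    remaining = set(pids)
    while remaining:
        remaining = {p for p in remaining if pid_alive(p)}
        if not remaining:
            break
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
    return True
```

A reload wrapper could call `wait_for_exit(read_pids())` and only start the next reload once it returns True.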
Until we solve this issue, I'll accept it as a likely scenario that, if the way we call reload creates a zombie, calling reload n times creates n zombies. At any rate, this issue will confound any work on #30.
At 889, one of our Mesos servers started hitting OOMs. So yes, they add up with every reload.
My test cluster is working without any zombie for 2 days now, with plenty of deployments in Marathon.
Every reload command leaves the previous haproxy process behind as a zombie. Because neither bamboo nor docker waits for them, they never go away.
On a real Unix system (if bamboo is installed as a package), init waits for the exited processes, so this doesn't happen.
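For illustration, this is roughly what "waiting for the exited processes" means. It is a minimal Python sketch of the reaping that init normally does, not bamboo's actual fix. `reap_children` is a hypothetical helper; the only system call it relies on is `waitpid` with `WNOHANG`.

```python
import os


def reap_children():
    """Collect every child that has already exited and return the reaped PIDs."""
    reaped = []
    while True:
        try:
            # -1 means "any child"; WNOHANG returns at once instead of blocking.
            pid, _status = os.waitpid(-1, os.WNOHANG)
        except ChildProcessError:
            break  # this process has no children at all
        if pid == 0:
            break  # children exist, but none of them has exited yet
        reaped.append(pid)
    return reaped
```

Whatever runs as PID 1 in the container could call something like this on SIGCHLD or on a timer. Once a zombie has been waited for, the kernel drops its process table entry.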