Open legopost opened 6 years ago
Hello @legopost, thanks for taking the time to submit.
Unfortunately I am unable to reproduce the issue. While the number of connections does go up briefly when the HUP is received, it goes back down to normal levels over time as the blocking HTTP calls time out or finish. Using your basic config (modified to work with a test template) and script, it outputs repeating instances of...
Thu 13 Jun 2019 03:46:45 PM PDT 3 8
Thu 13 Jun 2019 03:50:45 PM PDT 3 8
...
Are you still seeing this issue? If so, maybe you could try to create a minimal test template that reproduces it, since the problem seems to be related to the template.
Thanks.
@eikenb How did you configure your consul-template? The main point here is that there need to be connections between consul and consul-template, so that we can watch the hung connections accumulate. In @legopost's report, there are 1784 config files. You don't need to configure that many, but I guess 10+ config files should be enough to reproduce the problem and see the connections hang clearly.
I hit the same issue on my system, which has 220 config files:
For a newly started consul-template, the connection counts are as follows:
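A repro setup with 10+ config files could be generated along these lines. This is only a sketch: the function name `gen_repro_configs`, the directory layout, the file names, and the `repro/N` key paths are all hypothetical and not taken from either reporter's setup. It assumes each config file holds one `template` stanza whose template reads a single Consul KV key.

```shell
#!/bin/sh
# Generate N trivial templates plus one config file per template.
# Hypothetical helper: names, paths, and keys are illustrative only.
gen_repro_configs() {
    dir="$1"
    count="$2"
    mkdir -p "$dir/templates" "$dir/conf.d" "$dir/out"
    i=1
    while [ "$i" -le "$count" ]; do
        # Each template depends on exactly one Consul KV key.
        printf '{{ key "repro/%s" }}\n' "$i" > "$dir/templates/t$i.ctmpl"
        # One config file per template, mirroring the many-config-files setup.
        cat > "$dir/conf.d/t$i.hcl" <<EOF
template {
  source      = "$dir/templates/t$i.ctmpl"
  destination = "$dir/out/t$i.txt"
}
EOF
        i=$((i + 1))
    done
}
```

For example, `gen_repro_configs /tmp/ct-repro 12` writes twelve config files under `/tmp/ct-repro/conf.d`, which could then be passed to consul-template as a config directory. The keys would need to exist in Consul for the templates to render.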
$ echo -n "All connections: "; netstat -pan|grep 8500|wc -l; echo -n "ESTABLISHED: "; netstat -pan|grep 8500|grep -c ESTABLISHED; echo -n "WAIT: "; netstat -pan|grep 8500|grep -c WAIT; echo -n "LISTENING: "; netstat -pan|grep 8500|grep -c LISTEN;
All connections: 249
ESTABLISHED: 240
WAIT: 8
LISTENING: 1
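The one-liner above runs netstat four times, so the four numbers come from slightly different snapshots. A minimal sketch of a single-pass alternative follows. The function name `count_conn_states` is hypothetical, and it matches `:8500` rather than the bare string `8500`, which is a slightly stricter filter than the original grep.

```shell
#!/bin/sh
# Summarize connection states for one port from netstat-style output on stdin.
# Hypothetical helper; not part of consul-template or netstat.
count_conn_states() {
    port="$1"
    awk -v port=":$port" '
        # Only consider lines that mention ":<port>" in an address column.
        index($0, port) {
            total++
            if ($0 ~ /ESTABLISHED/)  established++
            else if ($0 ~ /WAIT/)    waiting++
            else if ($0 ~ /LISTEN/)  listening++
        }
        END {
            printf "All connections: %d\n", total
            printf "ESTABLISHED: %d\n", established
            printf "WAIT: %d\n", waiting
            printf "LISTENING: %d\n", listening
        }
    '
}
# Usage (not run here): netstat -pan | count_conn_states 8500
```

Because all counts come from one netstat invocation, the total and the per-state numbers are consistent with each other.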
Now, after reloading consul-template (rather than restarting it), the number of established connections has nearly tripled:
reload() {
    # Read the PID from the pid file and ask consul-template to reload.
    pid=$(cat /data/consul_template/pid)
    kill -HUP "$pid"
}
$ systemctl reload rda.consul-template
$ echo -n "All connections: "; netstat -pan|grep 8500|wc -l; echo -n "ESTABLISHED: "; netstat -pan|grep 8500|grep -c ESTABLISHED; echo -n "WAIT: "; netstat -pan|grep 8500|grep -c WAIT; echo -n "LISTENING: "; netstat -pan|grep 8500|grep -c LISTEN;
All connections: 804
ESTABLISHED: 640
WAIT: 163
LISTENING: 1
After a while, the WAIT connections drop back to the previous level, but the established connections settle at close to double the original count:
$ echo -n "All connections: "; netstat -pan|grep 8500|wc -l; echo -n "ESTABLISHED: "; netstat -pan|grep 8500|grep -c ESTABLISHED; echo -n "WAIT: "; netstat -pan|grep 8500|grep -c WAIT; echo -n "LISTENING: "; netstat -pan|grep 8500|grep -c LISTEN;
All connections: 423
ESTABLISHED: 416
WAIT: 6
LISTENING: 1
Running reload one more time easily leaves another 200+ connections hung:
$ echo -n "All connections: "; netstat -pan|grep 8500|wc -l; echo -n "ESTABLISHED: "; netstat -pan|grep 8500|grep -c ESTABLISHED; echo -n "WAIT: "; netstat -pan|grep 8500|grep -c WAIT; echo -n "LISTENING: "; netstat -pan|grep 8500|grep -c LISTEN;
All connections: 1011
ESTABLISHED: 820
WAIT: 191
LISTENING: 1
$ echo -n "All connections: "; netstat -pan|grep 8500|wc -l; echo -n "ESTABLISHED: "; netstat -pan|grep 8500|grep -c ESTABLISHED; echo -n "WAIT: "; netstat -pan|grep 8500|grep -c WAIT; echo -n "LISTENING: "; netstat -pan|grep 8500|grep -c LISTEN;
All connections: 661
ESTABLISHED: 648
WAIT: 12
LISTENING: 1
ESTABLISHED connections after settling: 240 -> 416 -> 648. I believe the pattern of hung connections is clear now.
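To make the growth per reload explicit, the settled ESTABLISHED counts can be differenced. The helper name `growth_per_reload` is hypothetical; the three numbers are the ones measured above.

```shell
#!/bin/sh
# Print the increase between consecutive settled ESTABLISHED counts.
# Hypothetical helper; the inputs are the measurements from this comment.
growth_per_reload() {
    prev=""
    for n in "$@"; do
        if [ -n "$prev" ]; then
            echo "$prev -> $n: +$((n - prev))"
        fi
        prev="$n"
    done
}

growth_per_reload 240 416 648
# prints:
# 240 -> 416: +176
# 416 -> 648: +232
```

Each reload leaves roughly 180 to 230 extra established connections behind, which is in the same range as the 220 config files, although nothing here confirms a one-to-one relationship.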
Consul Template version
consul-template v0.19.4 (68b1da2)
Configuration
Command
Expected behavior
The number of sockets (file descriptors) used by the consul-template must remain the same after receiving SIGHUP.
Actual behavior
The number of sockets (file descriptors) used by the consul-template increases irreversibly after each SIGHUP, and sooner or later this exhausts all ephemeral ports on the local node.
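As a rough illustration of "sooner or later", the number of reloads before the ephemeral range fills up can be estimated. This is a back-of-the-envelope sketch: the function name is hypothetical, the 32768-60999 range is the usual Linux default (the actual range on a node is in `/proc/sys/net/ipv4/ip_local_port_range`), and the leak of 200 connections per reload and the 648 ports already in use are taken from the figures in the comment above rather than from the original report.

```shell
#!/bin/sh
# Estimate how many more reloads fit before the ephemeral port range is full.
# Hypothetical helper. Arguments: range low, range high, ports already in use,
# connections leaked per reload.
reloads_until_exhaustion() {
    low="$1"
    high="$2"
    in_use="$3"
    leak="$4"
    available=$((high - low + 1 - in_use))
    # Integer ceiling division: the reload that crosses the limit counts.
    echo $(( (available + leak - 1) / leak ))
}

# Assumed values: default Linux range, 648 ports in use, ~200 leaked per reload.
reloads_until_exhaustion 32768 60999 648 200
```

With these assumed inputs the estimate is 138 reloads. Other processes on the node also consume ephemeral ports, so the real limit would be reached earlier.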
Steps to reproduce
References