hashicorp / consul-template

Template rendering, notifier, and supervisor for @HashiCorp Consul and Vault data.
https://www.hashicorp.com/
Mozilla Public License 2.0
4.76k stars 781 forks source link

Unnecessary rendering all templates whem template.wait is set #1140

Open marqc opened 6 years ago

marqc commented 6 years ago

Hi, I encountered some bug/place for improvement on template wait timer. It is desired to prevent flooding changes to disk/notified program when they occur to often in consul.

Current behavior is that for template with wait defined it is added to quescienceMap, and with every tick this template is rendered in memory, event though nothing changes in consul.

There is a place to improve this (runner.go) to append new quescience to quescienceMap only on consul event/change received, and then remove entry from this map, when template gets rendered. Also Run method triggered from quescience tick should only generate tpls currently present in quescienceMap.

Consul Template version

0.19.4

Configuration

template {
  source = "/tpls/conf.tpl"
  destination = "/output-file.tpl"
  command = "/bin/sh -c '/usr/bin/killall -SIGHUP myapp'"
  backup = false
  wait {
    min = "2s"
    max = "10s"
  }
}
{{range services}}{{$name := .Name}}
{{- range service $name}}
{{.Address}} {{.Name}} {{.Name}}.{{ env "DOMAIN"}} {{.Address | replaceAll "." "-"}}.{{.Name}}  {{.Address | replaceAll "." "-"}}.{{.Name}}.{{ env "DOMAIN"}}
{{- end }}

Command

/usr/local/bin/consul-template -config=/ctconfig -consul-addr=$CONSUL_ADDRESS -log-level=debug

Debug output

...
2018/09/17 12:54:51.783305 [DEBUG] (runner) received template "5eca8f5577257878e8b533b6f7b6a5ca" from quiescence
2018/09/17 12:54:51.783343 [INFO] (runner) initiating run
2018/09/17 12:54:51.783358 [DEBUG] (runner) checking template 5eca8f5577257878e8b533b6f7b6a5ca
2018/09/17 12:54:51.801431 [DEBUG] (runner) rendering "/tpls/conf.tpl" => "/etc/dnsmasq.d/ns.default.conf"
2018/09/17 12:54:51.801543 [DEBUG] (runner) checking template 664e869a6fc27466226da186f9bfd4cf
2018/09/17 12:54:51.810525 [DEBUG] (runner) diffing and updating dependencies
2018/09/17 12:54:51.812603 [DEBUG] (runner) enabling template-specific quiescence for "5eca8f5577257878e8b533b6f7b6a5ca"
2018/09/17 12:54:51.812613 [WARN] (runner) watching 290 dependencies - watching this many dependencies could DDoS your consul cluster
2018/09/17 12:54:53.810727 [DEBUG] (runner) received template "664e869a6fc27466226da186f9bfd4cf" from quiescence
2018/09/17 12:54:53.810783 [INFO] (runner) initiating run
2018/09/17 12:54:53.810814 [DEBUG] (runner) checking template 5eca8f5577257878e8b533b6f7b6a5ca
2018/09/17 12:54:53.834651 [DEBUG] (runner) checking template 664e869a6fc27466226da186f9bfd4cf
2018/09/17 12:54:53.843978 [DEBUG] (runner) rendering "/tpls/clusterhosts.tpl" => "/etc/hostsdir/hosts.default"
2018/09/17 12:54:53.844074 [DEBUG] (runner) diffing and updating dependencies
2018/09/17 12:54:53.845802 [DEBUG] (runner) enabling template-specific quiescence for "664e869a6fc27466226da186f9bfd4cf"
2018/09/17 12:54:53.845809 [WARN] (runner) watching 290 dependencies - watching this many dependencies could DDoS your consul cluster
2018/09/17 12:54:55.834815 [DEBUG] (runner) received template "5eca8f5577257878e8b533b6f7b6a5ca" from quiescence
2018/09/17 12:54:55.834858 [INFO] (runner) initiating run
2018/09/17 12:54:55.834883 [DEBUG] (runner) checking template 5eca8f5577257878e8b533b6f7b6a5ca
2018/09/17 12:54:55.855534 [DEBUG] (runner) rendering "/tpls/conf.tpl" => "/etc/dnsmasq.d/ns.default.conf"
2018/09/17 12:54:55.855635 [DEBUG] (runner) checking template 664e869a6fc27466226da186f9bfd4cf
2018/09/17 12:54:55.868029 [DEBUG] (runner) diffing and updating dependencies
2018/09/17 12:54:55.873420 [DEBUG] (runner) enabling template-specific quiescence for "5eca8f5577257878e8b533b6f7b6a5ca"
2018/09/17 12:54:55.873428 [WARN] (runner) watching 290 dependencies - watching this many dependencies could DDoS your consul cluster

Expected behavior

Rendering with timer only happens since consul watch change detected until actual changes gets written to file. When nothing changes in cluster ticks do nothing and CPU is resting.

Actual behavior

Template is rendered (in memory) every 2 secs, even when nothing changes in consul. CPU is working.

Steps to reproduce

Everytime with above config.

References

sodabrew commented 6 years ago

I suspect this is related to the work in #988 that I tracked down as the source of issues in https://github.com/hashicorp/consul-template/issues/1043#issuecomment-435334094

skyrocknroll commented 5 years ago

@marqc Do you have working patch ?

pierresouchay commented 5 years ago

@skyrocknroll http://github.com/criteo/consul-templaterb/ has those features too (but working hopefully) :-)

pearkes commented 5 years ago

I talk a bit about this in #1174. I believe this is how the runner loop is currently designed -- fortunately it isn't writing to disk at this interval but it is causing a good amount of seemingly unnecessary work.

I believe there are good reasons it is set-up this way -- notably the way dependencies are resolved -- so I don't want to suggest this isn't the right trade-off currently.

I think as we continue to evaluate consul-template's design in the longer term we might want to optimize for not causing this. I'm also open to hearing more feedback or suggestions here if there is a better way!

sodabrew commented 5 years ago

@pearkes would it make sense to try to dig into this issue now, after you’ve landed 1174, or would you be able to cut a 0.19.6 release soon and leave this issue for a subsequent micro?

pearkes commented 5 years ago

Some good discussion in https://github.com/hashicorp/consul-template/issues/1195 which was a duplicate of this.

@sodabrew I think it'd be great to fix this but as I noted above I fear it not being straight-forward given how the system is currently set-up. It seems like we could definitely better utilize blocking queries to do this more efficiently.

eikenb commented 5 years ago

Note that this is also a duplicate of #998 which matches the behavior with the addition that in it they noticed due to timestamps being included which triggered a file write every time.

eikenb commented 5 years ago

References in other issues that might be of interest/relevant while working on this...

https://github.com/hashicorp/consul-template/issues/1043#issuecomment-435334094 https://github.com/hashicorp/consul-template/issues/1196#issuecomment-507227187

eikenb commented 5 years ago

I cannot reproduce this using the provided config/template. So far the only way I can replicate this is to use 2 templates like in #998 and #1195.

eikenb commented 5 years ago

Just noticed that the debug output shows 2 templates where the provided config only has 1. So the same issue, just the provided config is incomplete.

eikenb commented 5 years ago

After spending most of the day tracing through the runner code I'm almost convinced that this is going to have to wait for a later release after some refactoring to get the code into a state more amiable to the work.