hashicorp / consul-template

Template rendering, notifier, and supervisor for @HashiCorp Consul and Vault data.
https://www.hashicorp.com/
Mozilla Public License 2.0

CT Dedup Rendering Latency and Increased Raft Transactions. #1123

Open orarnon opened 6 years ago

orarnon commented 6 years ago

Consul Template version

0.19.5

Configuration

reload_signal = "SIGHUP"
kill_signal = "SIGTERM"
max_stale = "30s"
log_level = "info"
deduplicate {
  enabled = true
  prefix = "dedup/env/app"
}
{{$appenv := env "appenv" }}
environment = {{ env "appenv" }}
appname = rv-bidder
include "application"

{{range tree  (printf "consul-template/%s/apps/bidder/" $appenv)}}
{{.Key | replaceAll "/" "."}} = {{.Value}}{{end}}

{{range tree  (printf "consul-template/%s/apps/rv-bidder/" $appenv)}}
{{.Key | replaceAll "/" "."}} = {{.Value}}{{end}}

redis {
    advertisersWithExternalTracking {
        {{range ls  (printf "consul-template/%s/redis/advertisersWithExternalTracking/" $appenv)}}
        {{.Key}} = {{.Value}}{{end}}
        sentinelNodes = "{{range $i, $e := service  (printf "%s.redis-sentinel" $appenv) "any"}}{{if ne $i 0}},{{end}}{{$e.Node}}:{{$e.Port}}{{end}}"

Debug output

Provide a link to a GitHub Gist containing the complete debug output by running with -log-level=trace.

Expected behavior

Templates should be rendered when CT is loaded.

Actual behavior

We see that some config files are rendered only minutes after CT is spawned. There are no errors to indicate any issue. When the config file is rendered once without dedup, the file is created immediately. With dedup enabled, however, the file is not created and no error is reported. We have looked at the logs in trace mode and found nothing.

On one occasion, I deleted the key that holds the sessions for a specific service and restarted the CT service, which resulted in the config file being rendered properly.

Moreover, we have seen an increase in Raft transactions since we enabled dedup.

orarnon commented 6 years ago

Adding even more information: we see several loops of keys in "is still needed" state. For instance:

2018/08/20 14:00:48.771613 [DEBUG] (runner) kv.get(consul-template/prod_us/kafka/clusters/online/clusterTag) is still needed
2018/08/20 14:00:48.771639 [DEBUG] (runner) kv.block(consul-template/prod_us/kafka/clusters/offline/clusterTag) is still needed

All keys exist, but CT seems to have a problem rendering them while dedup is enabled. When dedup is disabled, the same CTMPL is rendered within a second.
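To see which dependencies keep coming back as "is still needed", the debug log can be tallied with a short script. This is a minimal sketch that assumes the log line format quoted above; the function name `count_still_needed` and the regular expression are illustrative and not part of consul-template.

```python
import re
from collections import Counter

# Matches runner debug lines in the format quoted in this issue, e.g.
# "... [DEBUG] (runner) kv.get(some/key) is still needed".
# The pattern is an assumption based only on those two sample lines.
STILL_NEEDED = re.compile(r"\(runner\) (?P<dep>.+) is still needed$")


def count_still_needed(lines):
    """Count how often each dependency is reported as still needed."""
    counts = Counter()
    for line in lines:
        match = STILL_NEEDED.search(line.strip())
        if match:
            counts[match.group("dep")] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "2018/08/20 14:00:48.771613 [DEBUG] (runner) "
        "kv.get(consul-template/prod_us/kafka/clusters/online/clusterTag) is still needed",
        "2018/08/20 14:00:48.771639 [DEBUG] (runner) "
        "kv.block(consul-template/prod_us/kafka/clusters/offline/clusterTag) is still needed",
    ]
    # Print the most frequently repeated dependencies first.
    for dep, n in count_still_needed(sample).most_common():
        print(n, dep)
```

Run against a full log, a dependency whose count keeps growing while the template stays unrendered would be a candidate for closer inspection.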

We hit these issues when there is a new CTMPL: all keys are rendered and a lock is acquired. As we understand it, a new CTMPL should behave as if there were no dedup (since the entire CTMPL is rendered from scratch) and then add a lock phase. This DOES NOT WORK.

orarnon commented 5 years ago

Bump.

eikenb commented 5 years ago

Hello @orarnon, thanks for filing the issue and sorry for the long delay in a response.

Has anything changed since you reported this?

orarnon commented 5 years ago

Hi @eikenb , we have dropped CT dedup since it just didn't work for us.

eikenb commented 5 years ago

Glad things are working for you without dedup. The problem probably is rooted somewhere in the manager/runner.go code as that is kind of a mess. I plan on re-working it and I'll keep an eye out for what might be causing this as I do that.

orarnon commented 5 years ago

Hi @eikenb, if you are re-working CT, I would suggest adding a whitelist of files that should use dedup. If you have several templates but only one of them benefits from dedup, a whitelist in the CT configuration would help in this situation.

eikenb commented 5 years ago

@orarnon Please file an issue with the feature request. It will get lost here and it sounds like a good idea on the face of it.

orarnon commented 5 years ago

Hi @eikenb, I saw no official way or template for a feature request. Should I just submit a simple issue?

eikenb commented 5 years ago

@orarnon Yes. Just a normal issue.

And I'll have to look through the contributing/template docs... seems like they should mention feature requests. I'll need to fix that if not. Thanks.

eikenb commented 5 years ago

Wait... is this the same whitelist feature you mentioned in https://github.com/hashicorp/consul-template/issues/1124? If so then no need for another one. Also, if so, is the whitelist the main part of that feature you want? There you mention both a whitelist and a blacklist.

orarnon commented 5 years ago

@eikenb yes it is, thanks! A blacklist and a whitelist are two approaches to the same problem.
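For illustration only, the whitelist/blacklist idea discussed in this thread could be expressed as a small selection rule. This is a sketch under stated assumptions: `should_dedup`, `allow`, and `deny` are hypothetical names, and nothing like them exists in consul-template's configuration.

```python
from fnmatch import fnmatch


def should_dedup(template_path, allow=None, deny=None):
    """Decide whether a template takes part in dedup.

    `allow` and `deny` are hypothetical lists of glob patterns; they are
    not real consul-template options.
    """
    # A deny (blacklist) match always excludes the template.
    if deny and any(fnmatch(template_path, pattern) for pattern in deny):
        return False
    # When an allow list (whitelist) is given, only matching templates qualify.
    if allow is not None:
        return any(fnmatch(template_path, pattern) for pattern in allow)
    # With neither list, every template is deduplicated.
    return True
```

With a whitelist, only the one template that benefits from dedup would be listed; with a blacklist, the templates that misbehave under dedup would be excluded instead.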