hashicorp / consul-template

Template rendering, notifier, and supervisor for @HashiCorp Consul and Vault data.
https://www.hashicorp.com/
Mozilla Public License 2.0

initial run returns empty data #2004

Open mendelskiv93 opened 3 weeks ago

mendelskiv93 commented 3 weeks ago

Version

v0.39.1

We are using consul-template in Exec mode:

ExecStart=/usr/local/bin/consul-template \
    -config "/var/lib/ssl-proxy-config/template.hcl" \
    -template "/var/lib/ssl-proxy-config/proxy.yaml.ctmpl:/var/lib/ssl-proxy-config/proxy.yaml:sudo nginx -s reload"

to generate Nginx configuration files dynamically based on service metadata stored in Consul. Our consul-template renders tagged services and generates upstream configurations.

Templating is triggered by changes in the Consul catalog. On each trigger, all ~200 available services are fetched and only the ~130 tagged ones are rendered.

For simplicity, I created a small environment with only 2 tagged services; here is the trace.log.

Problem

I found this bug while extending our template with a plugin that triggers a Python script to move the .conf files of decommissioned sites to a folder.

{{- plugin "/var/lib/ssl-proxy-config/archive.py" (printf "%q" $fqdnList) }}

Upon restarting the ssl-proxy service, or running consul-template for the first time, even through the CLI:

consul-template -log-level trace -config "/var/lib/ssl-proxy-config/template.hcl" -template "/var/lib/ssl-proxy-config/proxy.yaml.ctmpl:/var/lib/ssl-proxy-config/proxy.yaml:sudo nginx -s reload" -wait=5s:10s

the initial fetch cycle renders with empty data.
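
As a stopgap, the plugin script itself could refuse to act on an empty list. The following is only a minimal sketch, not the real archive.py: the helper names `parse_fqdns` and `should_archive` are hypothetical, and it assumes the argument arrives as Go's `%q` rendering of a list of strings (for example `["a.example.com" "b.example.com"]`, or `[]` when there is no data yet).

```python
import re

# Assumption: the plugin argument is Go's %q rendering of a list of strings,
# e.g. '["a.example.com" "b.example.com"]', or '[]' when no data is available.
_QUOTED = re.compile(r'"((?:[^"\\]|\\.)*)"')


def parse_fqdns(arg):
    """Return the non-empty quoted strings found in the plugin argument."""
    return [name for name in _QUOTED.findall(arg) if name.strip()]


def should_archive(arg):
    """Skip archiving when the list is empty, as it is on the initial run."""
    return len(parse_fqdns(arg)) > 0


print(should_archive('[]'))                                 # False
print(should_archive('["a.example.com" "b.example.com"]'))  # True
```

This only hides the symptom: a catalog that is genuinely empty would be skipped as well, so it does not explain why the first evaluation sees no data.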

Looking at the code:

    // Fire an initial run to parse all the templates and setup the first-pass
    // dependencies. This also forces any templates that have no dependencies to
    // be rendered immediately (since they are already renderable).
    log.Printf("[DEBUG] (runner) running initial templates")
    if err := r.Run(); err != nil {
        r.ErrCh <- err
        return
    }

and some other issues gave me the idea of a first-pass run, but I cannot figure out why this happens when there is already a successful response with all services:

2024-10-25T08:57:10.797Z [TRACE] catalog.services(@gc-us-central1-a): GET /v1/catalog/services?dc=gc-us-central1-a&stale=true&wait=1m0s
2024-10-25T08:57:10.804Z [TRACE] catalog.services(@do-ams3): returned 190 results
2024-10-25T08:57:10.804Z [TRACE] (view) catalog.services(@do-ams3) marking successful data response

before the first rendering:

2024-10-25T08:57:11.013Z [DEBUG] (runner) initiating run
2024-10-25T08:57:11.013Z [DEBUG] (runner) checking template 3700978ef4c53efbacef875a5b18923b
2024-10-25T08:57:11.063Z [DEBUG] (runner) rendering "/var/lib/ssl-proxy-config/proxy.yaml.ctmpl" => "/var/lib/ssl-proxy-config/proxy.yaml"
2024-10-25T08:57:11.070Z [INFO] (runner) rendered "/var/lib/ssl-proxy-config/proxy.yaml.ctmpl" => "/var/lib/ssl-proxy-config/proxy.yaml"
2024-10-25T08:57:11.071Z [DEBUG] (runner) appending command ["sudo nginx -s reload"] from "/var/lib/ssl-proxy-config/proxy.yaml.ctmpl" => "/var/lib/ssl-proxy-config/proxy.yaml"

Steps to reproduce

  1. Run consul-template with a template that fetches services from the Consul catalog:
    consul-template -log-level trace -config "/var/lib/ssl-proxy-config/template.hcl" -template "/var/lib/ssl-proxy-config/proxy.yaml.ctmpl:/var/lib/ssl-proxy-config/proxy.yaml:sudo nginx -s reload" -wait=5s:10s
  2. Write the output of every consul-template rendering to a new file, using either a Python script or the template itself
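
For step 2, a small helper along these lines would do. It is a sketch, not the script I actually run: `write_snapshot` is a hypothetical name, and the file name pattern is simply taken from the listing below.

```python
import tempfile
from datetime import datetime
from pathlib import Path


def write_snapshot(content, directory, now=None):
    """Write one rendering to its own timestamped file and return the path."""
    now = now or datetime.now()
    # Same naming scheme as the listing: fqdn_list_YYYYMMDD_HHMMSS.txt
    path = Path(directory) / f"fqdn_list_{now:%Y%m%d_%H%M%S}.txt"
    path.write_text(content)
    return path


# Demo in a throwaway directory, with a fixed timestamp for reproducibility.
with tempfile.TemporaryDirectory() as tmp:
    snapshot = write_snapshot("a.example.com\n", tmp, datetime(2024, 11, 1, 8, 24, 14))
    print(snapshot.name)  # fqdn_list_20241101_082414.txt
```

Because each evaluation gets its own file, an empty first evaluation shows up as a separate, tiny file instead of being overwritten by the next one.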

My results consistently show the same pattern: an initial empty file followed by a correct file rendered within the same run. After that, consul-template continues to run and render normally.

-rw-r--r-- 1 www-data adm    1 Nov  1 08:24 fqdn_list_20241101_082414.txt
-rw-r--r-- 1 www-data adm   39 Nov  1 08:24 fqdn_list_20241101_082415.txt
.
.
.
-rw-r--r-- 1 www-data adm   39 Nov  1 12:42 fqdn_list_20241101_124244.txt
-rw-r--r-- 1 www-data adm   39 Nov  1 12:43 fqdn_list_20241101_124335.txt

Also, using -wait seems to have no impact on the initial run.

mendelskiv93 commented 2 weeks ago

The problem with that approach is that Consul Template is a two-pass implementation. The first time, it reads the template to figure out what services/keys to watch; then it queries Consul. So having some kind of range/loop in the template is rather difficult with the current implementation. That's why it's suggested to have multiple Consul Template instances: one that queries the master services list and renders a template that is itself a ctmpl, and another Consul Template instance that consumes that rendered template.

So I guess the question is: why is anything returned from a first pass?
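
One possible reading, sketched as a toy model: if a side-effecting function such as a plugin runs during every evaluation of the template, it would also run during the pass in which dependencies are only being discovered and the data is still empty. The model below is not consul-template's code; `ToyRunner` and all of its methods are hypothetical names used only to illustrate that hypothesis.

```python
class ToyRunner:
    """Toy model of a multi-pass renderer; NOT consul-template's actual code."""

    def __init__(self, backend):
        self.backend = backend    # stands in for Consul: dependency name -> data
        self.cache = {}           # data fetched so far
        self.plugin_calls = []    # every argument the side-effecting "plugin" saw

    def services(self, missing):
        # Template function: return cached data, or an empty list while
        # recording that the dependency has not been fetched yet.
        key = "catalog.services"
        if key not in self.cache:
            missing.add(key)
            return []
        return self.cache[key]

    def plugin(self, fqdns):
        # Side effect that happens during evaluation, before the runner
        # knows whether this pass will be written out.
        self.plugin_calls.append(list(fqdns))
        return ""

    def evaluate(self):
        missing = set()
        fqdns = self.services(missing)
        output = "\n".join(fqdns) + self.plugin(fqdns)
        return output, missing

    def run(self):
        rendered = []
        while True:
            output, missing = self.evaluate()
            if not missing:
                rendered.append(output)  # only a complete pass is written out
                return rendered
            for key in missing:          # fetch what the pass asked for, retry
                self.cache[key] = self.backend[key]


runner = ToyRunner({"catalog.services": ["a.example.com", "b.example.com"]})
rendered = runner.run()
print(runner.plugin_calls)  # [[], ['a.example.com', 'b.example.com']]
print(rendered)             # ['a.example.com\nb.example.com']
```

In this model the output is rendered only once, with full data, but the plugin is called twice and sees an empty list the first time. That would match the pattern of one empty file followed by one correct file.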