Open jinksunk opened 5 years ago
Did this ever work on older versions of consul-template? Which ones? Try both versions 0.19.5 and 0.20.0 since they're the most recent releases.
Hey @jinksunk, thanks for taking the time to report this.
Without a way to reproduce it I'm not sure I'll be able to do much about this other than keep an eye out for it. I think the eventual solution will be adding a service monitoring for consul-template as is discussed in your referenced ticket (#570).
I have seen this too. Also with no deterministic way to reproduce. :-(
Thanks for chiming in @liebman, if you ever get more information on this please let us know.
I'll continue to keep an eye out for what might be causing this and will update here if I find anything.
Consul Template version
Config
Command
Debug output
We have not caught a similar circumstance with debug/trace logging turned on; here are the info logs showing that the regular (usually every 5 minutes) canary job stopped running, but after a key update, the other renderer worked as expected:
Expected behavior
What should have happened? In lieu of a health-check endpoint for the server, we set up a canary template that pulls a timestamp from Consul and renders it to be scraped by Prometheus, to ensure that the consul-template service is still running as expected:
We expected that if one renderer (i.e. the timestamp canary) faulted or stopped rendering, that would be an indication that the process as a whole had failed.
Actual behavior
What actually happened?
This weekend, the consul-template process remained up, and continued to render other templates (i.e. /opt/app/prometheus/var/prometheus-rules.yml) and produce log messages, but no longer ran the timestamp update template.
Steps to reproduce
While we have not yet found a deterministic process to reproduce this scenario, our concern is that our approach to monitoring the consul-template service is not as effective as we thought.
References
#570