During a Consul leadership change, GET /v1/catalog/nodes returned 0 nodes once, and that broke some of our templates.
We then managed to reproduce the behaviour, although not consistently. We repeatedly kill all the consul servers at random intervals, leaving 2 to 9 seconds between kills so a leader can be elected before the cluster is killed again.
This is the command we run on our 3 consul servers:
while true; do sudo pkill consul$; sleep $(( $RANDOM % 8 + 2 )); done
I suspect this is related to not using the consistent mode in the GET to the API (https://www.consul.io/api/index.html#consistent), but I think that mode is not supported in consul-template.
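For comparison, here is a minimal sketch of a consistent-mode read made directly against the HTTP API, bypassing consul-template. It assumes a local agent on the default HTTP port 8500. The helper names catalog_nodes_url and fetch_nodes are hypothetical and are not part of consul or consul-template. The consistency mode is selected by adding the consistent (or stale) query parameter to the request.

```shell
# Assumption: a Consul agent reachable at this address (default HTTP port 8500).
CONSUL_ADDR="${CONSUL_ADDR:-http://127.0.0.1:8500}"

# Hypothetical helper: build the catalog URL for a consistency mode.
# "consistent" and "stale" are appended as query parameters;
# anything else falls back to the default mode (no parameter).
catalog_nodes_url() {
  case "$1" in
    consistent|stale) printf '%s/v1/catalog/nodes?%s\n' "$CONSUL_ADDR" "$1" ;;
    *)                printf '%s/v1/catalog/nodes\n' "$CONSUL_ADDR" ;;
  esac
}

# Hypothetical helper: fetch the node list in the given mode.
# -f makes curl exit non-zero on HTTP errors, which helps tell
# "request failed" apart from "request succeeded with an empty list".
fetch_nodes() {
  curl -fsS "$(catalog_nodes_url "$1")"
}
```

Running fetch_nodes consistent and fetch_nodes default side by side while the kill loop is active might show whether the empty result only appears in the default mode. That would support the suspicion above, but it is not a confirmed diagnosis.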
Consul Template version
consul-template v0.19.4 (68b1da2)
Configuration
Command
Debug output
https://gist.github.com/sgirones/74eb8088e7a46833a9a3ed75295754f3
Expected behavior
GET /v1/catalog/nodes
should return 132 nodes
Actual behavior
GET /v1/catalog/nodes
returns 0 nodes
Steps to reproduce
Run the kill loop shown at the top of this report on all 3 consul servers while consul-template is running. The failure is intermittent, so it may take several leadership changes before the empty result appears.