hashicorp / terraform-aws-consul

A Terraform Module for how to run Consul on AWS using Terraform and Packer
Apache License 2.0

run-consul --version 1.7.1 fails to run due to default config #220

Open timalexander-inv opened 3 years ago

timalexander-inv commented 3 years ago

I am updating our base Packer images for a Vault/Consul implementation we have. We use this repo to install and run Consul and Vault during image creation, then hand off to Terraform to spin up the images.

Since our last image build back in January, Consul now fails to start because it rejects part of the new default config:

  "telemetry": {
    "disable_compat_1.9": true
  },
  "ui_config": {
    "enabled": $ui_config_enabled
  }
}

The command we run in our packer build is:

          "git clone --branch {{user `consul_module_version`}} https://github.com/hashicorp/terraform-aws-consul.git /tmp/terraform-aws-consul",
          "sudo /tmp/terraform-aws-consul/modules/install-consul/install-consul --version {{user `consul_version`}};"

consul_version is set to 1.7.1. The user-data.sh we pass to the ASG images then runs the following on boot:

sudo /opt/consul/bin/run-consul --client --cluster-tag-key "${consul_cluster_tag_key}" --cluster-tag-value "${consul_cluster_tag_value}"

The error we now see in the logs is:

Apr 28 11:40:35 ip-xxx-xxx-xxx-xxx user-data: 2021-04-28 11:40:35 [INFO] [run-consul] Creating default Consul configuration
Apr 28 11:40:35 ip-xxx-xxx-xxx-xxx user-data: 2021-04-28 11:40:35 [INFO] [run-consul] Installing Consul config file in /opt/consul/config/default.json
Apr 28 11:40:35 ip-xxx-xxx-xxx-xxx user-data: 2021-04-28 11:40:35 [INFO] [run-consul] Creating systemd config file to run Consul in /etc/systemd/system/consul.service
Apr 28 11:40:35 ip-xxx-xxx-xxx-xxx user-data: 2021-04-28 11:40:35 [INFO] [run-consul] Reloading systemd config and starting Consul
Apr 28 11:40:35 ip-xxx-xxx-xxx-xxx systemd: Reloading.
Apr 28 11:40:35 ip-xxx-xxx-xxx-xxx user-data: Created symlink from /etc/systemd/system/multi-user.target.wants/consul.service to /etc/systemd/system/consul.service.
Apr 28 11:40:35 ip-xxx-xxx-xxx-xxx systemd: Reloading.
Apr 28 11:40:35 ip-xxx-xxx-xxx-xxx systemd: Starting "HashiCorp Consul - A service mesh solution"...
Apr 28 11:40:40 ip-xxx-xxx-xxx-xxx consul: ==> Error parsing /opt/consul/config/default.json: 2 errors occurred:
Apr 28 11:40:40 ip-xxx-xxx-xxx-xxx consul: * invalid config key telemetry.disable_compat_1.9
Apr 28 11:40:40 ip-xxx-xxx-xxx-xxx consul: * invalid config key ui_config
Apr 28 11:40:40 ip-xxx-xxx-xxx-xxx user-data: Job for consul.service failed because the control process exited with error code. See "systemctl status consul.service" and "journalctl -xe" for details.
Apr 28 11:40:40 ip-xxx-xxx-xxx-xxx systemd: consul.service: main process exited, code=exited, status=1/FAILURE

If I remove this config, the run-consul command succeeds. I can find no reference to these keys in the 1.7.1 release notes, so I suspect this is new config for 1.9.x that is not backwards compatible?
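
One possible workaround is to gate the two rejected keys on the installed Consul version, so they are only written when the binary understands them. The snippet below is a minimal sketch, not the module's actual code: `version_at_least` and `extra_config_for` are hypothetical helper names, and the 1.9.0 cutoff is an assumption based on the key name and the errors above, not something confirmed in this thread.

```shell
#!/usr/bin/env bash

# Hypothetical helper: succeeds (exit 0) if dotted version $1 >= $2.
# Handles plain numeric versions such as 1.7.1 only (no -beta or +ent suffixes).
version_at_least() {
  local IFS=.
  local -a have=($1) want=($2)
  local i h w
  for i in 0 1 2; do
    h="${have[i]:-0}"
    w="${want[i]:-0}"
    # 10# forces base 10 so a component like "08" is not read as octal.
    if (( 10#$h > 10#$w )); then return 0; fi
    if (( 10#$h < 10#$w )); then return 1; fi
  done
  return 0
}

# Hypothetical helper: prints the config keys that older Consul rejects,
# or nothing at all when the version is below the assumed 1.9.0 cutoff.
extra_config_for() {
  local version="$1" ui_enabled="$2"
  if version_at_least "$version" "1.9.0"; then
    printf '"telemetry": {"disable_compat_1.9": true}, "ui_config": {"enabled": %s}\n' "$ui_enabled"
  fi
}
```

For example, `extra_config_for 1.7.1 true` prints nothing, while `extra_config_for 1.9.4 true` prints both blocks. The installed version would have to be parsed from the output of `consul version` before calling these helpers.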

brikis98 commented 3 years ago

Ah, you're right, this may indeed have been a backwards incompatible change in https://github.com/hashicorp/terraform-aws-consul/releases/tag/v0.8.5 that we missed. I updated the release notes with a warning.