hashicorp / consul-template

Template rendering, notifier, and supervisor for @HashiCorp Consul and Vault data.
https://www.hashicorp.com/
Mozilla Public License 2.0

Certificate Verification Cannot Be Disabled And Does Not Respect 'ca_cert' parameter. #965

Closed: mskeefe closed this issue 7 years ago

mskeefe commented 7 years ago

Consul Template version

consul-template v0.18.5 (9902dd5)

Configuration

...
  "consul": {
    "address": "consul:8600",

    "ssl": {
      "ca_cert" : "/opt/inst/dev/ssl/ca/rootCA.pem",
      "enabled": true,
      "verify": false
    },

    "retry": {
      "enabled": true,
      "attempts": 0,
      "backoff": "250ms",
      "max_backoff": "20s"
    }
  }
...

Debug output

2017/06/20 19:01:07.107223 [INFO] consul-template v0.18.5 (9902dd5)
2017/06/20 19:01:07.107234 [INFO] (runner) creating new runner (dry: false, once: false)
2017/06/20 19:01:07.107405 [DEBUG] (runner) final config: {"Consul":{"Address":"consul:8600","Auth":{"Enabled":false,"Username":"","Password":""},"Retry":{"Attempts":0,"Backoff":250000000,"MaxBackoff":20000000000,"Enabled":true},"SSL":{"CaCert":"/opt/inst/dev/ssl/ca/rootCA.pem","CaPath":"","Cert":"","Enabled":true,"Key":"","ServerName":"","Verify":true},"Token":"","Transport":{"DialKeepAlive":30000000000,"DialTimeout":30000000000,"DisableKeepAlives":false,"IdleConnTimeout":90000000000,"MaxIdleConns":100,"MaxIdleConnsPerHost":9,"TLSHandshakeTimeout":10000000000}},"Dedup":{"Enabled":true,"MaxStale":2000000000,"Prefix":"inst/dev/consul-template/dedup/nginx","TTL":15000000000},"Exec":{"Command":"","Enabled":false,"Env":{"Blacklist":[],"Custom":[],"Pristine":false,"Whitelist":[]},"KillSignal":2,"KillTimeout":30000000000,"ReloadSignal":null,"Splay":0,"Timeout":0},"KillSignal":2,"LogLevel":"trace","MaxStale":300000000000,"PidFile":"","ReloadSignal":1,"Syslog":{"Enabled":false,"Facility":"LOCAL0"},"Templates":[{"Backup":true,"Command":"nginx -s reload","CommandTimeout":30000000000,"Contents":"","Destination":"/etc/nginx/nginx.conf","Exec":{"Command":"nginx -s reload","Enabled":true,"Env":{"Blacklist":[],"Custom":[],"Pristine":false,"Whitelist":[]},"KillSignal":2,"KillTimeout":30000000000,"ReloadSignal":null,"Splay":0,"Timeout":30000000000},"Perms":420,"Source":"/opt/inst/consul-template/templates/nginx.ctmpl","Wait":{"Enabled":false,"Min":0,"Max":0},"LeftDelim":"","RightDelim":""}],"Vault":{"Address":"https://vault:8200","Enabled":true,"RenewToken":true,"Retry":{"Attempts":0,"Backoff":250000000,"MaxBackoff":20000000000,"Enabled":true},"SSL":{"CaCert":"/opt/inst/dev/ssl/ca/rootCA.pem","CaPath":"","Cert":"/opt/inst/vault/auth/client.crt","Enabled":true,"Key":"/opt/inst/vault/auth/client.key","ServerName":"","Verify":true},"Transport":{"DialKeepAlive":30000000000,"DialTimeout":30000000000,"DisableKeepAlives":false,"IdleConnTimeout":90000000000,"MaxIdleConns":100,"MaxIdleConnsPerHost":9,"TLSHandshakeTimeout":10000000000},"UnwrapToken":false},"Wait":{"Enabled":true,"Min":5000000000,"Max":10000000000}}
2017/06/20 19:01:07.107920 [INFO] (runner) creating watcher
2017/06/20 19:01:07.107995 [INFO] (runner) starting
2017/06/20 19:01:07.108004 [INFO] (dedup) starting de-duplication manager
2017/06/20 19:01:07.108013 [DEBUG] (runner) running initial templates
2017/06/20 19:01:07.108035 [INFO] (runner) initiating run
2017/06/20 19:01:07.108048 [DEBUG] (runner) checking template 2791a40288346db584903ad8f5c7456a
2017/06/20 19:01:07.108051 [INFO] (dedup) attempting to create session
2017/06/20 19:01:07.108074 [INFO] (dedup) starting watch for template hash 2791a40288346db584903ad8f5c7456a
2017/06/20 19:01:07.108086 [INFO] (dedup) listing data for template hash 2791a40288346db584903ad8f5c7456a
2017/06/20 19:01:07.108544 [INFO] (dedup) starting watch for template hash 0ba3032c39782b08814514e18a0786b7
2017/06/20 19:01:07.108552 [INFO] (dedup) listing data for template hash 0ba3032c39782b08814514e18a0786b7
2017/06/20 19:01:07.108849 [DEBUG] (runner) was not watching 10 dependencies
2017/06/20 19:01:07.108859 [DEBUG] (runner) checking template 0ba3032c39782b08814514e18a0786b7
2017/06/20 19:01:07.109013 [DEBUG] (runner) was not watching 1 dependencies
2017/06/20 19:01:07.109021 [DEBUG] (runner) diffing and updating dependencies
2017/06/20 19:01:07.109026 [DEBUG] (runner) enabling global quiescence for "2791a40288346db584903ad8f5c7456a"
2017/06/20 19:01:07.109030 [DEBUG] (runner) enabling global quiescence for "0ba3032c39782b08814514e18a0786b7"
2017/06/20 19:01:07.109034 [DEBUG] (runner) watching 0 dependencies
2017/06/20 19:01:07.121057 [ERR] (dedup) failed to get 'inst/dev/consul-template/dedup/nginx/2791a40288346db584903ad8f5c7456a/data': Get https://consul:8600/v1/kv/inst/dev/consul-template/dedup/nginx/2791a40288346db584903ad8f5c7456a/data?stale=&wait=60000ms: x509: certificate signed by unknown authority

Expected behavior

I'm upgrading from consul-template v0.15.0 to v0.18.5. Previously, I had used the "ca_cert" option to allow the use of a private CA. This worked perfectly. I expect this behavior to continue.

Actual behavior

After migrating my configuration to the new format, it appears that the "ca_cert" option is ignored. Even setting verify=false does not work. As you can see in the logging above, dedup fails with x509: certificate signed by unknown authority. If deduplication is disabled, I receive "unknown authority" errors elsewhere. If I take this same CA certificate and install it into the OS' trusted set of CAs, everything works.

Steps to reproduce

  1. Use consul-template v0.18.5 with SSL enabled for Consul using a self-signed certificate
  2. Try to use either the "ca_cert" option or disable certificate verification
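
For reference, the behavior the report expects from the three "ssl" keys can be sketched in Go as follows. This is a minimal illustration, not consul-template's actual code: the sslSettings struct and buildTLSConfig function are hypothetical names, and the mapping (verify=false sets InsecureSkipVerify, ca_cert is a path to a PEM file loaded into RootCAs) is an assumption based on the configuration shown above.

```go
package main

import (
	"crypto/tls"
	"crypto/x509"
	"fmt"
	"os"
)

// sslSettings mirrors the three "ssl" keys from the configuration above.
// The struct and its field names are hypothetical, for illustration only.
type sslSettings struct {
	Enabled bool
	Verify  bool
	CACert  string // path to a PEM-encoded CA certificate; may be empty
}

// buildTLSConfig translates the settings into a *tls.Config.
// It returns nil when SSL is disabled.
func buildTLSConfig(s sslSettings) (*tls.Config, error) {
	if !s.Enabled {
		return nil, nil
	}
	cfg := &tls.Config{
		// verify=false is expected to skip chain and hostname checks.
		InsecureSkipVerify: !s.Verify,
	}
	if s.CACert != "" {
		pem, err := os.ReadFile(s.CACert)
		if err != nil {
			return nil, fmt.Errorf("loading CA file: %w", err)
		}
		pool := x509.NewCertPool()
		if !pool.AppendCertsFromPEM(pem) {
			return nil, fmt.Errorf("no PEM certificates found in %s", s.CACert)
		}
		// Trust only this pool instead of the operating system's roots.
		cfg.RootCAs = pool
	}
	return cfg, nil
}

func main() {
	cfg, err := buildTLSConfig(sslSettings{Enabled: true, Verify: false})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("InsecureSkipVerify:", cfg.InsecureSkipVerify)
}
```

If either field were dropped before the HTTP client is built, the client would fall back to the system trust store, which would match the "unknown authority" error in the log above.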

sethvargo commented 7 years ago

Hi @mskeefe

Thank you for opening an issue. Should your certificate verify successfully? It looks like there might be a bug with disabling verification, but we have a ton of test coverage around that. Could you dump your environment and grep for consul (env | grep -i consul)?

mskeefe commented 7 years ago

Hi @sethvargo

Tried dumping the process but nothing happened. I also tried explicitly setting "dump_signal": "SIGQUIT" but that triggered an error:

* '' has invalid keys: dump_signal

Likewise, I didn't see a command line option for setting the dump signal.

The certificate does work. If I use curl with --cacert <path to my cert> I am able to connect to Consul. Without that option, as expected, I get errors about the certificate issuer not being recognized.

env | grep -i consul  (slightly modified)
CONSUL_DATACENTER=inst_datacenter
CONSUL_NETWORK_PREFIX=173.28.3
MCPI_DOMAIN=consul
CONSUL_ACL_MASTER_TOKEN=XXXXXXXXXXXXXXXXXXXXXXXXXX
CONSUL_SERVER_NODES=consul
CONSUL_ACL_DATACENTER=inst_datacenter
PWD=/opt/inst/consul-template
CONSUL_IP=173.28.3.99
CONSUL_ADDR=https://consul:8600
CONSUL_TEMPLATE_HOME=/opt/inst/consul-template

sethvargo commented 7 years ago

Hi @mskeefe

Thank you for your response. I'm unable to reproduce the config verify bit not parsing correctly:

{
  "consul": {
    "address": "consul:8600",

    "ssl": {
      "ca_cert" : "/opt/inst/dev/ssl/ca/rootCA.pem",
      "enabled": true,
      "verify": false
    },

    "retry": {
      "enabled": true,
      "attempts": 0,
      "backoff": "250ms",
      "max_backoff": "20s"
    }
  }
}

$ consul-template -config=config.json -log-level=debug
2017/06/26 21:51:19.082336 [INFO] (runner) creating new runner (dry: false, once: false)
2017/06/26 21:51:19.082681 [DEBUG] (runner) final config: {"Consul":{"Address":"consul:8600","Auth":{"Enabled":false,"Username":"","Password":""},"Retry":{"Attempts":0,"Backoff":250000000,"MaxBackoff":20000000000,"Enabled":true},"SSL":{"CaCert":"/opt/inst/dev/ssl/ca/rootCA.pem","CaPath":"","Cert":"","Enabled":true,"Key":"","ServerName":"","Verify":false},"Token":"","Transport":{"DialKeepAlive":30000000000,"DialTimeout":30000000000,"DisableKeepAlives":false,"IdleConnTimeout":90000000000,"MaxIdleConns":100,"MaxIdleConnsPerHost":8,"TLSHandshakeTimeout":10000000000}},"Dedup":{"Enabled":false,"MaxStale":2000000000,"Prefix":"consul-template/dedup/","TTL":15000000000},"Exec":{"Command":"","Enabled":false,"Env":{"Blacklist":[],"Custom":[],"Pristine":false,"Whitelist":[]},"KillSignal":2,"KillTimeout":30000000000,"ReloadSignal":null,"Splay":0,"Timeout":0},"KillSignal":2,"LogLevel":"debug","MaxStale":2000000000,"PidFile":"","ReloadSignal":1,"Syslog":{"Enabled":false,"Facility":"LOCAL0"},"Templates":[],"Vault":{"Address":"http://127.0.0.1:8200","Enabled":true,"Grace":15000000000,"RenewToken":true,"Retry":{"Attempts":12,"Backoff":250000000,"MaxBackoff":60000000000,"Enabled":true},"SSL":{"CaCert":"","CaPath":"","Cert":"","Enabled":true,"Key":"","ServerName":"","Verify":true},"Transport":{"DialKeepAlive":30000000000,"DialTimeout":30000000000,"DisableKeepAlives":false,"IdleConnTimeout":90000000000,"MaxIdleConns":100,"MaxIdleConnsPerHost":8,"TLSHandshakeTimeout":10000000000},"UnwrapToken":false},"Wait":{"Enabled":false,"Min":0,"Max":0}}
2017/06/26 21:51:19.082770 [ERR] (cli) runner: runner: client set: consul configuring TLS failed: Error loading CA File: open /opt/inst/dev/ssl/ca/rootCA.pem: no such file or directory

Notice there in the output that the verify value is coming through correctly, unlike in your output where it remains true. Is it possible that you have more than one configuration file and CT is picking up options from a different file?
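
One way to compare the two runs without reading the full JSON by eye is to decode the "final config" line and print only the Consul SSL fields. The following Go sketch assumes the log line format shown above; the finalConfig type and parseFinalConfig function are hypothetical helpers, not part of consul-template.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// finalConfig picks out only the fields of interest from the logged JSON.
// All other keys in the log line are ignored by the decoder.
type finalConfig struct {
	Consul struct {
		Address string
		SSL     struct {
			CaCert  string
			Enabled bool
			Verify  bool
		}
	}
}

// parseFinalConfig extracts the JSON object that follows "final config: "
// in a debug log line and decodes the Consul SSL settings from it.
func parseFinalConfig(line string) (finalConfig, error) {
	const marker = "final config: "
	var fc finalConfig
	i := strings.Index(line, marker)
	if i < 0 {
		return fc, fmt.Errorf("marker %q not found", marker)
	}
	err := json.Unmarshal([]byte(line[i+len(marker):]), &fc)
	return fc, err
}

func main() {
	// Shortened stand-in for a real log line; only the relevant keys are kept.
	line := `2017/06/20 19:01:07.107405 [DEBUG] (runner) final config: {"Consul":{"Address":"consul:8600","SSL":{"CaCert":"/opt/inst/dev/ssl/ca/rootCA.pem","Enabled":true,"Verify":true}}}`
	fc, err := parseFinalConfig(line)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("address=%s verify=%v ca_cert=%s\n",
		fc.Consul.Address, fc.Consul.SSL.Verify, fc.Consul.SSL.CaCert)
}
```

A Verify value that differs from the one in the config file on disk would suggest that another file or a default is overriding it.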

I'm not sure how ca_cert worked for you in the past, since ca_cert is the raw contents of the certificate, not the path on disk. ca_path is the path on disk to the CA certificate.

Through my investigation, however, I did identify a bug in the underlying Consul API client where the provided TLS config was being overwritten. Sorry about that. I've updated to the latest version of the library, where this issue is not present (the update is now on master).

If you have more information, I would love to get to the bottom of this issue.

mskeefe commented 7 years ago

From the documentation, it looks like ca_cert is the path to a file:

https://github.com/hashicorp/consul-template

    # This is the path to the certificate authority to use as a CA. This is
    # useful for self-signed certificates or for organizations using their own
    # internal certificate authority.
    ca_cert = "/path/to/ca"

Did this change recently?

sethvargo commented 7 years ago

Hmm.... maybe you're correct. Consul's API client library documentation is a bit misleading, but tracing the code, it should work. Let me play around with this a bit more.

jasonarewhy commented 7 years ago

I am having a similar problem with 0.18.5:

2017/06/28 00:51:55.236030 [WARN] (clients) disabling consul SSL verification
2017/06/28 00:51:55.236054 [INFO] (runner) creating watcher
2017/06/28 00:51:55.236299 [INFO] (runner) starting
2017/06/28 00:51:55.236342 [INFO] (runner) initiating run
2017/06/28 00:51:55.495971 [WARN] (view) health.service(consul@us-east-1|passing): Get https://consul/v1/health/service/consul?dc=us-east-1&passing=1&stale=&wait=60000ms: x509: certificate signed by unknown authority (retry attempt 1 after "250ms")

Current config looks like:

  ssl {
    enabled = true
    verify = false
    ca_cert = "/valid/path/to/ca.cert"
  }

I've tried both verify = false and true - interestingly even when set to false, I'm getting a certificate error.

curl -k "https://consul/v1/health/service/consul?dc=us-east-1&passing=1&stale=&wait=60000ms"

completes successfully, as does

curl --cacert /valid/path/to/ca.cert "https://consul/v1/health/service/consul?dc=us-east-1&passing=1&stale=&wait=60000ms"

so I know the cert should validate.
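
The same three-way comparison (system roots only, custom CA, verification disabled) can be reproduced in Go without a real Consul server. This sketch uses the standard library's httptest TLS server, whose certificate is not in any system trust store, as a stand-in for a Consul endpoint signed by a private CA. The get and probe function names are hypothetical.

```go
package main

import (
	"crypto/tls"
	"crypto/x509"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// get issues one GET request with the given TLS settings and returns any error.
func get(url string, tlsCfg *tls.Config) error {
	client := &http.Client{Transport: &http.Transport{TLSClientConfig: tlsCfg}}
	resp, err := client.Get(url)
	if err != nil {
		return err
	}
	return resp.Body.Close()
}

// probe starts a throwaway HTTPS server and reports the outcome of three
// client configurations against it.
func probe() (defaultErr, customCAErr, skipVerifyErr error) {
	srv := httptest.NewTLSServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintln(w, "ok")
	}))
	defer srv.Close()

	// Trust exactly the server's own certificate, like curl --cacert.
	pool := x509.NewCertPool()
	pool.AddCert(srv.Certificate())

	defaultErr = get(srv.URL, &tls.Config{})                             // system roots only
	customCAErr = get(srv.URL, &tls.Config{RootCAs: pool})               // like ca_cert
	skipVerifyErr = get(srv.URL, &tls.Config{InsecureSkipVerify: true})  // like verify = false
	return
}

func main() {
	d, c, s := probe()
	fmt.Println("system roots only:", d)
	fmt.Println("custom CA pool:   ", c)
	fmt.Println("verification off: ", s)
}
```

Only the first request should fail with an x509 error. A client that fails in the second or third case, as reported here, is most likely not receiving the TLS settings it was configured with.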

sethvargo commented 7 years ago

Hi @mskeefe and @jasonarewhy

I just verified that updating the underlying Consul library fixes this issue. I've tried using a custom CA as well as disabling verification and both options work great! This will be fixed in the next release. Sorry about that!

mskeefe commented 7 years ago

@sethvargo Thanks!

sethvargo commented 7 years ago

I released 0.19.0. Can you try it out and verify please? I'm committed to getting this working properly 😄