hashicorp / vault

A tool for secrets management, encryption as a service, and privileged access management
https://www.vaultproject.io/

vault agent should be more verbose about tls_server_name errors #14202

Open grahamc opened 2 years ago

grahamc commented 2 years ago

Describe the bug

If I configure a Vault Agent to connect to a server but provide an invalid tls_server_name, the agent fails in a hard-to-diagnose way.

To reproduce

With the following configuration:

vault {
    address = "https://example.com"
    ca_cert = "./cacert.pem"
}

we get a reasonable error:

==> Vault agent started! Log data will stream in below:

==> Vault agent configuration:

                     Cgo: enabled
               Log Level: info
                 Version: Vault v1.9.3
             Version Sha: v1.9.3

2022-02-22T15:21:33.844-0500 [INFO]  template.server: starting template server
2022-02-22T15:21:33.844-0500 [INFO] (runner) creating new runner (dry: false, once: false)
2022-02-22T15:21:33.844-0500 [INFO]  auth.handler: starting auth handler
2022-02-22T15:21:33.844-0500 [INFO]  auth.handler: authenticating
2022-02-22T15:21:33.844-0500 [INFO]  sink.server: starting sink server
2022-02-22T15:21:33.844-0500 [INFO] (runner) creating watcher
2022-02-22T15:21:33.957-0500 [ERROR] auth.handler: error authenticating: error="Put \"https://example.com/v1/auth/approle/login\": x509: certificate signed by unknown authority" backoff=1s

However, we get no diagnostics if we add a mismatched tls_server_name directive. For example, with the following configuration:

vault {
    address = "https://example.com"
    ca_cert = "./cacert.pem"
    tls_server_name = "totally-bogus.com"
}

we get:

$ vault agent -config=./agent-config.hcl -log-level=debug
==> Vault agent started! Log data will stream in below:

==> Vault agent configuration:

                     Cgo: enabled
               Log Level: debug
                 Version: Vault v1.9.3
             Version Sha: v1.9.3

2022-02-22T15:23:09.879-0500 [INFO]  auth.handler: starting auth handler
2022-02-22T15:23:09.879-0500 [INFO]  auth.handler: authenticating
2022-02-22T15:23:09.879-0500 [INFO]  template.server: starting template server
2022-02-22T15:23:09.879-0500 [INFO]  sink.server: starting sink server
2022-02-22T15:23:09.879-0500 [INFO] (runner) creating new runner (dry: false, once: false)
2022-02-22T15:23:09.879-0500 [DEBUG] (runner) final config: {"Consul":{"Address":"","Namespace":"","Auth":{"Enabled":false,"Username":"","Password":""},"Retry":{"Attempts":12,"Backoff":250000000,"MaxBackoff":60000000000,"Enabled":true},"SSL":{"CaCert":"","CaPath":"","Cert":"","Enabled":false,"Key":"","ServerName":"","Verify":true},"Token":"","Transport":{"CustomDialer":null,"DialKeepAlive":30000000000,"DialTimeout":30000000000,"DisableKeepAlives":false,"IdleConnTimeout":90000000000,"MaxIdleConns":100,"MaxIdleConnsPerHost":33,"TLSHandshakeTimeout":10000000000}},"Dedup":{"Enabled":false,"MaxStale":2000000000,"Prefix":"consul-template/dedup/","TTL":15000000000,"BlockQueryWaitTime":60000000000},"DefaultDelims":{"Left":null,"Right":null},"Exec":{"Command":"","Enabled":false,"Env":{"Denylist":[],"Custom":[],"Pristine":false,"Allowlist":[]},"KillSignal":2,"KillTimeout":30000000000,"ReloadSignal":null,"Splay":0,"Timeout":0},"KillSignal":2,"LogLevel":"DEBUG","MaxStale":2000000000,"PidFile":"","ReloadSignal":1,"Syslog":{"Enabled":false,"Facility":"LOCAL0","Name":"consul-template"},"Templates":[{"Backup":false,"Command":"","CommandTimeout":30000000000,"Contents":"","CreateDestDirs":true,"Destination":"./example.output","ErrMissingKey":false,"Exec":{"Command":"","Enabled":false,"Env":{"Denylist":[],"Custom":[],"Pristine":false,"Allowlist":[]},"KillSignal":2,"KillTimeout":30000000000,"ReloadSignal":null,"Splay":0,"Timeout":30000000000},"Perms":0,"Source":"./example.ctmpl","Wait":{"Enabled":false,"Min":0,"Max":0},"LeftDelim":"","RightDelim":"","FunctionDenylist":[],"SandboxPath":""}],"Vault":{"Address":"https://example.com","Enabled":true,"Namespace":"","RenewToken":false,"Retry":{"Attempts":12,"Backoff":250000000,"MaxBackoff":60000000000,"Enabled":true},"SSL":{"CaCert":"./cacert.pem","CaPath":"","Cert":"","Enabled":true,"Key":"","ServerName":"totally-bogus.com","Verify":true},"Transport":{"CustomDialer":null,"DialKeepAlive":30000000000,"DialTimeout":30000000000,"DisableKeepAlives":false,"IdleConnTimeout":90000000000,"MaxIdleConns":100,"MaxIdleConnsPerHost":33,"TLSHandshakeTimeout":10000000000},"UnwrapToken":false,"DefaultLeaseDuration":300000000000},"Wait":{"Enabled":false,"Min":0,"Max":0},"Once":false,"BlockQueryWaitTime":60000000000}
2022-02-22T15:23:09.880-0500 [INFO] (runner) creating watcher
^C==> Vault agent shutdown triggered
2022-02-22T15:23:49.604-0500 [INFO]  sink.server: sink server stopped
2022-02-22T15:23:49.604-0500 [INFO]  sinks finished, exiting
2022-02-22T15:23:49.604-0500 [INFO] (runner) stopping
2022-02-22T15:23:49.604-0500 [DEBUG] (runner) stopping watcher
2022-02-22T15:23:49.604-0500 [DEBUG] (watcher) stopping all views
2022-02-22T15:23:49.604-0500 [INFO]  template.server: template server stopped
2022-02-22T15:24:09.880-0500 [ERROR] auth.handler: error authenticating: error="context deadline exceeded" backoff=1s
2022-02-22T15:24:09.880-0500 [INFO]  auth.handler: auth handler stopped

Some notes:

  1. It never prints anything about the TLS server name mismatch.
  2. It never says where it is connecting, except in the giant config dump. This matters because I came across this issue due to $VAULT_ADDR (silently) overriding the configuration file.
  3. After the Ctrl-C, Vault does not exit until the pending authentication attempt fails with "context deadline exceeded", which is about 20s later in the log above.
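
To illustrate note 2, here is a rough Go sketch of logging the effective address together with where it came from. The helper resolveAddress is hypothetical, and the precedence it encodes (environment over configuration file) is only the behaviour reported in this issue, not a copy of Vault's logic:

```go
package main

import (
	"fmt"
	"os"
)

// resolveAddress returns the address the agent would use and a description
// of its origin. Assumption: a non-empty VAULT_ADDR wins over the file.
func resolveAddress(configAddr string, lookupEnv func(string) (string, bool)) (addr, source string) {
	if v, ok := lookupEnv("VAULT_ADDR"); ok && v != "" {
		return v, "VAULT_ADDR environment variable"
	}
	return configAddr, "configuration file"
}

func main() {
	addr, source := resolveAddress("https://example.com", os.LookupEnv)
	// A single line like this at startup would make a silent override visible.
	fmt.Printf("connecting to %s (from %s)\n", addr, source)
}
```

Passing the lookup function as a parameter keeps the helper easy to exercise without touching the real process environment.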

Expected behavior

Some log messages like:

2022-02-22T15:21:33.957-0500 [ERROR] auth.handler: error authenticating: error="TLS Negotiation with https://example.com failed: the remote's server name 'example.com' does not match the configured tls_server_name 'totally-bogus.com'." backoff=1s

Environment:

Vault server configuration file(s):

vault {
    address = "https://example.com"
    ca_cert = "./cacert.pem"
    tls_server_name = "totally-bogus.com"
}

auto_auth {
    method {
        type = "approle"
        config = {
            role_id_file_path = "role_id"
            secret_id_file_path = "secret_id"
            remove_secret_id_file_after_reading = false
        }
    }
}

template_config {
    error_on_missing_key = true
}

template {
    contents = "hi"
    destination = "./example.output"
}

Additional context

n/a

ryowright commented 2 years ago

Currently taking a look at this issue.

ryowright commented 2 years ago

I believe I managed to fix this so I will open a PR.
