hashicorp / vault

A tool for secrets management, encryption as a service, and privileged access management
https://www.vaultproject.io/

TLS Handshake Issues when using Consul and Vault with Github Auth #700

Closed brockoffdev closed 9 years ago

brockoffdev commented 9 years ago

So, I have a config in which Consul and Vault communicate with one another over TLS. I set up a Certificate Authority for our app, which Vault uses for the handshake between it and Consul. Vault itself also has TLS enabled on incoming requests. This system works flawlessly for token and TLS based authentication. However, for the ease of our developers, I would love to get Vault authentication working with Github Auth as well.

My problem is it appears that the TLS handshake is not actually happening between Vault and Consul when I utilize the Github Auth method...

==> WARNING: VAULT_TOKEN environment variable set!

  The environment variable takes precedence over the value
  set by the auth command. Either update the value of the
  environment variable or unset it to use the new token.

Error making API request.

URL: PUT http://vault.hi:8200/v1/auth/github/login
Code: 500. Errors:

* Get https://api.github.com/user: x509: certificate is valid for *.github.com, github.com, not consul.hi

...while Vault should be performing the handshake, it appears that Github may be trying to access Consul to verify information? This is problematic, as Consul is set to verify an x509 certificate generated by my own CA. As such, Github is not able to retrieve the information I assume it needs (regarding "org"?) to verify the authentication.

Any ideas from the community?

My Consul Config and Vault Config Below:

_VAULT:_

backend "consul" {
  address = "consul.hi"
  path = "vault"
  scheme = "https"
  datacenter = "test-hi"
  tls_ca_file = "/vault/ssl/test.crt"
  tls_cert_file= "/vault/ssl/vault.hi.crt"
  tls_key_file = "/vault/ssl/vault.hi.key"
}

listener "tcp" {
  address = "vault.hi:8200"
  tls_cert_file = "/vault/ssl/vault.hi.crt"
  tls_key_file = "/vault/ssl/vault.hi.key"
  tls_min_version = "tls12"
}

_CONSUL:_

{
  "bootstrap_expect": 1,
  "datacenter": "test-hi",
  "data_dir": "/consul",
  "ui_dir": "/consul/dist",
  "log_level": "INFO",
  "node_name": "test",
  "server": true,
  "verify_outgoing": true,
  "verify_incoming": true,
  "ca_file": "/consul/ssl/test.crt",
  "cert_file": "/consul/ssl/consul.hi.crt",
  "key_file": "/consul/ssl/consul.hi.key",
  "client_addr": "10.0.0.1",
  "ports": {
    "https": 443
  },
  "enable_syslog": true
}

jefferai commented 9 years ago

Hi Bryant,

Unfortunately I haven't come up with any great ideas since your posting on the mailing list. I guess your debugging didn't turn up anything useful?

Can you try running curl -v https://api.github.com/user from both your client and each of the Vault machine(s)? I get output like:

*   Trying 192.30.252.126...
* Connected to api.github.com (192.30.252.126) port 443 (#0)
* found 187 certificates in /etc/ssl/certs/ca-certificates.crt
* found 758 certificates in /etc/ssl/certs
* ALPN, offering http/1.1
* SSL connection using TLS1.2 / ECDHE_RSA_AES_128_GCM_SHA256
*        server certificate verification OK
*        server certificate status verification SKIPPED
*        common name: *.github.com (matched)
*        server certificate expiration date OK
*        server certificate activation date OK
*        certificate public key: RSA
*        certificate version: #3
*        subject: C=US,ST=California,L=San Francisco,O=GitHub\, Inc.,CN=*.github.com
*        start date: Tue, 08 Apr 2014 00:00:00 GMT
*        expire date: Wed, 12 Apr 2017 12:00:00 GMT
*        issuer: C=US,O=DigiCert Inc,OU=www.digicert.com,CN=DigiCert SHA2 High Assurance Server CA
*        compression: NULL
* ALPN, server accepted to use http/1.1
> GET /user HTTP/1.1
> Host: api.github.com
> User-Agent: curl/7.43.0
> Accept: */*
> 
< HTTP/1.1 401 Unauthorized
< Server: GitHub.com
< Date: Thu, 15 Oct 2015 15:56:44 GMT
< Content-Type: application/json; charset=utf-8
< Content-Length: 101
< Status: 401 Unauthorized
< X-RateLimit-Limit: 60
< X-RateLimit-Remaining: 58
< X-RateLimit-Reset: 1444928201
< X-GitHub-Media-Type: github.v3
< X-XSS-Protection: 1; mode=block
< X-Frame-Options: deny
< Content-Security-Policy: default-src 'none'
< Access-Control-Allow-Credentials: true
< Access-Control-Expose-Headers: ETag, Link, X-GitHub-OTP, X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Reset, X-OAuth-Scopes, X-Accepted-OAuth-Scopes, X-Poll-Interval
< Access-Control-Allow-Origin: *
< X-GitHub-Request-Id: 47E8152A:65C9:62FB192:561FCCBC
< Strict-Transport-Security: max-age=31536000; includeSubdomains; preload
< X-Content-Type-Options: nosniff
< 
{
  "message": "Requires authentication",
  "documentation_url": "https://developer.github.com/v3"
}
* Connection #0 to host api.github.com left intact

I want to make sure that there isn't some kind of strange DNS mapping somewhere.

brockoffdev commented 9 years ago

Hey Jeff-

Thanks for the help. Nope, nothing abnormal DNS-wise. This is what I'm seeing on Ubuntu...

* Hostname was NOT found in DNS cache
*   Trying 192.30.252.125...
* Connected to api.github.com (192.30.252.125) port 443 (#0)
* successfully set certificate verify locations:
*   CAfile: none
  CApath: /etc/ssl/certs
* SSLv3, TLS handshake, Client hello (1):
* SSLv3, TLS handshake, Server hello (2):
* SSLv3, TLS handshake, CERT (11):
* SSLv3, TLS handshake, Server key exchange (12):
* SSLv3, TLS handshake, Server finished (14):
* SSLv3, TLS handshake, Client key exchange (16):
* SSLv3, TLS change cipher, Client hello (1):
* SSLv3, TLS handshake, Finished (20):
* SSLv3, TLS change cipher, Client hello (1):
* SSLv3, TLS handshake, Finished (20):
* SSL connection using ECDHE-RSA-AES128-GCM-SHA256
* Server certificate:
*    subject: C=US; ST=California; L=San Francisco; O=GitHub, Inc.; CN=*.github.com
*    start date: 2014-04-08 00:00:00 GMT
*    expire date: 2017-04-12 12:00:00 GMT
*    subjectAltName: api.github.com matched
*    issuer: C=US; O=DigiCert Inc; OU=www.digicert.com; CN=DigiCert SHA2 High Assurance Server CA
*    SSL certificate verify ok.
> GET /user HTTP/1.1
> User-Agent: curl/7.35.0
> Host: api.github.com
> Accept: */*
>
< HTTP/1.1 401 Unauthorized
* Server GitHub.com is not blacklisted
< Server: GitHub.com
< Date: Thu, 15 Oct 2015 16:42:08 GMT
< Content-Type: application/json; charset=utf-8
< Content-Length: 101
< Status: 401 Unauthorized
< X-RateLimit-Limit: 60
< X-RateLimit-Remaining: 59
< X-RateLimit-Reset: 1444930928
< X-GitHub-Media-Type: github.v3
< X-XSS-Protection: 1; mode=block
< X-Frame-Options: deny
< Content-Security-Policy: default-src 'none'
< Access-Control-Allow-Credentials: true
< Access-Control-Expose-Headers: ETag, Link, X-GitHub-OTP, X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Reset, X-OAuth-Scopes, X-Accepted-OAuth-Scopes, X-Poll-Interval
< Access-Control-Allow-Origin: *
< X-GitHub-Request-Id: 36AF2A4D:AB2D:869CC44:561FD760
< Strict-Transport-Security: max-age=31536000; includeSubdomains; preload
< X-Content-Type-Options: nosniff
<
{
  "message": "Requires authentication",
  "documentation_url": "https://developer.github.com/v3"
}
* Connection #0 to host api.github.com left intact

brockoffdev commented 9 years ago

@jefferai sorry for the repost, btw, it's just an unfortunate blocker. I can move to gossip based encryption on the consul node(s), but obviously that wouldn't be a long term solution.

My debugging didn't turn up much at all, actually. Essentially, from what I have been able to learn, it appears that Vault is simply having Github attempt to connect directly to Consul to perform a write.

Interestingly enough, this same thing occurs when you attempt to perform a READ operation from AWS. Example:

vault read aws/creds/iam
Error reading aws/creds/iam: Error making API request.

URL: GET https://vault.hi:8200/v1/aws/creds/iam
Code: 400. Errors:

* Error creating IAM user: RequestError: send request failed
caused by: Post https://iam.amazonaws.com/: x509: certificate is valid for iam.amazonaws.com, not consul.hi

So unfortunately this bug is not limited to the Github Auth plugin.

This handshake error does make sense, if it is working the way I believe: Vault receives request --> AWS is contacted --> AWS tries to verify AWS credentials (likely) from Consul --> ERROR: NO TLS CERT.

So my assumption is that verify_incoming is causing the issue. If that is indeed the case, what really needs to happen is that Vault retrieves these values itself and returns them to whatever API endpoint it is attempting to hit, rather than redirecting the request to Consul, and then completes any write operations to Consul itself.

I could be wrong about this, but this is my working theory as to what is going wrong currently.

Thoughts?

EDIT: I should have mentioned that when using plain http as the scheme for the Consul connection, it works perfectly fine.

jefferai commented 9 years ago

I don't have any thoughts yet, but I'll dig into this more. Nothing should be redirecting any request to Consul, because no values stored in Consul are cleartext...there would be no way for any verification or lookup to succeed (that's kind of the point! :-) )
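
For context on the error text itself: an "x509: certificate is valid for ..., not ..." message is produced by the TLS client when the certificate it received does not match the host name it was told to expect, not by the remote service. The sketch below illustrates that check using Go's standard library. The throwaway self-signed certificate and the helper name selfSignedCert are assumptions made for illustration only; they are not taken from Vault, Consul, or GitHub.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/x509"
	"fmt"
	"math/big"
	"time"
)

// selfSignedCert (hypothetical helper) builds a throwaway certificate that is
// valid for the given DNS names, so hostname matching can be tried offline.
func selfSignedCert(dnsNames []string) (*x509.Certificate, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, err
	}
	tmpl := &x509.Certificate{
		SerialNumber: big.NewInt(1),
		NotBefore:    time.Now().Add(-time.Hour),
		NotAfter:     time.Now().Add(time.Hour),
		DNSNames:     dnsNames,
	}
	// Self-signed: the template acts as both the certificate and its parent.
	der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
	if err != nil {
		return nil, err
	}
	return x509.ParseCertificate(der)
}

func main() {
	// Same DNS names as listed in the error message from the report.
	cert, err := selfSignedCert([]string{"*.github.com", "github.com"})
	if err != nil {
		panic(err)
	}
	// Checking against the host that was actually dialed.
	fmt.Println(cert.VerifyHostname("api.github.com"))
	// Checking against a name that belongs to a different service.
	fmt.Println(cert.VerifyHostname("consul.hi"))
}
```

The first check should pass through the wildcard entry, while the second should fail with a hostname mismatch of the same shape as the error in the report. That would point at the client side expecting the wrong name rather than at GitHub or AWS contacting Consul.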

I was going to ask about whether this failed for you when using an unencrypted Consul connection. Knowing that scenario works helps nail it down.

BTW, are you running a local Consul agent? If so, any reason you're connecting to a Consul server instead of simply connecting to the local agent and letting it select the right server?

jefferai commented 9 years ago

@brockoffdev I had a brainwave and found something that would affect only Consul, AWS, and GitHub -- looks quite promising. I will likely push this change to master anyways once I'm done with it and after running unit tests. I could send you an updated binary to try, or you could build it yourself at that point.

brockoffdev commented 9 years ago

@jefferai thanks so much for helping out with this! I'll check out the changes in the PR.

As an update, your idea to run a consul agent directly on the Vault box did work with the local loopback. I'm curious how your update would work with my previous build; I'll give it a shot at some point in the next 24 hours.

Thanks so much again.

jefferai commented 9 years ago

Basically, when using TLS the Consul API library makes some changes to the HTTP client it uses. Unfortunately, so does the GitHub library, and the AWS library...and it turns out they were all using the same client: the http package's DefaultClient...something that I'm starting to feel is one of the worst mistakes in the entire Go standard library. It's a global variable and it's really easy not to know what other code may be modifying it. I've dealt with two race conditions in two separate programs caused by this, and it is easy for it not to cause problems (especially ones that are understandable for unit tests) until it causes really weird problems.

I'm pretty sure this is the fix... let me know how the PR works for you!

brockoffdev commented 9 years ago

Hey @jefferai my sincerest apologies for not getting back to you sooner. Crazy past few days, so I hadn't had the chance to test out the new binary for #702.

That being said, I can confirm that the binary does seem to have fixed the issue! I disabled my local consul agent, and once again directed tls traffic from the vault server to a dedicated consul server, and both AWS and Github handshakes were appropriately handled. Great work! :+1:

jefferai commented 9 years ago

Nice! Glad to hear it. I'm in the process of going through HC's other projects and tearing out usage of http.DefaultClient there and in their dependencies as well, to make sure this doesn't bite anyone else.

Thanks for all the help reporting and debugging!

brockoffdev commented 9 years ago

@jefferai no problem, thanks so much for taking it on!