Closed wojtekmach closed 6 months ago
Pinging @voltone, any guidance would be very appreciated :)
The standard verify_fun
callback selected by verify_peer
is not intended to handle self-signed peer certificates. Normally to use a private PKI you'd generate a self-signed CA and issue server certificates from there (directly or through intermediate CAs). Such setups can be supported by simply passing the custom root CA to the cacertfile
or cacerts
option.
The assumption presumably is that for peer-to-peer connections you'd use something like PSK or PBKDF cipher suites, rather than certificates.
It is possible to handle self-signed peer certificates by passing a custom verify_fun
, e.g.:
:ssl.connect(
'localhost',
4433,
cacerts: [],
verify_fun: {fn
cert, {:bad_cert, :selfsigned_peer}, state ->
# Decide if `cert` is the expected certificate!
{:valid, state}
_, {:extension, _}, state ->
{:unknown, state}
end, nil}
)
Couple of things to note:
verify: :verify_peer
is not needed: it would just select the default verify_fun
, which we are overriding anywaycacerts
is empty, because there is no trusted CA (you could set it to the self-signed peer's certificate, but it would not be used)verify_fun
you need to decide if this is really the expected certificate before returning :valid
. This can be done by extracting the public key from the server's cert and the local copy and comparing them, or by comparing the entire certificate (e.g. in DER format).The AWS Aurora issue is probably due to a wildcard certificate. If you are ok with supporting only OTP 21.0 or later you can enable wildcard certificate matching by passing the following ssl option:
customize_hostname_check: [match_fun: :public_key.pkix_verify_hostname_match_fun(:https)]
Properly supporting wildcard certs with older OTP versions is very hard without using the ssl_verify_fun dependency.
I think it's reasonable to add ssl_verify_fun as optional dependency, if SSL is being used by myxql explicitly (e.g. ssl: true
)
However.. what about CA? There are certifi and castore. I'm not sure it's common for mysql server to use certificates signed by trusted CA.
If most mysql client library/tools honor system CA (which usually includes trusted CA) - then it may make sense to use those trust CA by default
@voltone and @chulkilee sorry for no-followups, thank you for your feedback.
Given we don't necessarily want to require OTP 21+ yet or add ssl_verify_fun, we decided to warn when users leave :ssl_opts
unset and point to a basic SSL guide which basically points to ssl_verify_fun: https://github.com/elixir-ecto/myxql/pull/82. Perhaps when we do require OTP 21 we'll revisit this, but I feel like given folks' server installations may differ quite a bit it might be hard to come up with reasonable defaults. What do you think? Any feedback on the PR would be appreciated, thank you again for your help with this.
FYI, the solution proposed by @voltone which make use of the following line does not work on AWS Aurora:
customize_hostname_check: [match_fun: :public_key.pkix_verify_hostname_match_fun(:https)]
So it seem like the certificate being a wildcard is not the only validation error that pops (maybe the order at which the certificates are delivered in the TLS certificate exchange?). Luckily, like @wojtekmach mentioned, ssl_verify_fun works.
I have done this testing using Postgres and not Mysql but I assume it makes no matter in our context.
So it seem like the certificate being a wildcard is not the only validation error that pops (maybe the order at which the certificates are delivered in the TLS certificate exchange?)
The other thing you'll likely have to set with AWS is the :depth
parameter: the default value of 1 is unlikely to be sufficient for the certificates AWS use on their RDS instances. You may want to try with depth: 3
.
Does an error get logged when the connection fails?
When trying with:
ssl_opts = [
verify: :verify_peer,
cacertfile: "/path/to/rds-ca-2019-root.pem",
customize_hostname_check: [match_fun: :public_key.pkix_verify_hostname_match_fun(:https)],
depth: 3
]
{:ok, pid} = Postgrex.start_link(hostname: "database-2-instance-1.cwlgt8axnrkn.eu-west-1.rds.amazonaws.com", username: "postgres", password: "XXXXX", database: "postgres", ssl: true, ssl_opts: ssl_opts)
I get the following:
[info] TLS :client: In state :certify at ssl_handshake.erl:1768 generated CLIENT ALERT: Fatal - Handshake Failure
- {:bad_cert, :hostname_check_failed}
Here are the 2 certificates delivered by the server during the handshake, as seen by wireshark:
The CA cert file comes from https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/UsingWithRDS.SSL.html
In Wireshark, the commonName
part of the server certificate's Subject appears to be truncated. Can you tell if this is just a presentation issue, or if the certificate Subject really does specify a CN of database-2-instance-1.cwlgt8axnrkn.eu-west-1.r
?
The maximum length allowed for the CN component is 64, so there should be enough room for the full hostname, but if the value is indeed truncated it might explain your issue. I'd have to dig a little deeper once I have the time...
This is just a presentation issue. In Wireshark, when that node is expanded, I can see that the Subject is the full hostname. I guess the issue is a bit more complex. Any any case, thanks for your help.
@voltone We've recently shipped being able to connect from any MySQL to PlanetScale (a cloud hosted Vitess / MySQL solution) and one of the first questions we got was Elixir support :smile:.
I tried to set it up, but I had to do a lot of complicated things sadly and the discussion in this issue is touching on a lot of these points I think.
One of the main issues is that I could not get it working even with the above explanation and customize_hostname_check
on a recent OTP, but I think that I have found the cause of that issue.
hostname = "uxht6i2xcb2t.us-east-1.psdb.cloud"
{:ok, pid} = MyXQL.start_link(username: "zqz7nlrz80im",
database: "test",
hostname: hostname,
password: "pscale_pw_X",
port: 3306,
ssl: true,
ssl_opts: [
verify: :verify_peer,
cacertfile: CAStore.file_path(),
server_name_indication: String.to_charlist(hostname),
customize_hostname_check: [
match_fun: :public_key.pkix_verify_hostname_match_fun(:https)
]
]
)
I used CAStore
so I didn't have to provide a per system path. Each system has different default root paths. It's also why we had to document https://docs.planetscale.com/reference/secure-connections#ca-root-configuration and I've also submitted to other drivers like MySQL itself to use system roots by default (see https://github.com/mysql/mysql-server/pull/358). Removing the need to provide a cacertfile
and to default to system roots would greatly simplify setup.
The key bit though here, is this one:
server_name_indication: String.to_charlist(hostname),
The issue is that with MySQL an existing socket is upgraded to a TLS socket. This means it doesn't follow the regular path to create a TLS connection, where the hostname is provided in the connect call. Because the socket is already established at this point, it does not know about the original hostname that it was connecting with.
This means that the public_key.pkix_verify_hostname_match_fun
callback gets the IP address as the Host
and it verifies against the DNS SANs on the certificate and it fails. By providing a value for server_name_indication
, the hostname validation will use that hostname instead and the connection works. It also needs a String.to_charlist
because it otherwise complains that the option is invalid for a regular string.
@chaodhib I suspect the above fix will also solve this for AWS Aurora or for any database that provides a valid signed certificate with a hostname (I know Azure MySQL does as well for example).
All in, this is quite complex compared to other drivers, so in the context of this issue title, I was wondering if you'd be willing to consider the following:
cacertfile
that points at the current system roots. This way it doesn't need to provided and there's no CAStore
dependency or manual configuration of a path that's different on different operating systems.server_name_indication
is set when the socket is upgraded to TLS so that it will be used for verification. :public_key.pkix_verify_hostname_match_fun(:https)
as the default hostname check function if the option is not specified in another way? Is requiring OTP 21+ an option at this point? Or would it be possible to only set this default on OTP 21+? Use a default cacertfile that points at the current system roots. This way it doesn't need to provided and there's no CAStore dependency or manual configuration of a path that's different on different operating systems.
Unfortunately no one has, at least until now, managed to write a reliable and portable function for selecting the system CA trust store. It is not trivial to make this work across various Linux distributions, to say nothing of Mac / Windows. This is why most BEAM applications bring their own trust store in the form of a package (castore
or certifi
). I personally prefer to use the system store, by injecting a platform-specific path at startup, e.g. through an environment variable.
Unfortunately no one has, at least until now, managed to write a reliable and portable function for selecting the system CA trust store. It is not trivial to make this work across various Linux distributions
I think in practice this is definitely doable, but I agree that it doesn't really belong in a driver. Take for example the approach from Go (see https://github.com/golang/go/blob/master/src/crypto/x509/root_linux.go) which has a list of filenames that are checked. It also works for Mac which exposes a single path. Windows is indeed a whole different game. But all this would ideally be something handled in Elixir / Erlang where it has to be written only once and it can be reused.
For me the far more surprising bug here has been that server_name_indication: String.to_charlist(hostname)
and that the original provided hostname is not used for the certificate validation. That case I think would classify as a bug and not only an optimization to improve the user experience?
To Postgrex users, I had to add the same magic sauce too.
config :my_app, MyApp.Repo,
ssl: true,
ssl_opts: [
verify: :verify_peer,
cacerts: [database_ca_cert],
server_name_indication: '#{database_hostname}',
customize_hostname_check: [
match_fun: :public_key.pkix_verify_hostname_match_fun(:https)
]
]
ssl_opts: [ verify: :verify_peer, server_name_indication: '#{database_hostname}', customize_hostname_check: [ match_fun: :public_key.pkix_verify_hostname_match_fun(:https) ] ]
@voltone cacerts/cacertfile aside for a second, do you think we can ship with these defaults? I'm fine with setting these only on OTP 21+ (or just requiring that release)
In terms of cacerts/cacertfile, the landscape changed since OTP 25 ships with os-provided store. So we could use that or at the very least nudge people towards it (like Mint does https://github.com/elixir-mint/mint/pull/354). To provide best user experience, I'd be OK with having an optional dependency on CAStore too, just like Mint does.
@dbussink and @jayjun and others, thank you so much for looking into this.
@voltone cacerts/cacertfile aside for a second, do you think we can ship with these defaults? I'm fine with setting these only on OTP 21+ (or just requiring that release)
Until recently the default value for depth
was 1. This may not be sufficient for e.g. AWS RDS instances. The new default is 10, maybe it's worth including depth: 10
to make sure this value is also used with older OTP versions.
Some older OTP versions might have weak cipher suites enabled by default, but I think that's going to be harder to fix here. I suppose we should just encourage people to use a recent OTP versions if possible.
@voltone great, thank you!
@dbussink RE:
hostname = "uxht6i2xcb2t.us-east-1.psdb.cloud"
{:ok, pid} = MyXQL.start_link(username: "zqz7nlrz80im",
database: "test",
hostname: hostname,
password: "pscale_pw_X",
port: 3306,
ssl: true,
ssl_opts: [
verify: :verify_peer,
cacertfile: CAStore.file_path(),
server_name_indication: String.to_charlist(hostname),
customize_hostname_check: [
match_fun: :public_key.pkix_verify_hostname_match_fun(:https)
]
]
)
fyi, with latest version, MyXQL v0.7.0, it is now:
{:ok, pid} = MyXQL.start_link(
username: "zqz7nlrz80im",
database: "test",
hostname: "uxht6i2xcb2t.us-east-1.psdb.cloud",
password: "pscale_pw_X",
ssl: [:public_key.cacerts_get()]
)
If you have Elixir docs somewhere it'd be nice to update them and if they are in a public repo, I'm happy to send a PR, lmk! (And sorry for the wait. 😅 )
We should ship with secure by default SSL configuration [1]. Currently we use OTP defaults.
I couldn't get this reliably to work, here's what I tried when setting
verify: :verify_peer
.First of all, self-signed certificates automatically created by MySQL doesn't seem to work with verification:
(https://dev.mysql.com/doc/refman/5.7/en/using-encrypted-connections.html)
I tried manually creating self-signed certs using [2] with different Common Name but no luck too. Same error:
Generating valid self-signed certificates is crucial to be able to run this on CI.
I then tried running tests against AWS Aurora (Provisioned, MySQL 5.7).
First, got protocol version error:
That instance only supports TLSv1 so we force that versions:
Finally, with ssl_verify_fun I was able to get this to work:
However, I'd prefer not to add a dependency.
[1] https://blog.voltone.net/post/23 [2] https://dev.mysql.com/doc/refman/8.0/en/creating-ssl-files-using-openssl.html