a-robinson opened this issue 6 years ago
After some discussion, @bobvawter and I have decided to go forward with using wildcard certs inside Kubernetes as described above. I don't believe this opens up any new attack vectors given that you need Kubernetes RBAC permissions to read the Kubernetes secrets in the relevant namespace, and if you have that then you can already grab the root client certificate and access all the data in the database anyway.
We're going to move forward with changing the config files and instructions as long as no one can think of any big concerns. cc @mberhault @bdarnell as the people most likely to know of concerns.
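For reference, a sketch of how the wildcard node cert could be created under this scheme, using the standard `cockroach cert create-node` subcommand (the SANs are the ones named in this issue; the `--certs-dir` and `--ca-key` paths are placeholders, and this is an ops fragment that assumes an existing CA key, not a tested recipe):

```shell
# Sign one node cert covering every pod in the StatefulSet via wildcard SANs.
# Paths below are placeholders for wherever your certs and CA key live.
cockroach cert create-node \
  "*.cockroachdb" \
  "*.cockroachdb.default.svc.cluster.local" \
  localhost 127.0.0.1 \
  --certs-dir=certs \
  --ca-key=my-safe-directory/ca.key
```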
Seems fine to me. Do we need both `*.cockroachdb` and `*.cockroachdb.default.svc.cluster.local`? The fully-qualified name seems safer, since theoretically `cockroachdb` could be a TLD (or even an unqualified name on the local network) and resolve in unexpected ways, although I can't think of any specific issues.
I can double check. I think it will work, but my concern is that attempts by a node to connect to itself at its own hostname won't work, because the pods are given the hostname `cockroachdb-x.cockroachdb`, but their fully-qualified hostname (i.e. `hostname -f`) is `cockroachdb-x.cockroachdb.default.svc.cluster.local`. They're told to talk to each other at the full version of the name, but our internal use of `os.Hostname` to get our own address may cause problems.

We do provide an appropriate `--advertise-addr`, but seeing attempts to connect to `cockroachdb-0.cockroachdb` in the logs from https://forum.cockroachlabs.com/t/secure-cockroachdb-cluster-on-aws-eks/1824 has me worried. It's possible that's from user error, though.
We have marked this issue as stale because it has been inactive for 18 months. If this issue is still relevant, removing the stale label or adding a comment will keep it active. Otherwise, we'll close it in 5 days to keep the issue queue tidy. Thank you for your contribution to CockroachDB!
...still would like to see a better process for certs
We have marked this issue as stale because it has been inactive for 18 months. If this issue is still relevant, removing the stale label or adding a comment will keep it active. Otherwise, we'll close it in 10 days to keep the issue queue tidy. Thank you for your contribution to CockroachDB!
> a comment will keep it active
One of the most common bits of feedback we get about our secure kubernetes instructions is that the certificate approval process is annoyingly manual. Automatically approving all certificates is a risky practice, though, so we should find some middle ground that improves usability without compromising security.
@mberhault suggested that we may be able to use wildcard certificates: we would sign a wildcard certificate once for all nodes, and any nodes that come up during initial bootstrap or from later scaling could all use the same node cert. For example, we'd sign the cert to be valid for `*.cockroachdb` and `*.cockroachdb.default.svc.cluster.local` for a cockroachdb cluster running in the `default` namespace. You'd still have to approve the one node cert and the root client cert, but that'd be it unless you wanted to create additional user certs.

@nstewart, @kannanlakshmi, and others have suggested adding automation that would auto-approve the certs for the initial pods in the statefulset, since we know how many there will be. This could potentially just be a little script users could run that waits for the expected certificates to appear and approves them once they do.
Jira issue: CRDB-5759