Open knz opened 4 years ago
Related: cockroachdb/cockroach#49918
cc @thtruo @aaron-crl for tracking
Is the intent here to move all connections to the crdb process to be secure? And by secure, I mean secure in the way that cockroachdb expects? We manage TLS as well as cert authN/authZ outside of crdb as a separate process on the database instances, and thus run crdb in insecure mode because it's fronted by the other process. The reason for this is that our authN/authZ workflow is slightly non-standard and the effort to integrate that into crdb (as far as we knew a year ago) seemed daunting.
Perhaps I can take another look at our golang support for our authN/authZ as well as the cockroach code to see if the integration could work via some extension points/plugins.
Hi Jason, thank you for the feedback.
Once the --insecure flag is gone, you will be able to achieve the same as your current setup with the following configuration: trust the login principal without additional authentication (since you authenticate externally). You would then narrow down the trust rule to trust only the address of your external authentication service (and not arbitrary connections).
Meanwhile, we encourage you to keep node-to-node connections secure using TLS; if you prefer not to set up TLS certs manually, we'll have a new experience (see cockroachdb/cockroach#51991) that will simplify the setup.
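To make the narrowing of the trust rule concrete, here is a minimal Python sketch of first-match rule evaluation in the style of pg_hba.conf. It is a toy model, not CockroachDB's implementation: the HbaRule type, the first_matching_method helper, the proxy address 10.0.0.5 and the default-reject behavior are all assumptions made for illustration.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class HbaRule:
    """One simplified rule (hypothetical type, for illustration only)."""
    conn_type: str  # e.g. "host"
    user: str       # "all" matches any user
    network: str    # client network in CIDR notation
    method: str     # e.g. "trust", "password", "reject"


def first_matching_method(rules, user, client_ip):
    """Return the auth method of the first rule matching user and address."""
    addr = ipaddress.ip_address(client_ip)
    for rule in rules:
        user_ok = rule.user in ("all", user)
        if user_ok and addr in ipaddress.ip_network(rule.network):
            return rule.method
    # Assumption of this sketch: a connection matching no rule is refused.
    return "reject"


# Hypothetical address of the external authentication service.
AUTH_PROXY = "10.0.0.5/32"

rules = [
    # Trust only connections that come from the external auth service.
    HbaRule("host", "all", AUTH_PROXY, "trust"),
    # Everything else is refused rather than trusted.
    HbaRule("host", "all", "0.0.0.0/0", "reject"),
]

print(first_matching_method(rules, "app", "10.0.0.5"))
print(first_matching_method(rules, "app", "192.168.1.9"))
```

The point of the ordering is that the trust rule is scoped to a single address, so a client that bypasses the external service falls through to the reject rule instead of being trusted.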
Does that sound good to you?
@jasobrown a question for you though: does your setup use a single SQL account for all operations, or do you use separate SQL accounts with different privileges? Can you outline a little bit what your authorization looks like?
edit: discussed this with jason offline
The word "insecure" gives the wrong impression that CockroachDB is not a secure database.
Citation needed. Are people really getting confused by this? I think the real issue with --insecure mode is that because of the lack of grpc authentication, it's more insecure than people expect.
Cluster is secure in all cases.
Security is not binary. The question is always "secure against what threats?" Some of these cases are secure against threats that others are not, so it's misleading to say that they're all equally secure (of course they may be used in systems that are secure against those threats if security is provided at other layers)
And that's really what we're talking about here - traditionally we've just had all-or-nothing, with the no-protection-at-all --insecure mode or complex manually-managed certificates. Here we're adding multiple in-between modes. And then, because some of these intermediate modes will be strictly more secure than --insecure but no more difficult to use, we'll eventually eliminate --insecure mode. I think leading with the removal of --insecure is causing some surprise among users (although more than I expected - I thought we had discouraged this mode enough that no one would be using it in production).
That flowchart scares me - there's a lot of decision points there. A lot of them are new. The goal of cockroachdb/cockroach#51991 and eliminating insecure mode is to reduce the number of options that most users experience, by steering nearly all users to the "internally-managed TLS" path. I think we need to reconsider the amount of flexibility we're providing here, and what decisions the user really needs to make.
For example, I don't think we ever want to give the "non-TLS SQL" option this much emphasis. CockroachCloud will always need to use TLS, so we must have a good user experience for TLS (perhaps by working with driver implementations on things like cockroachdb/cockroach#32932).
Existing clusters previously run with "insecure mode"
Are there enough of these that we need to provide a no-downtime upgrade path? This seems like it could add quite a bit of complexity (can you upgrade from any security mode to any other, or only certain ones?)
We know from experience that customers who advocate for "insecure mode" really only care about point 3: they want TLS-less SQL connections
This is definitely not true. There have been some recent examples of this but I think the most common reason people resist setting their clusters up securely is about all the TLS setup, especially for node-to-node.
Possibly solve cockroachdb/cockroach#54007 and/or advance cockroachdb/cockroach#51991 to enable secure RPC without manual TLS configs
This seems like a requirement, not just a "possibly".
The word "insecure" gives the wrong impression that CockroachDB is not a secure database.
Citation needed.
This came up internally multiple times.
Are people really getting confused by this? I think the real issue with --insecure mode is that because of the lack of grpc authentication, it's more insecure than people expect.
This is pointed out prominently at the top of the issue description. But I'll take your point, as well as this one:
because some of these intermediate modes will be strictly more secure than --insecure but no more difficult to use, we'll eventually eliminate --insecure mode. I think leading with the removal of --insecure is causing some surprise among users.
I have adjusted the title of the issue to emphasize the new things / improvements and pulled that notion as first sentences/paragraphs in the issue description.
Cluster is secure in all cases.
Security is not binary. The question is always "secure against what threats?"
You and I know that security is not binary, but the point here is that all the proposed options keep all internal security controls active, which --insecure=true does not.
Anyway, you are right that we need to talk about threats. I am adding a table in the issue description instead of the flowchart. Bram helped me understand that a list of threats is easier to use than a flowchart anyway.
Some of these cases are secure against threats that others are not, so it's misleading to say that they're all equally secure
"say they're all equally secure" - that's a strawman. Nobody wrote this anywhere.
That flowchart scares me - there's a lot of decision points there. [...] I think we need to reconsider the amount of flexibility we're providing here, and what decisions the user really needs to make.
Point granted. I've reduced the scope accordingly.
For example, I don't think we ever want to give the "non-TLS SQL" option this much emphasis. CockroachCloud will always need to use TLS, so we must have a good user experience for TLS (perhaps by working with driver implementations on things like cockroachdb/cockroach#32932).
It seems clear that we need to emphasize the CC use case first and foremost, and thus preserve TLS options as the main recommendation (and main recommended scenario). However we do have serious $$$ at risk if we don't offer other options.
Existing clusters previously run with "insecure mode"
Are there enough of these that we need to provide a no-downtime upgrade path? This seems like it could add quite a bit of complexity (can you upgrade from any security mode to any other, or only certain ones?)
There are at least a few customers asking (those big accounts who didn't like our TLS and went to prod with --insecure, despite our recommendations to the contrary). But we can narrow down the scope of the upgrade path to support just those customers, yes.
We know from experience that customers who advocate for "insecure mode" really only care about point 3: they want TLS-less SQL connections
This is definitely not true. There have been some recent examples of this but I think the most common reason people resist setting their clusters up securely is about all the TLS setup, especially for node-to-node.
Point taken. Adjusted the text accordingly.
Possibly solve cockroachdb/cockroach#54007 and/or advance cockroachdb/cockroach#51991 to enable secure RPC without manual TLS configs
This seems like a requirement, not just a "possibly".
The word "possibly" only pertained to cockroachdb/cockroach#54007 (which you and I would agree is probably less important). Issue cockroachdb/cockroach#51991 is definitely critical.
I also just realized that the TLS story could also be simplified by issuing a common TLS client cert for all SQL users (and give them separate passwords to distinguish them)
Apologies for the late response but there are three things that strike me here:
1) --insecure should probably do a better job of explicitly stating what insecure mode is. Perhaps we should include your list below in the output from the flag (formatted to CLI of course):
--insecure does the following:
deactivates TLS handshakes for node-node connections
deactivates TLS handshakes for node-client RPC connections
deactivates TLS handshakes for node-client SQL connections over TCP
deactivates TLS handshakes for node-client HTTP connections
deactivates node-to-node authentication
deactivates HTTP authentication
deactivates SQL authentication
deactivates SQL authorization
deactivates certain SQL features, so as to not create the illusion of security
2) There's a lot here to unpack and given the potential impact to developers and customers perhaps this should be moved to an RFC given the current prevalence of this flag and depth of discussion.
3) I don't feel we should mark a feature as deprecated until we have a supported alternative (or at least support for the majority of the usage). I don't feel we have that yet given that our own documentation still includes this flag for tutorials and guides.
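As a rough illustration of point 1, the list could be rendered for terminal output along the following lines. This is only a Python sketch of the formatting idea, not code from the cockroach CLI: the render_insecure_help function and the 72-column width are assumptions, and the items are copied from the list in point 1.

```python
import textwrap

# Items taken from the list under point 1.
INSECURE_EFFECTS = [
    "deactivates TLS handshakes for node-node connections",
    "deactivates TLS handshakes for node-client RPC connections",
    "deactivates TLS handshakes for node-client SQL connections over TCP",
    "deactivates TLS handshakes for node-client HTTP connections",
    "deactivates node-to-node authentication",
    "deactivates HTTP authentication",
    "deactivates SQL authentication",
    "deactivates SQL authorization",
    "deactivates certain SQL features, so as to not create the illusion of security",
]


def render_insecure_help(effects, width=72):
    """Format the effects as a wrapped, indented bullet list (hypothetical helper)."""
    lines = ["--insecure does the following:"]
    for effect in effects:
        lines.extend(
            textwrap.wrap(
                effect,
                width=width,
                initial_indent="  - ",
                subsequent_indent="    ",
            )
        )
    return "\n".join(lines)


print(render_insecure_help(INSECURE_EFFECTS))
```

Keeping the items in one data structure would let the same list feed both the flag's help text and a startup warning, so the two cannot drift apart.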
Regarding point 1: there is now a clearer warning than before. The warning reads as follows:
*
* WARNING: ALL SECURITY CONTROLS HAVE BEEN DISABLED!
*
* This mode is intended for non-production testing only.
*
* In this mode:
* - Your cluster is open to any client that can access any of your IP addresses.
* - Intruders with access to your machine or network can observe client-server traffic.
* - Intruders can log in without password and read or write any data in the cluster.
* - Intruders can consume all your server's resources and cause unavailability.
*
*
* INFO: To start a secure server without mandating TLS for clients,
* consider --accept-sql-without-tls instead. For other options, see:
*
* - https://go.crdb.dev/issue-v/53404/v20.2
* - https://www.cockroachlabs.com/docs/v20.2/secure-a-cluster.html
*
Regarding point 2: yes this work will be in a RFC of course.
Regarding point 3: agreed
Epic: CRDB-12037
The --insecure flag disables a number of security controls which are completely expected even for developers or folk trying out CockroachDB.
Instead, we aim to ensure that all clusters are secure, but offer new secure options where the user can choose the combination of security features that best matches their needs and their environment.
(Additionally, the word "insecure" gives the wrong impression that CockroachDB is not a secure database. The "insecure mode" only really exists for internal testing by the CockroachDB team and should not have been exposed to users from the start.)
What you need to know in a post-insecure world
CockroachDB aims to remain easy to use when all clusters are secured!
New clusters
Simplified decision making:
cockroach demo, automatically secure
This is the most common pg-compatible scenario!
Cluster is secure in all cases.
Threat model
Node-node connections
--insecure)
Note: the vulnerabilities in the 2nd and 3rd column are amplified by sharing the same TCP listener (address/port) between SQL and RPC listeners.
Partial mitigation possible via separate --sql-addr and fencing off the node-to-node traffic into a private network.
Node-client connections
--insecure)
Note that the first column in the table above assumes that clients also validate server TLS certs! Otherwise the following table applies:
Existing clusters previously run with "insecure mode"
[...] --security-upgrade to node configs.
[...] --security-upgrade from node configs
Advanced decision flowchart under the fold
Rationale
Background
For context, --insecure does the following:
Motivation
We know from experience that customers who advocate for "insecure mode" really only care about:
These users certainly do not want to disable authentication and authorization, yet we also know that users do not realize that --insecure disables these internal protections inside cockroachdb.
Some users also care about point 4 because setting up TLS certs in a web browser is a pain. However since v20.1 we have a solution for those users: --unencrypted-localhost-http makes the HTTP endpoint localhost-only without TLS.
Strategy
--insecure deprecated
Epic CRDB-12037
Jira issue: CRDB-3870