Encrypt cluster traffic to limit access through certificate data

davidpelaez commented 8 years ago

This is somewhat related to #227, but I'm not sure it's the same. Before production use makes sense, there has to be control over who can join the cluster and under what role.

The Consul approach (https://www.consul.io/docs/agent/encryption.html) seems like a nice starting point, but I don't know if we could also limit client/server and admin roles using data in the certificates. This is done in OpenVPN setups, for instance, where I can ask my client to confirm that the certificate the server presents is indeed a server certificate and not just any other certificate signed by the CA.

Finally, it would be great if certificate information were distributed across the cluster as a safe way to back client claims. For instance, any machine can claim a datacenter name in its config, but how do we know it's true? Verifiable evidence backing a claim made by a node would be extremely useful for high-security controls. It could also be used with custom drivers for highly audited executions, where node introspection would provide a distributed, verifiable source of truth to determine whether something should be performed or not.

davidpelaez commented 8 years ago

This issue could of course be given a broader title; I just wanted to start with the most descriptive one I could find.

cbednarski commented 8 years ago

Thanks for the suggestions. There are a handful of issues you identified here, including:

- Secure communication
- Authenticated membership
- Client identity management
- Secure introduction
- Auditing

I think Nomad should be able to handle the first two cases. In the latter three cases, though, you will likely need to use a security toolset like Vault or features of your platform (e.g. centralized logging, or controlling who can create new instances in your VPC).

We want to make these things easy in Nomad but we will need to rely on other tools to do some of the heavy lifting.

davidpelaez commented 8 years ago

@cbednarski thanks for summarizing my points in a much clearer list. I agree on the benefits of having other tools do the heavy lifting. However, I think the security implications of not including a feature should be carefully considered.

If we don't include something like client identity management (please correct me if I'm wrong), there's no way to establish per-node controls over what gets to run on each node. That puts the trust in the set of nodes as a whole, without much room to isolate the effects of one node being compromised. If I have a large Nomad cluster with private and public nodes separated in a VPC topology, I would like to have some guarantees about the security of the whole system if one node is compromised.

This could of course be a totally new conversation about extending the threat model once secure communication and authenticated membership are handled. They don't seem all that disconnected to me, but I understand the potential practicality of splitting them.

This is also related to other points I shared regarding custom drivers and driver whitelisting: many isolation controls could be customized at the driver level so that the implementer can finely control the security constraints. Along this line of thought, instead of isolating the concerns, combining @cbednarski's first two points with custom drivers as external binaries could unlock extremely flexible security guarantees.

maticmeznar commented 8 years ago

Has there been any progress on encrypting/authenticating cluster traffic? If I understand this correctly, Nomad is currently insecure and therefore should not be used in production?

dadgar commented 8 years ago

@maticmeznar: There has not been any work on this yet. Whether Nomad can be used in production without these features depends on the security requirements of your firm and whether you have a trusted environment. In many cases this is not a blocker for production, but nonetheless it is an important addition that we look forward to supporting.

ConnorJC3 commented 8 years ago

The last post on this is over a month old. Is there any progress?

dadgar commented 8 years ago

No, there hasn't been. There are other, higher-priority features for Nomad that we have been tackling. We will update this ticket once there is a change.


ConnorJC3 commented 8 years ago

Understood, thanks for the information.

SamMauldin commented 8 years ago

Is there currently a recommended way to run Nomad securely on a platform without private networking (e.g. Digital Ocean)?

maticmeznar commented 8 years ago

You could use IPSec.


kevincox commented 8 years ago

@maticmeznar The problem even with this is that any user on your machine can access Nomad. If I use certificates or passwords that are only readable by root and a service on one of my servers is compromised, the attacker only gets one user on one node. Whereas if Nomad is restricted only by network permissions, once they compromise any server they can spawn tasks across all my nodes as any user. That's a huge difference in separation.

Gerrrr commented 8 years ago

Hello,

Are you open to a PR on securing RPC communication and authenticating membership?

Do you have a configuration API in mind? My proposal is to use the same configuration parameters as in Consul (verify_incoming, verify_outgoing, verify_server_hostname), since they are already part of the tlsutil API.

dadgar commented 8 years ago

@Gerrrr I think the first step would be to collaborate on a design document and then a PR

Gerrrr commented 8 years ago

@dadgar, please have a look at the proposed changes and questions:

RPC

Server encryption:

- Introduce new general options in nomad/config and command/agent/config: VerifyIncoming, VerifyOutgoing, CAFile, CertFile, KeyFile
- Add corresponding options to command/agent/config_parse.parseServer: verify_incoming, verify_outgoing, ca_file, cert_file, key_file
- Replace the TLS stubs in nomad/server.NewServer with consul/tlsutil TLS wrappers

Client encryption:

- Make nomad.config/tlsConfig available outside of the nomad package
- Create a TLS wrapper in agent.agent/setupClient and pass it to client.NewClient.connPool

Questions:

- Since there are no legacy reasons in Nomad to introduce VerifyServerHostname, should the hostnames be automatically verified when VerifyOutgoing is enabled?
- I would like to suggest the server.<datacenter>.<region> format for the certificates' common names. What do you think?
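
As an illustration of the naming question, the sketch below builds and parses names in the suggested server.<datacenter>.<region> format. `serverName` and `parseServerName` are hypothetical helpers written for this example; they assume datacenter and region names contain no dots.

```go
package main

import (
	"fmt"
	"strings"
)

// serverName builds the name a server certificate would carry under the
// proposed server.<datacenter>.<region> scheme.
func serverName(datacenter, region string) string {
	return fmt.Sprintf("server.%s.%s", datacenter, region)
}

// parseServerName extracts the datacenter and region from a name in the
// proposed format. It assumes neither part contains a dot.
func parseServerName(name string) (datacenter, region string, ok bool) {
	parts := strings.Split(name, ".")
	if len(parts) != 3 || parts[0] != "server" || parts[1] == "" || parts[2] == "" {
		return "", "", false
	}
	return parts[1], parts[2], true
}

func main() {
	name := serverName("dc1", "global")
	dc, region, ok := parseServerName(name)
	fmt.Println(name, dc, region, ok)

	// A name that does not start with "server" is rejected.
	_, _, ok = parseServerName("client.dc1.global")
	fmt.Println(ok)
}
```

With a scheme like this, a node's claimed datacenter could be compared against the name in its verified certificate instead of being taken from its config alone.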

Serf

- Introduce the Server.encrypt config parameter. Affected files:
  - command/agent/config: add the EncryptKey server parameter
  - command/agent/config_parse: add encrypt to parseServer's valid keys
  - command/agent/command: add and verify the encrypt command-line parameter
- Enable EncryptKey in command/agent.setupServer in a similar way to https://github.com/hashicorp/consul/blob/aa1bb5a01236f056e57baa66aad86279d1745c1e/command/agent/agent.go#L384. As a result, EncryptKey should be propagated to nomadConfig.serfConfig.
- Add keyring and keygen commands similar to the Consul ones

Proof of concept: https://github.com/Gerrrr/nomad/tree/feature/serf_encryption
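
To illustrate what a keygen command and the matching encrypt validation might involve, here is a minimal Go sketch. It is not the proof-of-concept code: `generateKey` and `decodeKey` are hypothetical names, and the 16-byte key size is an assumption (memberlist, which Serf builds on, accepts 16-, 24-, or 32-byte keys).

```go
package main

import (
	"crypto/rand"
	"encoding/base64"
	"fmt"
)

// keyBytes is an assumed key size; 16 bytes selects AES-128 in memberlist.
const keyBytes = 16

// generateKey returns a random gossip key, base64-encoded so it can be
// pasted into a config file or passed on the command line.
func generateKey() (string, error) {
	key := make([]byte, keyBytes)
	if _, err := rand.Read(key); err != nil {
		return "", err
	}
	return base64.StdEncoding.EncodeToString(key), nil
}

// decodeKey validates an encoded key the way an agent might before
// handing it to the gossip layer.
func decodeKey(encoded string) ([]byte, error) {
	key, err := base64.StdEncoding.DecodeString(encoded)
	if err != nil {
		return nil, fmt.Errorf("key is not valid base64: %v", err)
	}
	switch len(key) {
	case 16, 24, 32:
		return key, nil
	}
	return nil, fmt.Errorf("key is %d bytes, want 16, 24, or 32", len(key))
}

func main() {
	encoded, err := generateKey()
	if err != nil {
		panic(err)
	}
	key, err := decodeKey(encoded)
	if err != nil {
		panic(err)
	}
	fmt.Printf("generated a %d-byte key: %s\n", len(key), encoded)
}
```

Every server would need the same key, which is why the plan pairs keygen with a keyring command for rotation.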

kevincox commented 8 years ago

That looks good to me!

(I didn't audit it but the code looked reasonable)

dadgar commented 8 years ago

@Gerrrr That looks like a great plan actually! Let me know if you are going to be tackling this work, as it was something I was going to do rather soon, so we should coordinate.

Gerrrr commented 8 years ago

@dadgar @kevincox Thanks for the feedback. I would like to finish the work and submit 2 PRs (for RPC and Serf) soon.

kevincox commented 8 years ago

If you need a tester let me know. I'll try to spin up a cluster.

c4milo commented 7 years ago

Should this issue be closed now that Nomad supports TLS certificates and symmetric encryption for gossip?

dadgar commented 7 years ago

@c4milo Yep! Thanks :)


hashicorp / nomad

Encrypt cluster traffic to limit access through certificate data #469