Dask security support encrypted private key

leo038 commented 2 years ago

The dask-worker and dask-scheduler support the following three parameters:

--tls-ca-file
--tls-cert
--tls-key

which allow the user to supply certificates and private keys. However, the --tls-key file cannot be encrypted: if it is an encrypted private key, the password has to be entered while dask is running, and since dask-worker and dask-scheduler are invoked many times, entering the password by hand is not practical.

So dask-worker and dask-scheduler should support another parameter, --tls-passwd, which is the password for the --tls-key file. That way the private key can be encrypted by the user, and the password can also be encrypted by the user.

jcrist commented 2 years ago

This is an interesting request. Can you speak more about why you want to use encrypted private keys for mutual TLS in distributed? In most deployment scenarios I know of, password encrypting the keys wouldn't add any meaningful security value, as the password would still need to be passed in in a way that would be accessible by anyone that had the same level of permissions needed to access the raw certs. There may be a valid use case, but I want to make sure it's worth the effort.

Note that using a CLI flag isn't a great option for specifying a password. The command run for a process is viewable by all users, so passing the password value in on the command line would expose the password.
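To illustrate the point about command-line visibility, here is a minimal sketch, assuming a Linux system where /proc/<pid>/cmdline is available. It starts a stand-in child process (not dask itself) with a password in its arguments and reads those arguments back from /proc. The --tls-passwd flag is the one proposed in this issue, not an existing option, and visible_command_line is a hypothetical helper name.

```python
import subprocess
import sys
from pathlib import Path


def visible_command_line(secret):
    """Start a child with `secret` in its argv and return the arguments /proc exposes.

    Returns None on systems without /proc/<pid>/cmdline (for example non-Linux).
    """
    # Stand-in child process that just sleeps; "--tls-passwd" is the flag
    # proposed in this issue and is only used here as an illustrative argument.
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(30)", "--tls-passwd", secret]
    )
    try:
        cmdline = Path(f"/proc/{child.pid}/cmdline")
        if not cmdline.exists():
            return None
        # Arguments are stored NUL-separated in /proc/<pid>/cmdline.
        return cmdline.read_bytes().split(b"\0")
    finally:
        child.kill()
        child.wait()
```

Where /proc is available, the returned list includes the password argument, which is the same information tools such as ps display.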

leo038 commented 2 years ago

Thank you for your reply. In some application scenarios, the private key cannot be stored on the disk in plaintext mode. Therefore, the private key needs to be encrypted, and the encrypted private key can be securely stored on the disk. The passwords used for encryption can be encrypted using other security methods. When TLS is used, the passwords can be restored to plaintext, so the plaintext passwords only appear in the memory, not on the disk.
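As a rough sketch of what this could look like at the Python level (not how distributed implements it), the standard library's ssl.SSLContext.load_cert_chain accepts a password argument that may be a callable, so the plaintext password can be produced in memory only at the moment the key is loaded. The names lazy_password, make_tls_context and unwrap are hypothetical; unwrap stands in for whatever mechanism recovers the plaintext, such as a hardware-backed decryption call.

```python
import ssl


def lazy_password(unwrap, wrapped):
    """Return a callable that recovers the plaintext password only when invoked.

    `unwrap` is a hypothetical stand-in for the real recovery mechanism;
    `wrapped` is the protected form of the password as stored on disk.
    """
    def provider():
        return unwrap(wrapped)
    return provider


def make_tls_context(cert_file, key_file, get_password):
    """Build a client-side TLS context from a certificate and an encrypted key."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    # `password` may be a callable; ssl calls it with no arguments, and only
    # if the private key is encrypted and a password is actually needed.
    ctx.load_cert_chain(cert_file, keyfile=key_file, password=get_password)
    return ctx
```

A caller would pass something like lazy_password(unwrap, wrapped) as get_password. The plaintext then exists only in process memory, although the process still needs access to whatever unwrap relies on.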

jcrist commented 2 years ago

"In some application scenarios, the private key cannot be stored on the disk in plaintext mode."

Can you speak more about why? What's your threat model here? Why aren't filesystem permissions sufficient?

"The passwords used for encryption can be encrypted using other security methods. When TLS is used, the passwords can be restored to plaintext, so the plaintext passwords only appear in the memory, not on the disk."

Sure, but then you still need to distribute the means for decrypting the passwords to decrypt the keys - how can you do this securely in a way that's additive over just using filesystem permissions for the keys? What's the threat model that this solves for?

We can add support for encrypted certs, but in my experience any way to forward the password results in a system that is no more secure than just relying on filesystem permissions. Understanding how you plan to solve for this will help in determining if we want to add this feature, and what mechanism we should support configuring the password.

A (long winded) example of what I mean:

A dask cluster needs to communicate between the scheduler and the workers. You want to ensure no one can snoop on this communication, so you want to enable TLS. You create certs and keys and store them on all nodes. To ensure other users can't read or tamper with the certs, you store all files with 0400 permissions so only you can read them. Everything works great.
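For reference, a minimal sketch of storing a file with 0400 permissions from Python on a POSIX system. The helper names write_owner_read_only and permission_bits are hypothetical, and the data written would be your own key material.

```python
import os
import stat


def write_owner_read_only(path, data):
    """Create `path` with mode 0400 and write `data` to it."""
    # O_EXCL refuses to overwrite an existing file. Passing the mode at
    # creation time avoids a window where the file has looser permissions.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o400)
    with os.fdopen(fd, "wb") as f:
        f.write(data)


def permission_bits(path):
    """Return only the permission bits of `path` (for example 0o400)."""
    return stat.S_IMODE(os.stat(path).st_mode)
```

After the call, only the owning user can read the file, and no one has write permission.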

Perhaps then you decide your threat model includes an attacker getting filesystem access to the system with your user permissions - either through your login credentials or some other security issue (e.g. a RCE bug in some other component). You want to ensure that a user that has filesystem access with your permission level still can't snoop on the dask comms. So you decide to encrypt the keys - then a user will need access to both the filesystem and the password.

Both the dask scheduler and the workers will need access to the password to decrypt the keys (what this issue is asking for). But how to configure this?

If you use a CLI flag (e.g. --tls-passwd mypassword), then any user with access to the machine can see the password, since CLI args are public for all processes.
If you use an environment variable, any user with the same filesystem permissions as the process can still see it, since /proc/$PID/environ is readable with 0400 permissions.
If you store the password on disk, it'd still need to be accessible by the processes with at least 0400 permissions, so it's still no better than an environment variable.
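The environment variable case can be checked with a short sketch, assuming Linux and a caller running as the same user as the child process. It starts a stand-in child (not dask) with a password in its environment and looks for it in /proc/<pid>/environ. The helper name visible_environment and the variable name used in the usage note are hypothetical.

```python
import os
import subprocess
import sys
from pathlib import Path


def visible_environment(var_name, secret):
    """Report whether `var_name=secret` can be read from a child's /proc environ.

    Returns None on systems without /proc/<pid>/environ (for example non-Linux).
    """
    env = dict(os.environ)
    env[var_name] = secret
    # Stand-in child process that just sleeps while we inspect it.
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(30)"], env=env
    )
    try:
        environ = Path(f"/proc/{child.pid}/environ")
        if not environ.exists():
            return None
        # Entries are stored NUL-separated as NAME=value.
        entries = environ.read_bytes().split(b"\0")
        return f"{var_name}={secret}".encode() in entries
    finally:
        child.kill()
        child.wait()
```

Calling visible_environment("HYPOTHETICAL_TLS_KEY_PASSWORD", "mypassword") as the same user is expected to find the entry on Linux, which is the situation described in the second option above.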

So at least for this threat model, adding a password may seem more secure, but without an alternative means to configure and distribute the password it ends up being just as secure as relying on filesystem permissions alone.

leo038 commented 2 years ago

We want to use an encrypted private key for several reasons:

According to the security requirements of our products, storing private keys in plaintext is prohibited. Even leaving our own requirements aside, storing private keys in plaintext should be avoided as much as possible because it is plainly insecure. Any security effort built on encrypted keys will, even if it does not significantly increase security, at least not decrease it.
For password encryption, we can use other more secure encryption methods, such as hardware-based encryption.
For parameter transmission, the encrypted password is stored on disk and restored to plaintext only when it is used. The plaintext password never exists on disk; it is held temporarily in memory. Of course, the value still needs to be passed along, but this all happens in memory and only for a short time, which is safer than keeping plaintext directly on disk.
This may not significantly improve the security of the system, but it at least does not reduce it, so users should have the choice.

leo038 commented 2 years ago

Thank you very much for your modification. I tested the new version of the code and found the following issues:

1. This modification is not in the main repository but in a separate repository: https://github.com/jcrist/distributed/tree/tls-password

2. The tls_client_password parameter is added to distributed.security.Security, but not to dask-scheduler and dask-worker. @jcrist

jcrist commented 2 years ago

Yes, it's in a PR that hasn't been merged yet. It's only exposed via configuration (using either environment variables or yaml files) since command line args are public and insecure for specifying passwords.

dask / distributed

Dask security support encrypted private key #5547