mschubert / clustermq

R package to send function calls as jobs on LSF, SGE, Slurm, PBS/Torque, or each via SSH
https://mschubert.github.io/clustermq/
Apache License 2.0

Use ZeroMQ socket authentication #125

Open mschubert opened 5 years ago

mschubert commented 5 years ago

Currently, there is no security mechanism to check if the incoming connections are actually from the submitted jobs, and not from a user that is trying to read out data or send fake results.

This should not be an issue in general, because HPC on an internal network is usually trusted, and comparable technologies (e.g. MPI) provide neither authentication nor encryption in their default implementations.

However, considering that the workers connect to the master and not the other way around, there should probably be a minimal security mechanism that checks if the incoming connections are from the jobs submitted.

This is why there is now a session password that is passed to the workers as an environment variable using CMQ_AUTH={{ auth }}. Workers will include this token in every message they send to the master, which will check it against the value it filled into the template.
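The token check described above can be sketched roughly as follows. This is a minimal illustration in Python, not the package's actual implementation: the helper names (`new_session_token`, `fill_template`, `is_valid_worker_message`) and the `"auth"` message field are hypothetical, and only the `CMQ_AUTH={{ auth }}` template syntax comes from this issue.

```python
import hmac
import secrets


def new_session_token(nbytes: int = 16) -> str:
    """Generate a random per-session token (hypothetical helper)."""
    return secrets.token_hex(nbytes)


def fill_template(template: str, auth: str) -> str:
    """Substitute the {{ auth }} placeholder in a job submission template."""
    return template.replace("{{ auth }}", auth)


def is_valid_worker_message(message: dict, expected_token: str) -> bool:
    """Accept a worker message only if it carries the session token.

    The "auth" field name is an assumption made for this sketch.
    """
    received = message.get("auth", "")
    if not isinstance(received, str):
        return False
    # Constant-time comparison, so timing does not leak how much of the token matched
    return hmac.compare_digest(received.encode(), expected_token.encode())
```

The master would generate one token per session, fill it into the template before submitting jobs, and drop any incoming message for which the check fails.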

This will protect against:

This will not protect against:

Templates without the variable set will continue to work with a warning that sockets are not authenticated.
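As a rough sketch of that fallback (Python, with a hypothetical function name and warning text), the master could check whether the template references the placeholder at all and warn if it does not:

```python
import re
import warnings

# Matches "{{ auth }}" with optional whitespace inside the braces
AUTH_PLACEHOLDER = re.compile(r"\{\{\s*auth\s*\}\}")


def template_uses_auth(template: str) -> bool:
    """Return True if the template sets the token; warn otherwise."""
    has_auth = AUTH_PLACEHOLDER.search(template) is not None
    if not has_auth:
        warnings.warn(
            "template does not set CMQ_AUTH; sockets are not authenticated"
        )
    return has_auth
```

Old templates keep working this way, and the master can skip token validation when the function returns `False`.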

This mechanism should eventually be replaced by the ZeroMQ authentication protocol using the Woodhouse pattern (plaintext username/password set via socket options). This will require ZeroMQ>4.0.0 and socket options that rzmq does not currently support, so we will need to consider switching to pbdZMQ and/or interfacing with the library directly. As this will be a breaking change, it will not happen before the next major release (0.9.x or 1.0).
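For reference, with libzmq the server side enables this via the `ZMQ_PLAIN_SERVER` socket option and the client supplies `ZMQ_PLAIN_USERNAME` and `ZMQ_PLAIN_PASSWORD`. The credentials are then checked by a ZAP handler listening on `inproc://zeromq.zap.01`. The sketch below shows only the handler's decision logic in plain Python, with no ZeroMQ binding involved. The frame layout follows my reading of the ZAP specification (ZeroMQ RFC 27), and the function name and the single username/password pair are assumptions for illustration.

```python
import hmac

ZAP_VERSION = b"1.0"


def handle_zap_request(frames, username, password):
    """Build ZAP reply frames for a PLAIN authentication request.

    Request:  [version, request_id, domain, address, identity,
               mechanism, *credentials]
    Reply:    [version, request_id, status_code, status_text,
               user_id, metadata]
    """
    request_id = frames[1] if len(frames) > 1 else b""
    if len(frames) < 6 or frames[0] != ZAP_VERSION:
        return [ZAP_VERSION, request_id, b"500", b"Malformed request", b"", b""]

    mechanism = frames[5]
    credentials = frames[6:]
    # PLAIN carries exactly two credential frames: username and password
    if mechanism != b"PLAIN" or len(credentials) != 2:
        return [ZAP_VERSION, request_id, b"400", b"Unsupported mechanism", b"", b""]

    user_ok = hmac.compare_digest(credentials[0], username)
    pass_ok = hmac.compare_digest(credentials[1], password)
    if user_ok and pass_ok:
        return [ZAP_VERSION, request_id, b"200", b"OK", credentials[0], b""]
    return [ZAP_VERSION, request_id, b"400", b"Invalid credentials", b"", b""]
```

In this scheme the session token would take the place of the password, so a connection without it is rejected during the handshake and no longer has to be checked per message.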

I currently do not see a reason to implement encryption, as (1) I would consider an HPC environment trusted and (2) this would require users to set up and manage their own keys, complicating the whole process for minimal gain (and a performance penalty). Remote connections are managed, authenticated, and encrypted via the SSH layer.