jupyter / enhancement-proposals

Enhancement proposals for the Jupyter Ecosystem
https://jupyter.org/enhancement-proposals
BSD 3-Clause "New" or "Revised" License

pre-proposal: CurveZMQ #75

Open minrk opened 3 years ago

minrk commented 3 years ago

ZeroMQ has a transport-level encryption and authentication protocol called CurveZMQ.

I've just landed support for CurveZMQ in IPython Parallel, and I think it's worth talking about in Jupyter.

The gist of the most basic implementation:

In Jupyter, we already have a key distribution mechanism, which is the HMAC message-signing key in connection files. We can use the same key distribution for Curve keys. In the context of Jupyter, it's a little weird, because it's usually the client (KernelManager) that sets the credentials, which means the client issues the kernel's private key as well, and needs to pass the private key to the kernel. This being the case, the absolute simplest version is to use the same private/public keypair for both ends.
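A minimal sketch of that key distribution, assuming the shared keypair simply rides along in the connection info next to the existing HMAC `key`. The field names `curve_publickey` and `curve_secretkey` and the helper `add_curve_keys` are hypothetical, not part of the current connection-file format. In practice the keypair would come from `zmq.curve_keypair()`, which returns two Z85-encoded byte strings; placeholder keys are used here so the sketch runs without pyzmq.

```python
import json


def add_curve_keys(connection_info, public, secret):
    """Return a copy of connection_info with a shared Curve keypair added.

    `public` and `secret` are Z85-encoded bytes (40 characters each),
    e.g. as returned by zmq.curve_keypair().
    The curve_* field names are hypothetical.
    """
    for name, value in (("public", public), ("secret", secret)):
        if len(value) != 40:
            raise ValueError(
                f"{name} key must be 40 Z85 characters, got {len(value)}"
            )
    info = dict(connection_info)  # leave the caller's dict untouched
    info["curve_publickey"] = public.decode("ascii")
    info["curve_secretkey"] = secret.decode("ascii")
    return info


if __name__ == "__main__":
    # Placeholder keys for illustration only; real keys come from
    # zmq.curve_keypair().
    public, secret = b"P" * 40, b"S" * 40
    base = {
        "transport": "tcp",
        "ip": "127.0.0.1",
        "shell_port": 5555,
        "key": "hmac-signing-key",
        "signature_scheme": "hmac-sha256",
    }
    print(json.dumps(add_curve_keys(base, public, secret), indent=2))
```

Because both ends read the same two strings, the kernel and the client end up with identical key material, which is the "absolute simplest version" described above.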

Sketch:

Here's an example of an authenticated socket pair in pyzmq:

curve socket example

```python
import asyncio
import zmq
import zmq.asyncio as zaio


async def main():
    public, private = zmq.curve_keypair()
    ctx = zaio.Context()
    server = ctx.socket(zmq.ROUTER)
    # server socket is a 'curve server'
    server.CURVE_SECRETKEY = private
    server.CURVE_PUBLICKEY = public
    server.CURVE_SERVER = True
    url = "tcp://127.0.0.1:5555"
    server.bind(url)

    no_auth_client = ctx.socket(zmq.DEALER)
    auth_client = ctx.socket(zmq.DEALER)
    auth_client.CURVE_SECRETKEY = private
    auth_client.CURVE_PUBLICKEY = public
    auth_client.CURVE_SERVERKEY = public  # this authenticates the client

    auth_client.connect(url)
    no_auth_client.connect(url)

    for i in range(5):
        # messages from 'auth_client' will be received
        asyncio.ensure_future(auth_client.send(b'auth'))
        # messages from 'no_auth_client' will never be delivered
        asyncio.ensure_future(no_auth_client.send(b'noauth'))
        msg = await server.recv_multipart()
        print("Received", msg)
    ctx.destroy(linger=0)


if __name__ == "__main__":
    asyncio.run(main())
```

Benefits

Caveats

Alternatives:

Both of these would require more generic exposure of general options for zmq, whereas the proposal as it is only requires sharing a single string (or two strings) for the key pair, under the exact same model we already have, and only setting three socket options (same values on all sockets), a fairly minor change in practice. Plus, all changes are in socket creation, nowhere else in the protocol implementation.
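To illustrate how small the socket-creation change could be, here is a sketch of a hypothetical helper, `apply_curve`, that sets the three options per socket under the shared-keypair scheme. The option names are the pyzmq attribute-style socket options used in the example earlier in this thread; the helper itself, and the assumption that binding sockets act as the Curve server, are mine and not part of the proposal.

```python
def apply_curve(sock, public, secret, server):
    """Set the three Curve options on one socket (shared-keypair scheme).

    `sock` is expected to be a pyzmq socket; options are set as attributes.
    Call this before bind() or connect().
    """
    sock.CURVE_PUBLICKEY = public
    sock.CURVE_SECRETKEY = secret
    if server:
        # binding side: act as the Curve server for the handshake
        sock.CURVE_SERVER = True
    else:
        # connecting side: the server's public key, which in this
        # scheme is identical to our own public key
        sock.CURVE_SERVERKEY = public
```

With pyzmq this would be called as `apply_curve(sock, public, secret, server=True)` right before `sock.bind(url)`, and with `server=False` right before `sock.connect(url)`.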

minrk commented 3 years ago

We can avoid passing the private key in the connection file by specifying that it will be in an environment variable, e.g. $JUPYTER_CURVE_SECRETKEY. In that case, clients should assume that they need to issue their own private/public keypair. This doesn't really make any difference to clients, but it means the connection file alone is no longer enough info for a man-in-the-middle attack. It's also no longer enough info for kernels to set up their sockets.
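A rough sketch of the kernel side of this variant, assuming the secret key arrives as a Z85 string in `JUPYTER_CURVE_SECRETKEY` (the variable name suggested above). The function name `kernel_secret_key` and the error handling are hypothetical.

```python
import os


def kernel_secret_key(environ=None):
    """Return the kernel's Curve secret key as bytes, read from the environment.

    Raises RuntimeError if the variable is missing, since the connection
    file alone would not be enough to set up the sockets in this variant.
    """
    environ = os.environ if environ is None else environ
    value = environ.get("JUPYTER_CURVE_SECRETKEY")
    if not value:
        raise RuntimeError(
            "JUPYTER_CURVE_SECRETKEY is not set; cannot set up Curve sockets"
        )
    return value.encode("ascii")
```

The kernel would assign the returned bytes to `CURVE_SECRETKEY` on each socket, while the public half could still travel in the connection file.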