Getting rid of a static password

rvs commented 5 years ago

I've started looking into how to get rid of the ugly static ssh password and came up with this idea. It still needs a bit of polish, but since @eriknordmark wanted to take a look at it -- here it is.

@deitch please let me know if this looks reasonable to you and we can polish it some more.

deitch commented 5 years ago

I have a few questions:

1. It looks like this uses openssl at runtime to extract the ssh-compatible public key from onboard.cert.pem and then saves it to /run/authorized_keys. Since the key to that is shared and already included in every config disk image (and hence every EVE image), how is this any better for us than the pre-set password? Is it to tackle the cleanup for creation/generation as well as moving to public keys for now, but punting on the core security issue of a globally shared key until later? I have some ideas for that, but one thing at a time.
2. It will regenerate the key every boot. Is that intentional, e.g. as a way to change a config on a device and have the device automatically have the correct new ssh authorized_keys?
3. Will the proper directory get mounted into the sshd container? The official sshd mounts /root/.ssh from the host into its container, based on this. We map all of dom0-ztools onto the live OS image's root based on this. However, this PR makes /root/.ssh a symlink to /run/, where the authorized_keys file is. I am not sure what the containerd behaviour is when it is a symlink. Does it follow and mount? Initial tests I just ran say yes, but it is worth raising and double-checking.
4. Can we simplify it at all? The cross-dependencies are:
   - pkg/dom0-ztools: symlink from /root/.ssh -> /run
   - pkg/zedctr: defines /run:/run as mounted from the host in build.yml
   - pkg/zedctr: creates authorized_keys in /run, which depends both on /run being mounted from the host (which is defined in the same package) and on /run being mounted in the right place for sshd (which is defined in the dom0-ztools package).
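For reference, the extraction described in the first question could look roughly like the sketch below. It is an illustration only: the PR's actual script may use different commands, the Python wrapper and all function names are hypothetical, and it assumes `openssl` and `ssh-keygen` are available on the PATH.

```python
import subprocess
import tempfile
from pathlib import Path
from typing import List


def pubkey_from_cert_cmd(cert_path: str) -> List[str]:
    """Command that prints the PEM public key embedded in an X.509 cert."""
    return ["openssl", "x509", "-in", cert_path, "-noout", "-pubkey"]


def pem_to_openssh_cmd(pem_path: str) -> List[str]:
    """Command that converts a PEM public key into an OpenSSH key line."""
    return ["ssh-keygen", "-i", "-m", "PKCS8", "-f", pem_path]


def write_authorized_keys(cert_path: str, out_path: str) -> None:
    """Derive an authorized_keys file from a certificate (hypothetical helper)."""
    pem = subprocess.run(
        pubkey_from_cert_cmd(cert_path), check=True, capture_output=True
    ).stdout
    # ssh-keygen reads its input from a file, so stage the PEM in a temp file.
    with tempfile.NamedTemporaryFile(suffix=".pem") as tmp:
        tmp.write(pem)
        tmp.flush()
        key_line = subprocess.run(
            pem_to_openssh_cmd(tmp.name), check=True, capture_output=True
        ).stdout
    out = Path(out_path)
    out.write_bytes(key_line)
    out.chmod(0o600)  # sshd expects authorized_keys not to be group/world writable
```

Calling `write_authorized_keys("onboard.cert.pem", "/run/authorized_keys")` on every boot would match the regenerate-each-boot behaviour raised in the second question.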

Moving to standard linuxkit/sshd gets rid of the need to generate host keys, which is nice; every little bit of code reduction...

deitch commented 5 years ago

@eriknordmark wrote:

Shouldn't we also change sshd_config to not allow any password-based (root) logins?

The default linuxkit/sshd image already does that.

eriknordmark commented 5 years ago

@deitch Since it is a read only filesystem we can't save a generated key or authorized_keys in the container itself; would need to use /config or /persist if that is what we want. @rvs I wonder if there is a case where we should only allow /config/authorized_keys; an EVE user might have no idea who holds the private key related to the onboarding cert. We can refine this part later, but I wanted to tee it up now.

deitch commented 5 years ago

@eriknordmark wrote:

Since it is a read only filesystem we can't save a generated key or authorized_keys in the container itself; would need to use /config or /persist if that is what we want

Ah, so directly using /root/.ssh wouldn't work for that reason? Is that why we write it to /run/, which is the only read-write space?

would need to use /config or /persist if that is what we want

Not sure we do want to persist it, if it is generated. It depends on what our flow might be to change public keys. If we change them directly, sure. If we are going to change the public cert, then we might want to regenerate with each boot.

I wonder if there is a case where we should only allow /config/authorized_keys; an EVE user might have no idea who holds the private key related to the onboarding cert. We can refine this part later, but I wanted to tee it up now.

So rather than generate from onboard.cert.pem, have an explicit /config/authorized_keys, and mount that into the sshd container read-only and use directly? It certainly is more straightforward to understand.

deitch commented 5 years ago

Also, have we discussed using a CA structure? I am not exactly madly in love with the whole CA structure, at least as far as the public Internet goes; I find it the weak point in https. However, for our restricted purposes, it might work quite well.

Instead of /config/authorized_keys, we can put in place /config/ca.pub, copy it over on boot (like you did, @rvs, for generation) to somewhere local, and then have the entry in /etc/ssh/sshd_config:

TrustedUserCAKeys /path/to/server_ca.pub

linuxkit/sshd is built on alpine (previously 3.8, lately 3.9), which ship openssh 7.7 and 7.9 respectively; both should support TrustedUserCAKeys.
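A minimal sketch of the copy-on-boot step, assuming the CA public key sits at /config/ca.pub as suggested above. The helper name `install_user_ca` and the runtime directory are made up for illustration; only the `TrustedUserCAKeys` directive itself comes from sshd_config.

```python
import shutil
from pathlib import Path


def install_user_ca(config_ca: Path, runtime_dir: Path) -> str:
    """Copy the CA public key to a writable location and return the
    sshd_config line that points sshd at it (hypothetical helper)."""
    runtime_dir.mkdir(parents=True, exist_ok=True)
    dest = runtime_dir / "ca.pub"
    shutil.copyfile(config_ca, dest)
    dest.chmod(0o644)  # a public key: readable by all, writable by owner only
    return f"TrustedUserCAKeys {dest}"
```

With something like this in place, a developer's public key would be signed with the CA's private key (for example via `ssh-keygen -s`), and sshd would accept the resulting certificate without any per-user authorized_keys entry.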

deitch commented 5 years ago

Confirmed that the version of openssh accepts CA keys. Lots of guides online, but for quick reference, here is a gist

rvs commented 5 years ago

Let me look into this whole TrustedUserCAKeys business a bit more.

rvs commented 5 years ago

Ok, I must say @deitch's CA approach got me really thinking. I like it way better than the one I cooked up. But perhaps we can do one better still.

As I said -- my objective is to keep the cloud part completely oblivious to this implementation (like it is today). If you recall, today, the only ssh-related API from the cloud we're using is the configItem API that is used to set debug.enable.ssh:true/false (the key here is completely opaque to the cloud -- and that's how it should be!).

So... what if we update go-provision implementation to actually recognize not only true/false but also debug.enable.ssh:<valid ssh pub key ID string> ?

This will keep the cloud completely out of the loop and also keep the scope of this API limited to a single, very tight config item.
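A rough sketch of what such parsing could look like, written in Python for brevity rather than as go-provision code. The function names are hypothetical, and the validation rule (the base64 blob must begin with the same key type as the text prefix) is an assumption about what "valid" should mean here.

```python
import base64
import binascii
import struct
from typing import Optional, Tuple


def _blob_key_type(blob: bytes) -> Optional[str]:
    """Read the length-prefixed key type string at the start of an SSH key blob."""
    if len(blob) < 4:
        return None
    (n,) = struct.unpack(">I", blob[:4])
    if n == 0 or len(blob) < 4 + n:
        return None
    try:
        return blob[4:4 + n].decode("ascii")
    except UnicodeDecodeError:
        return None


def parse_debug_enable_ssh(value: str) -> Tuple[bool, Optional[str]]:
    """Return (ssh_enabled, authorized_key_line_or_None).

    "true"/"false" keep the legacy boolean meaning; any other value must
    look like an OpenSSH public key line, otherwise ssh stays disabled.
    """
    v = value.strip()
    if v.lower() == "true":
        return True, None
    if v.lower() in ("false", ""):
        return False, None
    parts = v.split(None, 2)  # key type, base64 blob, optional comment
    if len(parts) >= 2:
        try:
            blob = base64.b64decode(parts[1], validate=True)
        except (binascii.Error, ValueError):
            blob = b""
        if _blob_key_type(blob) == parts[0]:
            return True, v
    return False, None  # fail closed on anything unrecognized
```

Failing closed on unparseable values is a deliberate choice in this sketch: a typo in the config item should not leave a device reachable with no key restriction.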

What do you think @eriknordmark ?

deitch commented 5 years ago

This will keep the cloud completely out of the loop and also keep the scope of this API limited to a single, very tight config item.

Will it? Isn't the debug.enable.ssh: <pub key> passed to go-provision from the cloud?

Or do you mean that it isn't static in the cloud, and thus the user-to-cloudcontroller API call (or Web UI) would be something like:

user to cloud controller: enable ssh debug on device 1234aaff1 with ssh public key 12121212bbdda6

I do like it. It eliminates any hard-coding and saving of private keys anywhere, dynamically loading them as needed.

I would point out, though, that almost certainly (and rather quickly) we will receive requests to enable saving in the cloud controller of public keys, similar to every public cloud provider (aws, azure, packet, etc.), where they store the public keys.

But either way, that can be implemented later.

eriknordmark commented 5 years ago

@rvs My idea a week ago or so was to pass an authorized key using a config item. Whether we do that by redefining debug.enable.ssh from a boolean to a string with a key requires some care to not mess up the devices we have in the lab. I'd suggest defining a new config item and then later retiring the boolean.
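To illustrate the migration path, here is a small sketch of how a device could prefer a new key-carrying item while still honouring the old boolean. The new item name is purely hypothetical (nothing in this thread fixes it), and the sketch is in Python rather than the actual Go code.

```python
from typing import Mapping, Optional, Tuple

LEGACY_ITEM = "debug.enable.ssh"       # existing boolean item
NEW_ITEM = "debug.ssh.authorized.key"  # hypothetical name for the new item


def resolve_ssh_access(items: Mapping[str, str]) -> Tuple[bool, Optional[str]]:
    """Return (ssh_enabled, authorized_key_or_None) from the config items."""
    key = items.get(NEW_ITEM, "").strip()
    if key:
        # The new item wins whenever it is set, regardless of the boolean.
        return True, key
    # Fall back to the legacy boolean so existing lab devices keep working.
    legacy = items.get(LEGACY_ITEM, "false").strip().lower()
    return legacy == "true", None
```

Once no device relies on the boolean any more, the fallback branch can simply be deleted.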

For us and an EVE developer, is it easier to use the CA key approach vs. just specifying the key on my laptop? I think for an EVE developer the CA approach seems overkill.

If Zededa engineers need to debug devices in production then a single authorized key should suffice as well. For our lab setup with alpha/hummingbird a CA key might make sense, but that requires managing our ssh keys that way. And I'd really want 95%+ of our work there to use the logs and TBD APIs to extract state out of the device, and depend a lot less on ssh.

rvs commented 5 years ago

@rvs My idea a week ago or so was to pass an authorized key using a config item. Whether we do that by redefining debug.enable.ssh from a boolean to a string with a key requires some care to not mess up the devices we have in the lab. I'd suggest defining a new config item and then later retiring the boolean.

In that case I need to apologize @eriknordmark -- it seems that I have misunderstood you. I thought you were talking about zededa cloud actually being aware of the key. If it is just the opaque config item -- it is exactly the same as what I proposed in this thread.

At this point, I feel that the choice is between the CA and authorized_keys approaches. Let me come up with a bit more background on each of these before we commit to a final solution.

rvs commented 5 years ago

@deitch @eriknordmark I suggest we close this PR and move both this part of it (and corresponding change to go code) to eve. I've moved this part of it to https://github.com/zededa/eve/pull/23 -- please take a look.

@deitch while moving it I realized that we actually lost the ability to ssh directly into the zedctr/pillar container with this patch. Do you think you can help come up with some clever trick for setting a shell for root so that it immediately does

ctr --namespace services.linuxkit t exec -t --exec-id XXX pillar sh

?
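One possible shape for such a trick, sketched as a tiny wrapper that could be installed as root's login shell. This is an assumption-heavy illustration: the function names are hypothetical, the exec id derived from the process id is just one way to fill in the XXX placeholder, and it presumes a Python interpreter is available in the sshd container.

```python
import os
from typing import List


def pillar_exec_argv(exec_id: str) -> List[str]:
    """Build the ctr command line quoted above for a given exec id."""
    return [
        "ctr", "--namespace", "services.linuxkit",
        "t", "exec", "-t", "--exec-id", exec_id,
        "pillar", "sh",
    ]


def main() -> None:
    # Each exec needs its own id; using the wrapper's PID is an assumption.
    argv = pillar_exec_argv(f"ssh-{os.getpid()}")
    # Replace this process with ctr so the ssh session lands in pillar.
    os.execvp(argv[0], argv)
```

Calling `main()` from the login script never returns on success, since `os.execvp` replaces the wrapper with the `ctr` process.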

deitch commented 5 years ago

Let's move this conversation over. I am going to pull the comments across as well, so we can close this one out.

deitch commented 5 years ago

Comments (short form) moved over. @rvs close this one out.

project-eve / zenbuild

Getting rid of a static password #90