Closed cgwalters closed 3 years ago
see also #645
Yes! Thanks, that's the one I was thinking of but for some reason failing to find.
I'm not optimistic that the issue will completely sort itself out; distros won't necessarily enable RANDOM_TRUST_CPU and platforms may not provide RDRAND access. Allowing Ignition configs to provide system entropy seems like a pretty large footgun though. Many Ignition configs are reused for multiple machines, passed through cloud providers, or passed over unencrypted connections. We can only securely provide entropy if it's generated on demand for each machine and passed to Ignition over HTTPS -- but HTTPS is off-limits because we need entropy to use it.
I'd favor having Ignition privately use RDRAND for TLS entropy, as described in #645, over the approach here. It doesn't work universally though, e.g. on GCE.
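As a rough illustration of why RDRAND "doesn't work universally", here is a small sketch that checks whether the CPU advertises the instruction to the guest at all. It only parses the x86 `flags` line of `/proc/cpuinfo`; the function names are made up for this example, and this is not how Ignition or #645 would actually probe for it.

```python
def cpu_has_rdrand(cpuinfo_text: str) -> bool:
    """Return True if any 'flags' line in /proc/cpuinfo text lists rdrand."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        # Only the x86 'flags' field is checked; other fields are ignored.
        if key.strip() == "flags" and "rdrand" in value.split():
            return True
    return False


def host_has_rdrand(path: str = "/proc/cpuinfo") -> bool:
    """Check the running machine; treat an unreadable file as 'no RDRAND'."""
    try:
        with open(path) as f:
            return cpu_has_rdrand(f.read())
    except OSError:
        return False
```

A hypervisor that masks the flag from the guest makes this return False even when the host CPU supports the instruction.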
When we're talking about TLS, another issue is time; hence the recent creation of roughtime.
OK so the more I think about this, there are two cases:
In the bare metal case, there's no reason we can't take entropy from the running system and provide it to the installed one - this is the default behavior of Anaconda actually, that we need to undo in current c-a.
For the dd install case, we'd need to mount the target FS and write /var/lib/systemd/random-seed before rebooting, so it wouldn't quite be dd anymore.
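The extra installer step could look roughly like the sketch below, assuming the target filesystem is already mounted somewhere. The `target_root` argument and the 512-byte seed size are assumptions for illustration, not something an existing installer is known to use.

```python
import os

SEED_BYTES = 512  # assumed seed size for this sketch


def write_random_seed(target_root: str, size: int = SEED_BYTES) -> str:
    """Write fresh bytes from the running system into the installed
    system's /var/lib/systemd/random-seed and return the path written."""
    seed_path = os.path.join(target_root, "var/lib/systemd/random-seed")
    os.makedirs(os.path.dirname(seed_path), exist_ok=True)
    fd = os.open(seed_path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "wb") as f:
        # Force 0600 even if the file already existed with looser permissions.
        os.fchmod(f.fileno(), 0o600)
        f.write(os.urandom(size))
        f.flush()
        os.fsync(f.fileno())  # make sure the seed reaches disk before reboot
    return seed_path
```

Usage would be something like `write_random_seed("/mnt/target")` right after the image has been written and the root filesystem mounted.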
Now, to the cloud case:
I think basically we need to trust the hypervisor. What's the value of TLS when talking to e.g. the EC2 metadata server? I seriously doubt that traffic has a chance of being intercepted.
And particularly in the qemu case where we pull the config out of read-only data provided directly by the hypervisor...reading an entropy key wouldn't involve any TLS at all right?
BTW https://github.com/systemd/systemd/pull/4513 is related here too.
And one other thing I was thinking about here is that today, systemd needs some random data for its internal hash tables at least. And that happens in pid 1 I believe even in the initramfs. So that's long before Ignition or systemd-load-random-seed for that matter. It looks like today it uses GRND_NONBLOCK but still.
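For anyone unfamiliar with GRND_NONBLOCK, a minimal sketch of the non-blocking pattern follows. It is only meant to show the getrandom(2) behavior, not systemd's actual code; the function name and the choice to return None are assumptions of this example.

```python
import os


def early_random_bytes(n: int = 16):
    """Return n bytes from getrandom(2) without blocking, or None if the
    kernel's pool has not been initialized yet (typical very early in boot)."""
    try:
        return os.getrandom(n, os.GRND_NONBLOCK)
    except BlockingIOError:
        # The pool is not ready; the caller has to fall back to something
        # weaker or retry later instead of stalling boot.
        return None
```

The None branch is exactly the window being discussed: before "crng done", callers that refuse to block don't get kernel-quality randomness.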
Perhaps what we need to do is move the seed to /boot/random-seed and have it loaded by GRUB and passed on the kernel cmdline or so.
For the dd install case, we'd need to mount the target FS and write /var/lib/systemd/random-seed before rebooting, so it wouldn't quite be dd anymore.
Ignition needs the entropy when fetching the config in the disks stage, but filesystems aren't mounted until before the files stage. That's surmountable, but it also makes metal different from cloud, which is not ideal. On bare metal I think we can probably assume reasonably current hardware where RDRAND will exist.
I think basically we need to trust the hypervisor. What's the value of TLS when talking to e.g. the EC2 metadata server? I seriously doubt that traffic has a chance of being intercepted.
And particularly in the qemu case where we pull the config out of read-only data provided directly by the hypervisor...reading an entropy key wouldn't involve any TLS at all right?
Either case assumes that the entropy is stored persistently in the instance metadata, but that's not especially secure. For example, on EC2: by default any process on an instance can fetch its userdata, including inside a container, and userdata is also available via the EC2 API.
I think https://github.com/coreos/ignition/issues/645#issuecomment-433435477 is probably the right approach, and it's also what systemd does. I'm really not in favor of an Ignition option that's easy to misuse with nigh-undetectable security consequences.
Reopening since I'd still like to consider this. Today in the machine-config-operator we have a "pointer" ignition config which just includes a pair of (CA, real Ignition url). And today, the MCS always dynamically generates that second (pointed-to) config. We're in a position to provide strong entropy to nodes early in the boot process.
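To make the "dynamically generated" part concrete, here is a sketch of a server stamping a fresh seed into each config it renders. The `ignition.security.entropy` key is the hypothetical key from this proposal and does not exist in any Ignition spec; the function name and the 64-byte seed size are also just assumptions.

```python
import base64
import json
import secrets


def render_config_with_entropy(base_config: dict, seed_bytes: int = 64) -> dict:
    """Return a copy of base_config carrying a fresh, per-request seed."""
    config = json.loads(json.dumps(base_config))  # cheap deep copy
    seed = secrets.token_bytes(seed_bytes)        # new seed on every call
    security = config.setdefault("ignition", {}).setdefault("security", {})
    # Hypothetical key: not part of any published Ignition spec.
    security["entropy"] = base64.b64encode(seed).decode("ascii")
    return config
```

This is only safe if each rendered config is served once, to one machine, over an authenticated channel; caching or reusing the output would hand the same seed to several nodes.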
I'm really not in favor of an Ignition option that's easy to misuse with nigh-undetectable security consequences.
I think if we document that users shouldn't use it with static configurations, that'd probably be enough. I mean, there's lots of other dangerous things one can do in Ignition with systemd units too.
That said of course, https://lwn.net/Articles/802360/ will eventually make this a lot less bad.
And for sure in the MCO we could instead do something like write /var/lib/systemd/random-seed via Ignition and then trigger loading it as strong entropy once we're in the real root if we detect it's first boot. But, it seems nicer if Ignition had explicit support for this, as we'd have the entropy loaded before switching root.
I think if we document that users shouldn't use it with static configurations, that'd probably be enough. I mean, there's lots of other dangerous things one can do in Ignition with systemd units too.
With any other Ignition feature, there's a straightforward way to use it which is correct, and the incorrect ways often blow up immediately. This feature can only be used correctly in a narrow set of circumstances, adds non-obvious complications to the cluster's threat model, and is subtly dangerous if used incorrectly.
This issue hasn't seen any traffic for a while, and I still think it's too obscure and dangerous to implement. I'll go ahead and close this out.
I was looking at an OpenShift install recently and noticed in the consoles that random: crng done was quite late - ~60s after boot. Now in this case I think it's a bug that the Terraform provider doesn't provide a virtio-rand device.
There was an issue I thought was against Ignition recently, but I can't find it, about not having entropy before it tries to speak https:// early in the initramfs.
Anyways, here's the proposal; we add a security/entropy key that is a string, and Ignition would do RNDADDENTROPY on it.
With the OpenShift installer, since the Ignition configs are generated from a client machine at first, and then later by a machine config operator - we're in a position where we can propagate entropy from the client all the way to nodes.
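For reference, the RNDADDENTROPY step in the proposal amounts to something like the sketch below. Ignition itself is written in Go, so this Python version is purely illustrative; the helper names are made up, and the ioctl number is the value of `_IOW('R', 0x03, int[2])` on x86 and arm, which is an assumption about the target architecture.

```python
import fcntl
import struct

# _IOW('R', 0x03, int[2]) from <linux/random.h>, as encoded on x86 and arm.
RNDADDENTROPY = 0x40085203


def pack_rand_pool_info(seed: bytes, entropy_bits: int) -> bytes:
    """Build struct rand_pool_info: entropy_count (bits), buf_size (bytes),
    then the seed bytes themselves."""
    if not 0 <= entropy_bits <= 8 * len(seed):
        raise ValueError("entropy_bits cannot exceed the seed length in bits")
    return struct.pack("ii", entropy_bits, len(seed)) + seed


def credit_entropy(seed: bytes, entropy_bits: int,
                   device: str = "/dev/random") -> None:
    """Mix seed into the kernel pool and credit it as entropy.
    Requires CAP_SYS_ADMIN; payloads over 1024 bytes are rejected by
    Python's fcntl.ioctl when passed as immutable bytes."""
    payload = pack_rand_pool_info(seed, entropy_bits)
    with open(device, "wb") as dev:
        fcntl.ioctl(dev, RNDADDENTROPY, payload)
```

Note the difference from simply writing to /dev/urandom: a plain write mixes the data in without crediting it, while this ioctl tells the kernel to trust the bytes, which is why a reused or leaked seed is dangerous.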
Now personally, I think it's broken for the hypervisor to not provide a random seed. This issue will also sort of solve itself over time as everyone upgrades to hardware with RDRAND and people enable the kernel config option to trust it, but even in that world, it's not going to hurt to add further additional entropy at system bootstrap time.