At some point soon, default kernel jitter entropy will land (in both Fedora and RHEL), which helps mitigate the issues here a lot too. But I think we should still seed strong entropy from the control plane or similar.
In OSPP we are requiring rngd, which has a userspace implementation of the jitterentropy source. This is a temporary fix until the jitterentropy source is merged into the kernel.
Also, if there is a need to inject entropy, there is /lib64/libjitterentropy.so which is a user space implementation of the jitterentropy source. Its properties are documented here: https://www.chronox.de/jent/doc/CPU-Jitter-NPTRNG.html
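For reference, a minimal sketch of pulling bytes from that userspace jitter source, assuming the jitterentropy-library API documented at the link above (jent_entropy_init / jent_entropy_collector_alloc / jent_read_entropy); treat it as an illustration rather than the exact rngd code path:

```c
#include <stdio.h>
#include <sys/types.h>
#include <jitterentropy.h>   /* jitterentropy-library; link with -ljitterentropy */

int main(void)
{
    char buf[32];

    /* Self-test: verify this CPU's timer behaves well enough for the jitter source. */
    if (jent_entropy_init() != 0) {
        fputs("jitter entropy self-test failed on this CPU\n", stderr);
        return 1;
    }

    struct rand_data *ec = jent_entropy_collector_alloc(1 /* oversampling rate */, 0 /* flags */);
    if (ec == NULL)
        return 1;

    ssize_t n = jent_read_entropy(ec, buf, sizeof(buf));
    jent_entropy_collector_free(ec);
    if (n < 0)
        return 1;

    printf("collected %zd bytes of jitter entropy\n", n);
    return 0;
}
```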
Have the MCS inject entropy into the Ignition served to nodes (including bootstrap → masters)
That looks like virtio-rng, so yes, injecting entropy into the nodes is a good idea, as that will make pool initialisation faster (and more entropy sources won't hurt, so it's good defence in depth).
Have the MCD periodically ask the control plane for some additional entropy (possibly MCD monitors /proc/sys/kernel/random/entropy_avail)
No application should use /dev/random; applications should either use /dev/urandom or getentropy(3). Any correct application will therefore not be affected by "entropy running out" – once an RNG is initialised it stays initialised and secure, and it can generate gigabytes upon gigabytes of random bits without reseeding and without compromising security.
That being said, some actions, like restoring machine memory from disk or cloning a machine, do require injecting new entropy, as otherwise they may generate the same bits, which in some cases (ECDSA or DSA signatures) would be catastrophic. But to protect against that, the entropy needs to be injected as soon as possible after restart, not periodically.
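For illustration, a minimal sketch of the getentropy(3) interface recommended above (glibc 2.25+); it fills a buffer from the kernel CSPRNG and only ever waits for the one-time pool initialisation:

```c
#include <stdio.h>
#include <unistd.h>   /* getentropy(3), glibc >= 2.25 */

int main(void)
{
    unsigned char key[32];

    /* Blocks only until the kernel pool has been initialised once;
     * after that it never "runs out" of entropy. */
    if (getentropy(key, sizeof(key)) != 0) {
        perror("getentropy");
        return 1;
    }

    for (size_t i = 0; i < sizeof(key); i++)
        printf("%02x", key[i]);
    putchar('\n');
    return 0;
}
```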
Why re-implement rngd in the MachineConfigServer/Daemon though?
This issue is about having the cluster manage entropy by default in a better way. We can make better defaults for OpenShift/RHCOS because we're a cluster, not a single node. It's not about reimplementing rngd.
To restate the original post here in a different way - even if one has strong local entropy sources, I think it's still a good idea to also mix in entropy from trusted sources like the control plane.
even if one has strong local entropy sources, I think it's still a good idea to also mix in entropy from trusted sources like the control plane.
Indeed, that is a good idea. NIST has a standard for that called entropy as a service. I always thought cloud-init would be an ideal use case for it. But mixing in entropy from other sources is fine.
I always thought cloud-init would be an ideal use case for it.
"CoreOS" here is basically the pairing of (Ignition, OSTree). For us, Ignition takes over the role of cloud-init and Kickstart, providing a uniform way to provision both bare metal and cloud instances.
OpenShift 4 relies heavily on this to keep our installation uniform across metal and cloud.
Hence, if we improve Ignition we're improving both cases. cloud-init is not relevant for CoreOS systems.
(The fact that Ignition runs in the initramfs, before any services like sshd-keygen@.service, is highly relevant here.)
Entropy as a service is described here: https://tsapps.nist.gov/publication/get_pdf.cfm?pub_id=920992 This is simply to back up that mixing entropy from other systems is an approved cryptographic practice, with some discussion of attacks.
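As a concrete (hypothetical) sketch of "mixing in entropy from other sources": bytes fetched from a trusted source such as the control plane can simply be written to /dev/urandom, which mixes them into the kernel pool without crediting any entropy (per random(4)):

```c
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/* Mix externally supplied bytes into the kernel pool. Writing to
 * /dev/urandom mixes the bytes in but does not credit entropy;
 * crediting requires the RNDADDENTROPY ioctl (see random(4)). */
int mix_in(const unsigned char *buf, size_t len)
{
    int fd = open("/dev/urandom", O_WRONLY);
    if (fd < 0)
        return -1;

    ssize_t n = write(fd, buf, len);
    close(fd);
    return (n == (ssize_t)len) ? 0 : -1;
}
```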
Have we gathered any data on the entropy available on nodes? Would be good to know where this issue stands between "purely theoretical", "reasonable security concern", and "causing problems now, beyond crypto".
/aside My puppies would like to volunteer as an excellent "Entropy as a service" provider 😄. Our CI system would also love to help by providing our test pass rates as a source, but doesn't know if it can be available when needed. 😬
Have we gathered any data on the entropy available on nodes? Would be good to know where this issue stands between "purely theoretical", "reasonable security concern", and "causing problems now, beyond crypto".
getrandom() will block until the entropy pool is initialised; that's why we need rngd early in boot, to initialise the pool as soon as possible. So it is causing problems now.
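A small sketch of that blocking behaviour: with GRND_NONBLOCK, getrandom(2) reports EAGAIN instead of blocking while the pool is still uninitialised, which is a cheap way to probe whether a node would hang here:

```c
#include <errno.h>
#include <stdio.h>
#include <sys/random.h>   /* getrandom(2), glibc >= 2.25 */

int main(void)
{
    unsigned char byte;

    /* GRND_NONBLOCK: fail with EAGAIN instead of blocking if the
     * entropy pool has not been initialised yet. */
    if (getrandom(&byte, 1, GRND_NONBLOCK) < 0 && errno == EAGAIN) {
        puts("pool not yet initialised; getrandom() would block here");
        return 1;
    }
    puts("pool initialised");
    return 0;
}
```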
My puppies would like to volunteer as an excellent "Entropy as a service" provider 😄
https://www.cloudflare.com/leagueofentropy/ has a cool web page at least...
Have we gathered any data on the entropy available on nodes?
Prometheus has a metric for it actually.
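For anyone who wants to spot-check a node directly, that metric is (I believe) just a read of /proc/sys/kernel/random/entropy_avail; a minimal sketch:

```c
#include <stdio.h>

int main(void)
{
    FILE *f = fopen("/proc/sys/kernel/random/entropy_avail", "r");
    int bits;

    if (f == NULL) {
        perror("entropy_avail");
        return 1;
    }
    if (fscanf(f, "%d", &bits) != 1) {
        fclose(f);
        return 1;
    }
    fclose(f);

    /* Estimated entropy currently in the kernel input pool, in bits. */
    printf("entropy available: %d bits\n", bits);
    return 0;
}
```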
However, it's a highly platform-specific discussion. In AWS for example, as best I can tell, new hardware has RDRAND, and nowadays you're likely to get that. But...that's far from a guarantee. And AWS also has no official entropy story from either Xen or KVM as best I can tell.
GCP I thought was going to add the same KVM virtio-rng driver, but I'm not seeing it in the coreosci 4.2 cluster. But the hardware has RDRAND.
A notable current difference between Fedora 30+ and RHEL 8.0, at least, is that Fedora 30+ has CONFIG_RANDOM_TRUST_CPU=y, but RHEL 8.0+ doesn't.
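Since the RDRAND question keeps coming up per-platform, here is a small sketch of checking for it from userspace via CPUID (leaf 1, ECX bit 30), using GCC's <cpuid.h>:

```c
#include <cpuid.h>
#include <stdio.h>

int main(void)
{
    unsigned int eax, ebx, ecx, edx;

    /* CPUID leaf 1: ECX bit 30 (bit_RDRND) advertises the RDRAND instruction. */
    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx)) {
        puts("CPUID leaf 1 not available");
        return 1;
    }
    printf("RDRAND: %s\n", (ecx & bit_RDRND) ? "available" : "not available");
    return 0;
}
```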
Hm, I bet a simple change (as opposed to https://github.com/coreos/ignition/issues/653) would be to have the MCS inject /var/lib/systemd/random-seed. That'd probably Just Work as far as getting more entropy early in the real boot without any other changes to the OS, even for starting old RHCOS bootimages.
@cgwalters /var/lib/systemd/random-seed – because that can be shared between boots (disk cloning), its randomness is not credited to the pool: it is mixed in, but it does not increase the gathered entropy for the purpose of pool initialisation.
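For contrast, actually crediting entropy to the pool (what rngd does, and what systemd can do for the seed file when told to credit it) goes through the RNDADDENTROPY ioctl on /dev/random, which requires CAP_SYS_ADMIN; a minimal sketch:

```c
#include <fcntl.h>
#include <linux/random.h>   /* struct rand_pool_info, RNDADDENTROPY */
#include <stdlib.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>

/* Mix buf into the kernel pool AND credit it as entropy.
 * Requires CAP_SYS_ADMIN; claims 8 bits of entropy per byte, which is
 * only appropriate if the bytes really are full-entropy. */
int credit_entropy(const unsigned char *buf, size_t len)
{
    struct rand_pool_info *info = malloc(sizeof(*info) + len);
    if (info == NULL)
        return -1;

    info->entropy_count = (int)(len * 8);
    info->buf_size = (int)len;
    memcpy(info->buf, buf, len);

    int fd = open("/dev/random", O_WRONLY);
    if (fd < 0) {
        free(info);
        return -1;
    }

    int r = ioctl(fd, RNDADDENTROPY, info);
    close(fd);
    free(info);
    return r;
}
```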
because that can be shared between boots (disk cloning),
We don't want people doing that with CoreOS systems. Bigger picture, I think OSTree replaces the "golden image" use cases for disk cloning.
In OpenShift in particular, we have extensive infrastructure for scaling nodes via machine-api - no one should be cloning disks.
I'm tempted to turn on SYSTEMD_RANDOM_SEED_CREDIT in {F,RH}COS.
that's why we need rngd early in boot, to initialise it as soon as possible
OK so... to aim to close off the subthread that started this though: we should add rng-tools to RHCOS as a short-term workaround until the RHEL kernel merges jitter entropy? I'm OK with that.
(I believe rng-tools also pulls entropy from the TPM, but that seems like another weird case that should happen in-kernel by default)
I don't think we need to add it to FCOS because Fedora should pretty quickly update to Linux 5.4.
Looks like RHEL 8.3 will backport jitter entropy: https://bugzilla.redhat.com/show_bug.cgi?id=1778762
BTW, just a note here I think is useful: before the kernel jitter entropy landed, you'd see messages from this code:
pr_notice("%s: uninitialized urandom read (%zd bytes read)\n",
current->comm, nbytes);
e.g. in RHCOS:
[ 1.311072] random: modprobe: uninitialized urandom read (16 bytes read)
But in current FCOS (w/jitter entropy patch) I no longer see those because the code path is basically unreachable AFAICS.
because that can be shared between boots (disk cloning),
We don't want people doing that with CoreOS systems. Bigger picture, I think OSTree replaces the "golden image" use cases for disk cloning.
In OpenShift in particular, we have extensive infrastructure for scaling nodes via machine-api - no one should be cloning disks.
Disk cloning can happen not only because somebody is cloning the machine: restoring from backup, running the system in VMs and restoring snapshots... all will cause the seed file to be reused.
Disk cloning can happen not only because somebody is cloning the machine: restoring from backup, running the system in VMs and restoring snapshots... all will cause the seed file to be reused.
Eh. Again though I don't think that type of "pet system" stuff is something people should be doing with OpenShift/CoreOS. A huge part of the idea here is that by defining your configs in Ignition/MachineConfig you can restore the system using that.
That said, a random idea here is to encrypt the entropy with the TPM (if available, or, much less strong, use /var/lib/systemd/$somehardwareid, where $somehardwareid might be e.g. the AWS instance identity or, for bare metal, something based on dmidecode). But at this point this is a systemd discussion and not an MCO one (mostly, except that today systemd doesn't have much "cloud specific" stuff, so logic for handling anything like entropy in a platform-specific mechanism might land in CoreOS or the MCO).
Issues go stale after 90d of inactivity.
Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.
If this issue is safe to close now please do so with /close.
/lifecycle stale
Stale issues rot after 30d of inactivity.
Mark the issue as fresh by commenting /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
Exclude this issue from closing by commenting /lifecycle frozen.
If this issue is safe to close now please do so with /close.
/lifecycle rotten /remove-lifecycle stale
Rotten issues close after 30d of inactivity.
Reopen the issue by commenting /reopen.
Mark the issue as fresh by commenting /remove-lifecycle rotten.
Exclude this issue from closing again by commenting /lifecycle frozen.
/close
@openshift-bot: Closing this issue.
If you have RDRAND most use cases are going to be fine for entropy, but we can do better for the cases that don't, and we also want to mitigate epic hardware bugs.
Two ideas:
- Have the MCS inject entropy into the Ignition served to nodes (including bootstrap → masters)
- Have the MCD periodically ask the control plane for some additional entropy (possibly MCD monitors /proc/sys/kernel/random/entropy_avail)