At some point soon, default kernel jitter entropy will land (in both Fedora and RHEL), which helps mitigate the issues here a lot too. But I think we should still seed strong entropy from the control plane or similar.
In OSPP we are requiring rngd, which has a userspace implementation of the jitterentropy source. This is a temporary fix until the jitterentropy source is merged into the kernel.
Also, if there is a need to inject entropy, there is /lib64/libjitterentropy.so which is a user space implementation of the jitterentropy source. Its properties are documented here: https://www.chronox.de/jent/doc/CPU-Jitter-NPTRNG.html
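For reference, a minimal sketch of pulling bytes from that userspace jitter source, assuming the jitterentropy-library API documented at the link above (jent_entropy_init / jent_entropy_collector_alloc / jent_read_entropy); treat it as an illustration rather than the exact rngd code path:

```c
#include <stdio.h>
#include <sys/types.h>
#include <jitterentropy.h>   /* jitterentropy-library; link with -ljitterentropy */

int main(void)
{
    char buf[32];

    /* Self-test: verify this CPU's timer behaves well enough for the jitter source. */
    if (jent_entropy_init() != 0) {
        fputs("jitter entropy self-test failed on this CPU\n", stderr);
        return 1;
    }

    struct rand_data *ec = jent_entropy_collector_alloc(1 /* oversampling rate */, 0 /* flags */);
    if (ec == NULL)
        return 1;

    ssize_t n = jent_read_entropy(ec, buf, sizeof(buf));
    jent_entropy_collector_free(ec);
    if (n < 0)
        return 1;

    printf("collected %zd bytes of jitter entropy\n", n);
    return 0;
}
```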
Have the MCS inject entropy into the Ignition served to nodes (including bootstrap → masters)
That looks like virtio-rng, so yes, injecting entropy into the nodes is a good idea, as that will make pool initialisation faster (and more entropy sources won't hurt, so it's good defence in depth).
Have the MCD periodically ask the control plane for some additional entropy (possibly MCD monitors /proc/sys/kernel/random/entropy_avail)
No application should use /dev/random; applications should either use /dev/urandom or getentropy(3). Any correct application will therefore not be affected by "entropy running out" – once an RNG is initialised it stays initialised and secure, and it can generate gigabytes upon gigabytes of random bits without reseeding and without compromising security.
That being said, some actions, like restoring machine memory from disk or cloning a machine, do require injecting new entropy, as otherwise they may generate the same bits, which in some cases (ECDSA or DSA signatures) would be catastrophic. But to protect against that, the entropy needs to be injected as soon as possible after restart, not periodically.
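For illustration, a minimal sketch of the getentropy(3) interface recommended above (glibc 2.25+); it fills a buffer from the kernel CSPRNG and only ever waits for the one-time pool initialisation:

```c
#include <stdio.h>
#include <unistd.h>   /* getentropy(3), glibc >= 2.25 */

int main(void)
{
    unsigned char key[32];

    /* Blocks only until the kernel pool has been initialised once;
     * after that it never "runs out" of entropy. */
    if (getentropy(key, sizeof(key)) != 0) {
        perror("getentropy");
        return 1;
    }

    for (size_t i = 0; i < sizeof(key); i++)
        printf("%02x", key[i]);
    putchar('\n');
    return 0;
}
```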
Why re-implement rngd in the MachineConfigServer/Daemon though?
This issue is about having the cluster manage entropy by default in a better way. We can make better defaults for OpenShift/RHCOS because we're a cluster, not a single node. It's not about reimplementing rngd.
To restate the original post here in a different way - even if one has strong local entropy sources, I think it's still a good idea to also mix in entropy from trusted sources like the control plane.
even if one has strong local entropy sources, I think it's still a good idea to also mix in entropy from trusted sources like the control plane.
Indeed, that is a good idea. NIST has a standard for that called entropy as a service. I always thought cloud-init would be an ideal use case for it. But mixing in entropy from other sources is fine.
I always thought cloud-init would be an ideal use case for it.
"CoreOS" here is basically the pairing of (Ignition, OSTree). For us, Ignition takes over the role of cloud-init and Kickstart, providing a uniform way to provision both bare metal and cloud instances.
OpenShift 4 relies heavily on this to keep our installation uniform across metal and cloud.
Hence, if we improve Ignition we're improving both cases. cloud-init is not relevant for CoreOS systems.
(The fact that Ignition runs in the initramfs, before any services like sshd-keygen@.service, is highly relevant here.)
Entropy as a service is described here: https://tsapps.nist.gov/publication/get_pdf.cfm?pub_id=920992 This is simply to back up that mixing entropy from other systems is an approved cryptographic practice, with some discussion of attacks.
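As a concrete (hypothetical) sketch of "mixing in entropy from other sources": bytes fetched from a trusted source such as the control plane can simply be written to /dev/urandom, which mixes them into the kernel pool without crediting any entropy (per random(4)):

```c
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/* Mix externally supplied bytes into the kernel pool. Writing to
 * /dev/urandom mixes the bytes in but does not credit entropy;
 * crediting requires the RNDADDENTROPY ioctl (see random(4)). */
int mix_in(const unsigned char *buf, size_t len)
{
    int fd = open("/dev/urandom", O_WRONLY);
    if (fd < 0)
        return -1;

    ssize_t n = write(fd, buf, len);
    close(fd);
    return (n == (ssize_t)len) ? 0 : -1;
}
```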
Have we gathered any data on the entropy available on nodes? Would be good to know where this issue stands between "purely theoretical", "reasonable security concern", and "causing problems now, beyond crypto".
/aside My puppies would like to volunteer as an excellent "Entropy as a service" provider 😄. Our CI system would also love to help by providing our test pass rates as a source, but doesn't know if it can be available when needed. 😬
Have we gathered any data on the entropy available on nodes? Would be good to know where this issue stands between "purely theoretical", "reasonable security concern", and "causing problems now, beyond crypto".
getrandom() will block until the entropy pool is initialised; that's why we need rngd early in boot, to initialise the pool as soon as possible. So it is causing problems now.
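A small sketch of that blocking behaviour: with GRND_NONBLOCK, getrandom(2) reports EAGAIN instead of blocking while the pool is still uninitialised, which is a cheap way to probe whether a node would hang here:

```c
#include <errno.h>
#include <stdio.h>
#include <sys/random.h>   /* getrandom(2), glibc >= 2.25 */

int main(void)
{
    unsigned char byte;

    /* GRND_NONBLOCK: fail with EAGAIN instead of blocking if the
     * entropy pool has not been initialised yet. */
    if (getrandom(&byte, 1, GRND_NONBLOCK) < 0 && errno == EAGAIN) {
        puts("pool not yet initialised; getrandom() would block here");
        return 1;
    }
    puts("pool initialised");
    return 0;
}
```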
My puppies would like to volunteer as an excellent "Entropy as a service" provider 😄
https://www.cloudflare.com/leagueofentropy/ has a cool web page at least...
Have we gathered any data on the entropy available on nodes?
Prometheus has a metric for it actually.
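For anyone who wants to spot-check a node directly, that metric is (I believe) just a read of /proc/sys/kernel/random/entropy_avail; a minimal sketch:

```c
#include <stdio.h>

int main(void)
{
    FILE *f = fopen("/proc/sys/kernel/random/entropy_avail", "r");
    int bits;

    if (f == NULL) {
        perror("entropy_avail");
        return 1;
    }
    if (fscanf(f, "%d", &bits) != 1) {
        fclose(f);
        return 1;
    }
    fclose(f);

    /* Estimated entropy currently in the kernel input pool, in bits. */
    printf("entropy available: %d bits\n", bits);
    return 0;
}
```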
However, it's a highly platform-specific discussion. In AWS for example, as best I can tell, new hardware has RDRAND, and nowadays you're likely to get that. But...that's far from a guarantee. And AWS also has no official entropy story from either Xen or KVM as best I can tell.
GCP I thought was going to add the same KVM virtio-rng driver, but I'm not seeing it in the coreosci 4.2 cluster. But the hardware has RDRAND.
A notable current difference between Fedora 30+ and RHEL 8.0, at least, is that Fedora 30+ has CONFIG_RANDOM_TRUST_CPU=y, but RHEL 8.0+ doesn't.
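Since the RDRAND question keeps coming up per-platform, here is a small sketch of checking for it from userspace via CPUID (leaf 1, ECX bit 30), using GCC's <cpuid.h>:

```c
#include <cpuid.h>
#include <stdio.h>

int main(void)
{
    unsigned int eax, ebx, ecx, edx;

    /* CPUID leaf 1: ECX bit 30 (bit_RDRND) advertises the RDRAND instruction. */
    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx)) {
        puts("CPUID leaf 1 not available");
        return 1;
    }
    printf("RDRAND: %s\n", (ecx & bit_RDRND) ? "available" : "not available");
    return 0;
}
```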
Hm, I bet a simple change (as opposed to https://github.com/coreos/ignition/issues/653) would be to have the MCS inject /var/lib/systemd/random-seed. That'd probably Just Work as far as getting more entropy early in the real boot without any other changes to the OS, even for starting old RHCOS bootimages.
@cgwalters /var/lib/systemd/random-seed – because that can be shared between boots (disk cloning), its randomness is not credited to the pool: it is mixed in, but it does not increase the gathered entropy for the purpose of pool initialisation.
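For contrast, actually crediting entropy to the pool (what rngd does, and what systemd can do for the seed file when told to credit it) goes through the RNDADDENTROPY ioctl on /dev/random, which requires CAP_SYS_ADMIN; a minimal sketch:

```c
#include <fcntl.h>
#include <linux/random.h>   /* struct rand_pool_info, RNDADDENTROPY */
#include <stdlib.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>

/* Mix buf into the kernel pool AND credit it as entropy.
 * Requires CAP_SYS_ADMIN; claims 8 bits of entropy per byte, which is
 * only appropriate if the bytes really are full-entropy. */
int credit_entropy(const unsigned char *buf, size_t len)
{
    struct rand_pool_info *info = malloc(sizeof(*info) + len);
    if (info == NULL)
        return -1;

    info->entropy_count = (int)(len * 8);
    info->buf_size = (int)len;
    memcpy(info->buf, buf, len);

    int fd = open("/dev/random", O_WRONLY);
    if (fd < 0) {
        free(info);
        return -1;
    }

    int r = ioctl(fd, RNDADDENTROPY, info);
    close(fd);
    free(info);
    return r;
}
```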
because that can be shared between boots (disk cloning),
We don't want people doing that with CoreOS systems. Bigger picture, I think OSTree replaces the "golden image" use cases for disk cloning.
In OpenShift in particular, we have extensive infrastructure for scaling nodes via machine-api - no one should be cloning disks.
I'm tempted to turn on SYSTEMD_RANDOM_SEED_CREDIT in {F,RH}COS.
that's why we need rngd early in boot, to initialise it as soon as possible
OK so... to aim to close off the subthread that started this though: we should add rng-tools to RHCOS as a short-term workaround until the RHEL kernel merges jitter entropy? I'm OK with that.
(I believe rng-tools also pulls entropy from the TPM, but that seems like another weird case that should happen in-kernel by default)
I don't think we need to add it to FCOS because Fedora should pretty quickly update to Linux 5.4.
Looks like RHEL 8.3 will backport jitter entropy: https://bugzilla.redhat.com/show_bug.cgi?id=1778762
BTW, just a note here I think is useful: before the kernel jitter entropy landed, you'd see messages from this code:
pr_notice("%s: uninitialized urandom read (%zd bytes read)\n",
current->comm, nbytes);
e.g. in RHCOS:
[ 1.311072] random: modprobe: uninitialized urandom read (16 bytes read)
But in current FCOS (w/jitter entropy patch) I no longer see those because the code path is basically unreachable AFAICS.
because that can be shared between boots (disk cloning),
We don't want people doing that with CoreOS systems. Bigger picture, I think OSTree replaces the "golden image" use cases for disk cloning.
In OpenShift in particular, we have extensive infrastructure for scaling nodes via machine-api - no one should be cloning disks.
Disk cloning can happen not only because somebody is cloning the machine: restoring from backup, running the system in VMs and restoring snapshots... all will cause the seed file to be reused.
Disk cloning can happen not only because somebody is cloning the machine: restoring from backup, running the system in VMs and restoring snapshots... all will cause the seed file to be reused.
Eh. Again though I don't think that type of "pet system" stuff is something people should be doing with OpenShift/CoreOS. A huge part of the idea here is that by defining your configs in Ignition/MachineConfig you can restore the system using that.
That said, a random idea here is to encrypt the entropy with the TPM (if available, or, much less strong, use /var/lib/systemd/$somehardwareid, where $somehardwareid might be e.g. the AWS instance identity or, for bare metal, something based on dmidecode). But at this point this is a systemd discussion and not an MCO one (mostly, except that today systemd doesn't have much "cloud specific" stuff, so logic for handling anything like entropy in a platform-specific mechanism might land in CoreOS or the MCO).
Issues go stale after 90d of inactivity.
Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.
If this issue is safe to close now please do so with /close.
/lifecycle stale
Stale issues rot after 30d of inactivity.
Mark the issue as fresh by commenting /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
Exclude this issue from closing by commenting /lifecycle frozen.
If this issue is safe to close now please do so with /close.
/lifecycle rotten /remove-lifecycle stale
Rotten issues close after 30d of inactivity.
Reopen the issue by commenting /reopen.
Mark the issue as fresh by commenting /remove-lifecycle rotten.
Exclude this issue from closing again by commenting /lifecycle frozen.
/close
@openshift-bot: Closing this issue.
If you have RDRAND most use cases are going to be fine for entropy, but we can do better for the cases that don't, and we also want to mitigate epic hardware bugs.
Two ideas:
- Have the MCS inject entropy into the Ignition served to nodes (including bootstrap → masters)
- Have the MCD periodically ask the control plane for some additional entropy (possibly MCD monitors /proc/sys/kernel/random/entropy_avail)