QubesOS / qubes-issues

The Qubes OS Project issue tracker
https://www.qubes-os.org/doc/issue-tracking/

Improve entropy collection in VMs #673

Closed marmarek closed 2 years ago

marmarek commented 9 years ago

Reported by joanna on 15 Nov 2012 19:03 UTC While this is only my feeling, I suspect that the entropy collection daemon in our VMs needs some improvements. This is because each VM has only limited interaction with the physical world (e.g. mouse events go via vchan instead of via a kernel module in the VM).

This can be easily noticed when one tries to generate a new GPG key in a VM -- gpg complains that not enough entropy is available and hangs until more is produced. One can produce more entropy via various disk activities (e.g. grep through the filesystem); however, 1) this isn't very convenient, and 2) it's questionable whether such entropy is of "first-class freshness" or somehow inferior to the entropy that could be collected with the help of mouse movements, etc.

It would probably be desirable to create some entropy producing device that would run in each of the VMs, and to feed this device from Dom0 or other domains exposed to physical hardware (netvm, usbvm?). One should be careful, however, not to distribute the same "entropy bits" to more than one domain, as this would likely compromise domain isolation.

Migrated-From: https://wiki.qubes-os.org/ticket/673

marmarek commented 9 years ago

Comment by joanna on 15 Nov 2012 23:25 UTC Ok, I see two simple solutions:

1) We run a set of daemons in Dom0 (one for each VM) that essentially do this in a loop:

read_a_chunk_of_bytes (/dev/random);
send_bytes_to_VM(); // via qrexec
sleep (...); // let others read some of Dom0's entropy too

Then, in the VM, there is code that reads the transmitted bytes and feeds them into the kernel's RNG using the RNDADDENTROPY ioctl on /dev/random:

http://git.kernel.org/?p=linux/kernel/git/stable/linux-stable.git;a=blob;f=drivers/char/random.c;h=b86eae9b77dfaeb04dd2d4efefd6ebc01b9e0a93;hb=HEAD#l1265

2) We just enable haveged in each VM (it gathers entropy from measuring internal CPU state):

http://www.issihosts.com/haveged/index.html http://www.irisa.fr/caps/projects/hipsor/

Note 1: haveged is incredibly fast! It just seems to be a bit TOO fast for me... So I think I would feel better with option #1...

Note 2: Dom0 entropy seems pretty reasonable (thanks to mouse and keyboard!), so it's not unreasonable to share it among all the VMs. But perhaps we could allow manually excluding some VMs from getting entropy from Dom0 (e.g. those that are not very sensitive). E.g. I have almost 30 domains on my laptop, while only about 4 are used for key generation, and those are the only ones that need fresh entropy from Dom0.
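
The dom0 side of option 1 could be sketched roughly as follows. This is only an illustration, not Qubes code: `send_to_vm` is a hypothetical callable standing in for the qrexec transport, and the chunk size and pause are arbitrary. Each VM gets its own freshly read chunk, so the same bytes are never handed to two domains.

```python
import time
from typing import BinaryIO, Callable, Iterable


def feed_vms(
    source: BinaryIO,
    vm_names: Iterable[str],
    send_to_vm: Callable[[str, bytes], None],  # hypothetical qrexec wrapper
    chunk_size: int = 64,
    pause_seconds: float = 0.0,
) -> int:
    """Read one fresh chunk per VM and hand it to the transport.

    Returns the number of chunks that were sent.
    """
    sent = 0
    for name in vm_names:
        chunk = source.read(chunk_size)
        if len(chunk) < chunk_size:
            break  # short read or exhausted source: never send partial data
        send_to_vm(name, chunk)
        sent += 1
        time.sleep(pause_seconds)  # let other readers use dom0's entropy too
    return sent
```

In dom0 the source would presumably be opened unbuffered, e.g. `open("/dev/random", "rb", buffering=0)`, so that no more bytes are consumed than are actually sent.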

marmarek commented 9 years ago

Comment by joanna on 21 Nov 2012 10:25 UTC Some comments are in this thread:

https://groups.google.com/group/qubes-devel/browse_thread/thread/e7023cca06daa219

marmarek commented 9 years ago

Modified by joanna on 8 Feb 2013 12:59 UTC

marmarek commented 9 years ago

Modified by joanna on 30 Aug 2013 17:21 UTC

marmarek commented 9 years ago

Modified by joanna on 20 Apr 2014 17:04 UTC

marmarek commented 9 years ago

Comment by joanna on 3 Jul 2014 12:07 UTC For now (R2 release) we should just ensure haveged in the default template, I think.

marmarek commented 9 years ago

Comment by marmarek on 4 Jul 2014 02:45 UTC Why can't this be the final solution? I don't believe we will ever implement any other solution for this...

Also, do we want this fix to cover updates to existing templates as well (which would mean a hard dependency on haveged from qubes-core-vm)? Or would installing it in new templates (i.e. on the R2 ISO) be enough?

marmarek commented 9 years ago

Comment by joanna on 4 Jul 2014 09:37 UTC I think just the new template, no need to issue updates. This is not a security problem, but rather a usability one -- i.e. if read() on /dev/random hangs, it's an annoyance to the user.

I agree to closing this ticket with haveged.

marmarek commented 9 years ago

Comment by marmarek on 4 Jul 2014 10:02 UTC http://git.qubes-os.org/?p=marmarek/linux-template-builder.git;a=commit;h=e416c1a5b3b5a97525881d38efc0eef39659b48c

adrelanos commented 9 years ago

From https://wiki.archlinux.org/index.php/Haveged#Virtual_Machines

As discussed at "Is it appropriate to use haveged as a source of entropy on virtual machines?", it can be contested whether haveged provides quality entropy within a virtual environment.


It's not as simple as writing to /dev/random. From man random(4):

This differs from writing to /dev/random or /dev/urandom, which only adds some data but does not increment the entropy count. The following structure is used:


Looks like this could be implemented by reading from /dev/random in dom0, forwarding that entropy to the VMs via qrexec, and adding it there with the RNDADDENTROPY ioctl(2).
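
A minimal sketch of the VM side, assuming the `rand_pool_info` layout from random(4) (an int entropy count in bits, an int buffer size in bytes, then the data). The ioctl number below is the value of `_IOW('R', 0x03, int[2])` on x86 and ARM Linux and is an assumption for other architectures; the function names are made up for illustration.

```python
import fcntl
import struct

# _IOW('R', 0x03, int[2]); holds on x86 and ARM Linux, an assumption elsewhere.
RNDADDENTROPY = 0x40085203


def pack_rand_pool_info(data: bytes, entropy_bits: int) -> bytes:
    """Build the ioctl argument: entropy_count (bits), buf_size (bytes), buf."""
    if not 0 <= entropy_bits <= 8 * len(data):
        raise ValueError("cannot credit more entropy than the data could hold")
    return struct.pack("ii", entropy_bits, len(data)) + data


def add_entropy(data: bytes, entropy_bits: int, device: str = "/dev/random") -> None:
    """Mix data into the kernel pool and credit entropy_bits (requires root)."""
    with open(device, "wb", buffering=0) as dev:
        fcntl.ioctl(dev, RNDADDENTROPY, pack_rand_pool_info(data, entropy_bits))
```

Keeping each call to a few hundred bytes avoids the size limit that older Python versions place on immutable `fcntl.ioctl` arguments.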

marmarek commented 9 years ago

The program attached there returns numbers between 10 and 100 (30-50 on my system), so theoretically it means that VMs have access to real rdtsc. This is on R3rc1 (Xen 4.4.2); not sure about R2 (Xen 4.1.2). I'll check the Xen documentation later - I think I've seen some option for this.


adrelanos commented 8 years ago

https://github.com/mirage/xentropyd

jpouellet commented 7 years ago

xref: https://groups.google.com/d/topic/qubes-devel/6qAR1lTBzn8/discussion

adrelanos commented 6 years ago

haveged is discouraged in VMs by Andre Seznec, one of haveged's main authors. Source: https://github.com/BetterCrypto/Applied-Crypto-Hardening/commit/cf7cef7a870c1b77089b1bd6209ded6525b5a4e0#commitcomment-23006392

The kernel (and possibly other components, which has not been researched) needs entropy before systemd / systemd-random-seed.service / haveged are running. (Wrote a bit about that here https://github.com/QubesOS/qubes-issues/issues/2704#issuecomment-363259815.)

Please reopen. @andrewdavidwong

andrewdavidwong commented 6 years ago

Relevant to this issue is the exchange between @rustybird and @adrelanos in #2704 that starts here.

InnovativeInventor commented 6 years ago

I think that it would be good to note that passing random data directly from dom0 to a vm is a bad idea. Linux's entropy pool is not designed to recover from pool compromise (source). This is a problem if we are seeding compromised vms with random data. Hashing/using a one-way function would protect other vms and dom0 if there is a bad vm receiving dom0's random data.

v6ak commented 6 years ago

 Linux's entropy pool is not designed to recover from pool compromise (source)

The linked article mentions that fast recovery is not a requirement.

This is a problem if we are seeding compromised vms with random data.

It isn't, or at least it shouldn't be. Not leaking anything substantial about its internal state is a basic requirement of RNGs. I believe that output from the RNG is already hashed or somehow else processed to ensure this property. There should be no need for additional hashing.

adrelanos commented 6 years ago

Since Qubes is using HVM now, could we use VirtIO RNG?

https://wiki.qemu.org/Features/VirtIORNG


(As @HulaHoopWhonix suggested... Reworded by me...)

v6ak commented 6 years ago

Qubes 4 primarily uses PVH, not HVM. HVM requires QEMU, which runs in a separate PV domain. This increases memory/CPU requirements and attack surface (consider HVM->QEMU and PV->dom0 exploit chain). Note that both PV and QEMU are considered to be weak (at least by Qubes team), so Qubes tries to limit impact of vulnerabilities there.

HVM currently needs to be used for PCI-enabled VMs (like sys-net and sys-usb) and VMs where the OS requires it (e.g., Windows), because we currently* have no better way to support them.

So no, I don't think virtio can be used with most VMs without decreasing overall security.

*) Once PCI support for PVH (PVHv2?) is implemented, we could get rid of HVM usage for PCI enabled devices. Furthermore, I hope that stubdoms will use PVH one day, which should dramatically reduce the security concern about chained exploitation of QEMU and PV.

marmarek commented 6 years ago

It isn't, or at least it shouldn't be. Not leaking anything substantial about its internal state is a basic requirement of RNGs.

Regardless of the actual state here, the current solution (seeding the VM's RNG from dom0 as one of the boot services) does hash the random data extracted in dom0.
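
The idea of hashing before handing data to a VM can be illustrated like this. It is a sketch only and does not reproduce the actual Qubes boot service: the function names, the use of SHA-512 and the domain-separation label are all assumptions made for the example.

```python
import hashlib
import os


def derive_vm_seed(dom0_random: bytes, vm_name: str) -> bytes:
    """Return a 64-byte seed derived one-way from dom0_random and the VM name."""
    h = hashlib.sha512()
    h.update(b"vm-seed-v1\x00")  # hypothetical domain-separation label
    h.update(vm_name.encode("utf-8") + b"\x00")
    h.update(dom0_random)
    return h.digest()


def fresh_vm_seed(vm_name: str, nbytes: int = 64) -> bytes:
    """Draw new bytes from dom0's CSPRNG and hash them for one VM."""
    return derive_vm_seed(os.urandom(nbytes), vm_name)
```

Under the usual one-wayness assumption for the hash, a VM that receives such a seed learns neither the raw bytes read in dom0 nor the seed given to any other VM.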

adrelanos commented 6 years ago

@HulaHoopWhonix wrote:

jitterentropy-rng should solve this and is a mainline Linux solution that works the same way haveged does

that works the same way haveged does

That is the problem.

The critical quote:

He also pointed out a security warning : with some VMs, the hardware cycles counter is emulated and deterministic, and thus predictable. He therefore does not recommend using HAVEGE on those systems.

...unless we have a good answer to that jitterentropy-rng should not be considered a solution for this issue.

Installing jitterentropy-rng by default may be a good idea anyhow. Created "consider installing jitterentropy-rngd to improve entropy collection" (https://github.com/QubesOS/qubes-issues/issues/4169) for it.

h01ger commented 6 years ago

On Mon, Jul 30, 2018 at 09:31:47PM -0700, Patrick Schleizer wrote:

He also pointed out a security warning : with some VMs, the hardware cycles counter is emulated and deterministic, and thus predictable. He therefore does not recommend using HAVEGE on those systems.

...unless we have a good answer to that jitterentropy-rng should not be considered a solution for this issue.

i dont understand. here ^^^ you wrote it's not a good solution...

Installing jitterentropy-rng by default may be a good idea anyhow. Will create a ticket for it and reference from here.

and ^^^ then you write it's a good idea anyhow. can you please explain?

-- cheers, Holger

adrelanos commented 6 years ago

Holger Levsen:

On Mon, Jul 30, 2018 at 09:31:47PM -0700, Patrick Schleizer wrote:

He also pointed out a security warning : with some VMs, the hardware cycles counter is emulated and deterministic, and thus predictable. He therefore does not recommend using HAVEGE on those systems.

...unless we have a good answer to that jitterentropy-rng should not be considered a solution for this issue.

i dont understand. here ^^^ you wrote it's not a good solution...

Right.

Installing jitterentropy-rng by default may be a good idea anyhow. Will create a ticket for it and reference from here.

and ^^^ then you write it's a good idea anyhow. can you please explain?

Sure. Kernel entropy should never get any worse, even if completely non-random data is added to it (like a continuous stream of zeros). So even if we don't know whether entropy gets improved at all, for everyone or just for some users, we can safely take a bet here - i.e. while there is a chance that installing this package helps, it's not clear it will help (in all cases).

v6ak commented 6 years ago

I got your idea, but I don't agree 100%. There are two potential drawbacks of an additional imperfect RNG:

  1. It might confuse the computer. If (and only if) it increases the entropy pool estimate, it can make things actually worse, especially at boot.
  2. It might confuse people. Someone might guess the RNG is there for a good reason and there is no reason to add a proper one. While some documentation might help, I see some risk of someone feeling no reason for even looking for the documentation, because this case might look so obvious.

And there is one more point to potentially disagree on: As far as I remember, someone from Qubes core team has somewhere explained that haveged is OK with current Xen config in QubesOS. That was probably around the time of R3.0, but I hope it still holds. (Well, I hope it has not changed with the replacement of PVs by HVMs and PVHv2s.) While I haven't done any deeper research on the RNG you have suggested, the fact that it reportedly works the same way as haveged gives hope that it is a suitable RNG for QubesOS.

adrelanos commented 6 years ago

@v6ak:

As far as I remember, someone from Qubes core team has somewhere explained that haveged is OK with current Xen config in QubesOS.

I've read all on the subject since this is one of the most interesting and crucial subjects, and can't remember ever having read something like that.

Even if that was the case, that was likely before I posted this:

And before I posted above [1] [2], I casually talked about this with Joanna in 2015 or so. She wondered how haveged could be that fast at generating entropy. It didn't sound like anyone had ever researched the RNG.

@v6ak:

I got your idea, but I don't agree 100%. There are two potential drawbacks of an additional imperfect RNG: 1. It might confuse the computer. If (and only if) it increases the entropy pool estimate, it can make things actually worse, especially at boot.

If I may paraphrase the kernel developers: "We can't prove which entropy source is legit, so we safely mix all together." - Source: https://www.theregister.co.uk/2013/09/10/torvalds_on_rrrand_nsa_gchq/

  2. It might confuse people. Someone might guess the RNG is there for a good reason and there is no reason to add a proper one. While some documentation might help, I see some risk of someone feeling no reason for even looking for the documentation, because this case might look so obvious.

That would be pretty bad auditing skills when it comes to a subject as difficult as entropy.

Following that stance we should no longer install haveged by default. Haveged is in doubt because of https://github.com/QubesOS/qubes-issues/issues/673#issuecomment-363261677, it increases the entropy available counter, and may prevent people from implementing proper solutions such as a virtio-rng equivalent for Xen. (Not my take, I think it's good to keep haveged either way.)

Is anyone up for contacting the developers of jitterentropy-rng to ask whether, in their opinion, it is a solution for Xen/Qubes?

marmarek commented 6 years ago

Following that stance we should no longer install haveged by default.

Please remember we have a service providing entropy from dom0 (plugging into systemd-random-seed), which is started before haveged. So at that point, it's good. Things that start late, like Tor or other crypto-related services, shouldn't be affected. The possible problem could be in the early kernel boot phase and in very early init services. So, for example, ASLR for such services could be weaker. Failing something like virtio-rng for Xen, one idea would be to use EFI - AFAIK there is a standard interface for providing a random seed. That would require booting Linux through EFI (this should be possible for PVH) and providing a random seed from dom0 to EFI. Not sure if the latter is easier than implementing it directly in the kernel though...

adrelanos commented 6 years ago

! In T727#16542, @HulaHoopWhonix wrote:

Playing devil's advocate here: Ted Ts'o [0] expresses strong skepticism about the efficacy of RNGs that rely on CPU jitter. Summary: CPU jitter may not be as random as thought to someone who designed the CPU cache and knows how its internals "tick" [1]. So while these RNGs may not do harm, another solution for RNG-less platforms may be a good idea.

[0] He's the main developer behind Linux's RNG and staunchly resisted relying only on Intel's RDRAND. His opinions carry weight with good reason.

[1] https://lwn.net/Articles/586427/

It may be that there is some very complex state which is hidden inside the CPU execution pipeline, the L1 cache, etc., etc. But just because you can't figure it out, and just because I can't figure it out doesn't mean that it is ipso facto something which a really bright NSA analyst working in Fort Meade can't figure out. (Or heck, a really clever Intel engineer who has full visibility into the internal design of an Intel CPU....)

adrelanos commented 6 years ago

! In T727#16541, @HulaHoop wrote: An interesting implementation to work around early boot entropy scarcity with haveged is to include it in the initrd. It may be hackish but could be easier for Marmarek than writing something at the EFI level.

https://plus.google.com/+LennartPoetteringTheOneAndOnly/posts/K22yyHRc6hn

+Dustin Kirkland Sorry for not being entirely clear. I don't mean to have haveged running as a daemon all the time. But just use haveged --run 1024, which will give you 1MB of random data gathered by the haveg algorithm in a file called sample. THIS file you then use as a last resort to seed the random pool. Wouldn't that be a much nicer option :)? Haveged would even be small enough to embed into the initrd with only 115kB and no library dependencies besides libc (e.g. to guarantee that "normal" user space is never executed with a badly seeded random pool).

Not sure about haveged but /usr/lib/qubes/init/qubes-random-seed.sh inside initrd could be interesting.

But also seems hackish / not a clean full solution since it wouldn't work for unmodified images (those without Qubes tools installed) so that would be still inferior to something like virtio-rng for xen.



adrelanos commented 6 years ago

Let's ask the Xen developers. A draft for the xen-devel mailing list is here: "ask Xen developers about Efficacy of jitterentropy RNG in Xen" (https://github.com/QubesOS/qubes-issues/issues/4174)

adrelanos commented 4 years ago

rndaddentropy - An RNDADDENTROPY ioctl wrapper

$ENTROPY_GENERATOR | rndaddentropy

rndaddentropy is used to pipe entropy directly into Linux's primary entropy pool. This requires superuser privileges.

Adding entropy directly to the primary entropy pool can be very dangerous, a predictable entropy increases the predictability of resulting data from /dev/random and /dev/urandom. Be sure the entropy is generated from a truly random source, and is properly debiased.

v6ak commented 4 years ago

predictable entropy increases the predictability of resulting data from /dev/random and /dev/urandom

Are you sure? As far as I understand, adding entropy can only increase the entropy of the kernel RNG. Especially for /dev/urandom, this should have no negative impact. There are various techniques that are designed for non-decreasing entropy by adding any data. Even simple xor satisfies that. I don't know all the details about the kernel RNG, but I believe they are using something like that. (And maybe something more complex than just xor…)

Also, you can write to /dev/random and /dev/urandom as an unprivileged user. AFAIK, it does the same, except it doesn't increase entropy estimate. If this was dangerous, I would be very surprised…

It could have a negative impact on /dev/random, mostly if you haven't collected enough entropy, because increasing the entropy estimate could keep it from blocking. This should be rather a theoretical issue after collecting at least 256 bits of entropy.
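
The claim that even a simple xor cannot reduce entropy can be checked exhaustively for a single byte. The sketch below assumes the mixed-in value is chosen independently of the secret pool byte; the function name is made up for illustration.

```python
from collections import Counter


def xor_mix_distribution(known_byte: int) -> Counter:
    """Distribution of (secret XOR known_byte) over all equally likely secrets."""
    return Counter(secret ^ known_byte for secret in range(256))
```

For any fixed `known_byte`, every output value occurs exactly once, so a uniformly random secret byte stays uniformly random after mixing in data that an attacker fully knows.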

adrelanos commented 4 years ago

Are you sure?

No. This wasn't written by me. I should have properly formatted my post as a quote. Fixed now. The original author is @rfinnie.

I was just posting this here since RNDADDENTROPY was mentioned earlier. Was doing so without adding my own interpretation of it.


Now my own interpretation:

As far as I understand, adding entropy can only increase the entropy of the kernel RNG. Especially for /dev/urandom, this should have no negative impact.

Also, you can write to /dev/random and /dev/urandom as an unprivileged user. AFAIK, it does the same, except it doesn't increase entropy estimate. If this was dangerous, I would be very surprised…

I agree with that.

Though, to quote Wikipedia [0]:

In January 2014, Daniel J. Bernstein published a critique of how Linux mixes different sources of entropy. He outlines an attack in which one source of entropy capable of monitoring the other sources of entropy could modify its output to nullify the randomness of the other sources of entropy.

I haven't researched yet if this was fixed since.

It could have a negative impact on /dev/random, mostly if you haven't collected enough entropy, because increasing the entropy estimate could keep it from blocking.

Agreed.

The following might be dangerous when it is done from inside initrd [1]:

dd if=/dev/zero count=1024 bs=1024 | rndaddentropy

Combined with random.trust_cpu=off kernel boot parameter this is my best guess for worst entropy ever for demonstration purposes. (There might be other kernel boot parameters / compile options / hacks to disable other sources of entropy to make it completely predictable.)

I guess this might be where the rndaddentropy warning is coming from. Nobody might do something as bad as [1] but similarly mess up.

There are two purposes of writing to /dev/random:

For the former, we don't need rndaddentropy but for the latter it might be useful.

The former seems risk free (well... [0]) while the latter has risks under some conditions.

This should be rather a theoretical issue after collecting at least 256 bits of entropy.

After the system is fully booted... Dunno... (dd if=/dev/zero | rndaddentropy - not sure that would work.) A fully predictable stream (of zeros or so) written to /dev/random and entropy counters increased?

Years ago it took a long time to generate a gpg key inside a VM or to read from /dev/random. By running cat /dev/random one could see a slow process: it would write 1-2 lines in the shell and then slowly continue. Nowadays, thanks to haveged, jitterentropy-rng and virtio-rng, the entropy counters are always high and running cat /dev/random will overwhelm the shell. Maybe the kernel has also improved in this area since. I find it more likely that this would have been a problem in the past, but nowadays I don't know.

DemiMarie commented 3 years ago

One approach would be to have a qubes.Random RPC service. This would read from dom0’s /dev/urandom as many bytes as were requested, or write as many bytes as provided. This is secure provided the dom0 kernel CSPRNG is secure.

There are other sources of entropy available as well:

andrewdavidwong commented 3 years ago

Microphone (dom0, not sure if this is a good idea) Cameras (sys-usb)

I think Qubes users are particularly likely to have these disabled.

DemiMarie commented 3 years ago

Indeed they are, although especially cameras can generate quite a bit of high-quality entropy even in darkness.

DemiMarie commented 3 years ago

To elaborate on my previous comment: Even in darkness, a camera will generate entropy due to noise in the image sensor.

I think that a qubes.Random RPC service would be a good idea. It would basically offer read/write access to /dev/urandom in dom0. Writing to /dev/urandom mixes more data into the entropy pool, which can only increase the entropy.
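
A rough sketch of what the two halves of such a service might do in dom0. The function names and the per-request cap are hypothetical and not part of any existing Qubes RPC; the qrexec plumbing is left out entirely.

```python
import os

# Hypothetical cap so that a single call cannot request unbounded output.
MAX_REQUEST = 4096


def handle_random_request(requested: int) -> bytes:
    """Read side: return up to MAX_REQUEST bytes from the kernel CSPRNG."""
    if requested < 0:
        raise ValueError("requested byte count must not be negative")
    return os.urandom(min(requested, MAX_REQUEST))


def mix_into_pool(data: bytes, device: str = "/dev/urandom") -> int:
    """Write side: mix caller-supplied bytes into the pool without crediting entropy."""
    with open(device, "wb", buffering=0) as dev:
        return dev.write(data)
```

The write side uses a plain write rather than RNDADDENTROPY, so the data is mixed in but the entropy estimate is not raised on behalf of the calling qube.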

andrewdavidwong commented 3 years ago

To elaborate on my previous comment: Even in darkness, a camera will generate entropy due to noise in the image sensor.

I have my camera assigned to a VM that is usually off, so it doesn't even get to perceive darkness, right? I suspect that I'm not the only one who does this.

brendanhoar commented 3 years ago

To elaborate on my previous comment: Even in darkness, a camera will generate entropy due to noise in the image sensor.

I have my camera assigned to a VM that is usually off, so it doesn't even get to perceive darkness, right? I suspect that I'm not the only one who does this.

Can confirm.

B

DemiMarie commented 3 years ago

To elaborate on my previous comment: Even in darkness, a camera will generate entropy due to noise in the image sensor.

I have my camera assigned to a VM that is usually off, so it doesn't even get to perceive darkness, right? I suspect that I'm not the only one who does this.

My camera is attached to sys-usb, which is often off. What I meant was that if data from the camera is available, there is no reason not to mix some of it into the entropy pool.

adrelanos commented 3 years ago

notes on video-entropyd

3hhh commented 3 years ago

@HulaHoopWhonix wrote:

jitterentropy-rng should solve this and is a mainline Linux solution that works the same way haveged does

that works the same way haveged does

That is the problem.

* [Improve entropy collection in VMs #673 (comment)](https://github.com/QubesOS/qubes-issues/issues/673#issuecomment-363261677)

* [BetterCrypto/Applied-Crypto-Hardening@cf7cef7#commitcomment-23006392](https://github.com/BetterCrypto/Applied-Crypto-Hardening/commit/cf7cef7a870c1b77089b1bd6209ded6525b5a4e0#commitcomment-23006392)

The critical quote:

He also pointed out a security warning : with some VMs, the hardware cycles counter is emulated and deterministic, and thus predictable. He therefore does not recommend using HAVEGE on those systems.

...unless we have a good answer to that jitterentropy-rng should not be considered a solution for this issue.

@adrelanos : If I understand https://github.com/smuellerDD/jitterentropy-rngd/issues/6#issuecomment-483191719 correctly, the author of jitterentropy-rngd states that it uses libc clock_gettime which usually translates to a kernel rdtsc instruction (one could probably disassemble libc for verification or read its code).

So it all boils down to the rdtsc implementation of the hypervisor, which in Xen language is the tsc_mode (tsc = "time stamp counter" = "hardware cycles counter"). Qubes OS appears to use the native mode (cf. xen.xml in dom0): "Guest rdtsc always executed natively (no monotonicity/frequency guarantees) [...]." --> jitterentropy-rngd should use the dom0 time stamp counter and thus do its job properly / help with VM entropy for late init applications (!= early kernel entropy).

Btw if haveged works similarly (I didn't check), it's also fine in native mode. If the time stamp is emulated, it may be predictable as the warning clearly said.

In total it currently looks to me as if there's no argument against one of them to be installed by default in Qubes OS templates. In fact, it is very likely beneficial.

Since haveged uses 3,3M memory on my system and jitterentropy-rng 220K, I vote for the latter. Unfortunately it is currently impossible to get rid of haveged without dropping qubes-vm-recommended entirely. One can only disable the service. I filed #6855 for that.

3hhh commented 3 years ago

I need to partially correct my statement from above:

Practical testing with timeout 10 dd if=/dev/random of=/dev/null bs=1M status=progress reads ~20 MB in those 10 seconds with haveged inside a VM and 20 KB or so with jitterentropy-rngd. So either its Debian installation fails at some point, I made a mistake, or it simply doesn't do its job. Either way, ~2 MB/s is pretty terrible as well (but probably sufficient). Anyway, I'll stick with haveged for now...

Moreover the arch wiki has very good information on haveged incl. the aforementioned virtualisation pitfalls and surprisingly it points out that haveged is obsolete [1,2] starting with kernel 5.6 as it essentially went into the kernel. This then also solves early entropy issues.

So sticking with haveged for now and possibly dropping it once 5.6 is considered old seems the reasonable way to go for me atm. EDIT: And stick with native TSC mode to avoid the virtualisation pitfalls of course.

[1] https://github.com/jirka-h/haveged/issues/57#issuecomment-803705461 [2] https://github.com/jirka-h/haveged/commit/297bdf1fc52fc6f59d0495f911d4e594b4d29190
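
The dd measurement in this comment (about 20 MB in 10 seconds, i.e. about 2 MB/s) can be reproduced with a small reader like the one below. It is a sketch with a made-up function name; unlike `timeout`, it cannot interrupt a read that is already blocking, so on kernels where /dev/random blocks it may overrun the deadline.

```python
import time
from typing import BinaryIO, Tuple


def measure_read_rate(
    stream: BinaryIO, seconds: float, block_size: int = 1 << 20
) -> Tuple[int, float]:
    """Read until the deadline or EOF; return (bytes_read, bytes_per_second)."""
    start = time.monotonic()
    deadline = start + seconds
    total = 0
    while time.monotonic() < deadline:
        block = stream.read(block_size)
        if not block:
            break  # EOF (only happens for regular files, not /dev/random)
        total += len(block)
    elapsed = max(time.monotonic() - start, 1e-9)
    return total, total / elapsed
```

Typical use would be `measure_read_rate(open("/dev/random", "rb", buffering=0), 10.0)`, run once with each daemon enabled.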

3hhh commented 3 years ago

Btw dom0 has too little entropy. It's probably worth installing haveged there as well.

adrelanos commented 3 years ago

//cc @jirka-h @smuellerDD

jirka-h commented 3 years ago

So sticking with haveged for now and possibly dropping it once 5.6 is considered old seems the reasonable way to go for me atm.

I completely agree with that! Since kernel 5.6, as soon as the kernel's CRNG is ready, /dev/random does not block on reads anymore [1], so there is no need for any additional service to generate entropy. Moreover, the Linux kernel uses the same technique as haveged as one of its entropy sources.

[1] https://github.com/torvalds/linux/commit/30c08efec8884fb106b8e57094baa51bb4c44e32 [2] https://lore.kernel.org/lkml/alpine.DEB.2.21.1909290010500.2636@nanos.tec.linutronix.de/T/

marmarek commented 3 years ago

Should we add ConditionKernelVersion=<5.6 to the haveged.service unit file?
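
What such a condition expresses can be written out as a plain version comparison. This sketch only compares the leading major.minor pair of a release string; how systemd itself compares versions may differ in detail, and the function name is made up.

```python
import re


def kernel_older_than(release: str, major: int, minor: int) -> bool:
    """True if the release string (as printed by uname -r) is below major.minor."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError("unrecognised kernel release: %r" % release)
    return (int(match.group(1)), int(match.group(2))) < (major, minor)
```

With `platform.release()` as input, `kernel_older_than(platform.release(), 5, 6)` would be true exactly on the kernels where haveged is still meant to start.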

jirka-h commented 3 years ago

Good point!

I have updated the systemd unit files in haveged contrib directory: https://github.com/jirka-h/haveged/commit/cef1d425b5431847b8c9ab5b00c3e6b82a32b4f2

I recommend doing the same in the downstream version.

smuellerDD commented 3 years ago

@adrelanos asked me for a comment here which I try to give here. If I missed an already discussed point, apologies in advance.

If the point is that getrandom(2) or /dev/random unblocks in a decent time, then yes, using kernel 5.6 and later should solve your problem.

If you are interested in obtaining high-quality entropy with a safety margin, having 2 entropy sources should be considered. So, an rngd like jitterentropy-rngd or haveged should be considered. Note that the performance of those is almost irrelevant: you only want a good seed at boot time and possibly during runtime. Note that for one seeding operation, you only need, say, 256 or 512 bits from these sources.

If you are concerned about the issues outlined in [2] section 4.4, you may want to definitely use a second entropy source.

If you are interested in an entropy source that follows NIST guidance like SP800-90B, then there is hardly a way around jitterentropy-rngd, as neither /dev/random nor haveged provides this compliance.

If you are looking at the "kernel version of haveged", I performed some measurements of it in [2] section 6.3.4.

So, the bottom line is that although there is no clear-cut answer, I would use a second entropy source. I cannot say how good haveged is; all I can say is that jitterentropy-rngd has been reviewed by multiple sources independent from me. As I have no idea about haveged, I would use the Jitter RNG - but naturally this statement is biased :-)

[1] https://www.bsi.bund.de/EN/Topics/Cryptography/RandomNumberGenerators/random_number_generators_node.html

[2] https://www.bsi.bund.de/SharedDocs/Downloads/EN/BSI/Publications/Studies/LinuxRNG/LinuxRNG_EN_V4_5.pdf

jirka-h commented 3 years ago

Hallo Stephan,

please note that kernel version >= 5.6 never runs out of entropy. I have done tests on kernel 5.13 bare-metal (notebook) and 5.14 with KVM virtualization, and while reading from /dev/random at full speed (37 MiB/s on the notebook with kernel 5.13, 200 MiB/s on the VM with kernel 5.14), the reported entropy was in the range 3700-4036 the whole time. It means that jitterentropy-rngd adds 64 bytes of entropy only every 10 minutes. Writes on low entropy (below ENTROPYTHRESH=1024) [1] are NEVER triggered.

Thus, the entropy refill strategy for kernels >=5.6 has to be rethought - perhaps something like refill entropy every X seconds? But before doing this, I think a broad discussion with the community and kernel developers is needed to confirm that it is the right approach. From my point of view, since the kernel >=5.6 is using a variant of Jitter RNG internally, there is no point in using either jitterentropy-rngd or HAVEGED (but I might be wrong). 

One observation I have made - every time jitterentropy-rngd has sent entropy to the kernel, the reported entropy level has DECREASED (again, tested on kernels 5.13 and 5.14). See below. I believe that the kernel treats external entropy very carefully and potentially as a danger. It probably means that after external entropy is supplied and the entropy level has decreased, the kernel will use other entropy sources to refill the pool. In effect, it might indeed improve the quality of randomness, not because of the supplied entropy but because other entropy sources are used more extensively...

./jitterentropy-rngd -vvvv | grep "written\|available" | awk '{ print strftime("[%Y-%m-%d %H:%M:%S]"), $0 }'
[2021-08-30 20:32:05] jitterentropy-rngd - Debug: Sufficient entropy 4037 available
[2021-08-30 20:32:05] jitterentropy-rngd - Verbose: 64 bytes written to /dev/random
[2021-08-30 20:32:05] jitterentropy-rngd - Debug: Sufficient entropy 3791 available

Jirka

The commands I have used to test how /dev/random behaves:

In one terminal, read from /dev/random - I'm using pv command to get the speed reported.

pv /dev/random > /dev/null

In another terminal, run jitterentropy-rngd and report available entropy and how much entropy was sent to Kernel's entropy pool.

./jitterentropy-rngd -vvvv | grep "written\|available" | awk '{ print strftime("[%Y-%m-%d %H:%M:%S]"), $0 }'

[1] https://github.com/smuellerDD/jitterentropy-rngd/blob/master/jitterentropy-rngd.c#L105
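
The drop in the reported estimate can be read off the log mechanically. The sketch below assumes the message wording shown in the excerpt in this comment ("entropy N available", "N bytes written") and uses a made-up function name; other versions of the daemon may log differently.

```python
import re
from typing import List, Optional

AVAILABLE = re.compile(r"entropy (\d+) available")
WRITTEN = re.compile(r"(\d+) bytes written")


def entropy_change_around_writes(lines: List[str]) -> List[int]:
    """For each 'bytes written' line, return (estimate after) - (estimate before)."""
    changes: List[int] = []
    before: Optional[int] = None
    pending_write = False
    for line in lines:
        available = AVAILABLE.search(line)
        if available:
            value = int(available.group(1))
            if pending_write and before is not None:
                changes.append(value - before)
            before = value
            pending_write = False
        elif WRITTEN.search(line):
            pending_write = True
    return changes
```

Applied to the three log lines quoted in this comment, it reports a single change of 3791 - 4037 = -246 around the 64-byte write.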

3hhh commented 3 years ago

fyia: https://lwn.net/Articles/808575/