Status: Closed (closed by klausenbusk 2 years ago)
Hi Kristian,
thanks a lot for pointing me to the recent kernel development.
After reading the LKML/LWN articles, I completely agree that the haveged service is now obsolete (starting from kernel 5.6). I have verified it experimentally on Fedora 32, running kernel 5.10:
$ time timeout 30 pv /dev/random > /dev/null
1.18GiB 0:00:29 [42.2MiB/s]
real 0m30.012s
user 0m0.022s
sys 0m29.934s
1) There is no difference in throughput with and without the haveged service running.
2) The haveged service is not triggered at all, as verified with strace. No entropy is sent to the kernel.
I'm happy that these changes made it into the mainline kernel. It's nice to see that the main idea behind HAVEGED has stood the test of time! (It was published back in 2003 here: https://www.irisa.fr/caps/projects/hipsor/publications/havege-tomacs.pdf)
I'm also glad that the HAVEGE algorithm is being further explored and examined - see the "CPU Jitter Random Number Generator" page at https://www.chronox.de/jent.html
I will keep maintaining HAVEGED, since most Linux installations are still running older kernel versions. HAVEGED can also be used as a userspace RNG to generate random numbers. See man -S8 haveged for examples, or try running haveged -n 0 | pv > /dev/null
Last but not least, HAVEGED can be used as an RNG library.
Thanks a lot Jirka
Further references:
https://lore.kernel.org/lkml/alpine.DEB.2.21.1909290010500.2636@nanos.tec.linutronix.de/T/
https://lwn.net/Articles/808575/
https://www.phoronix.com/scan.php?page=news_item&px=Linux-5.6-Random-Rework
https://en.wikipedia.org/wiki//dev/random
https://www.irisa.fr/caps/projects/hipsor/publications/havege-tomacs.pdf
https://www.chronox.de/jent.html
https://www.chronox.de/jent/doc/CPU-Jitter-NPTRNG.pdf
https://github.com/sandy-harris/maxwell
Relevant info has been added in 297bdf1fc52fc6f59d0495f911d4e594b4d29190, also cef1d425b5431847b8c9ab5b00c3e6b82a32b4f2 adds a condition on kernel version in service file and makes it a no-op for linux >= 5.6.
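For readers who want to reproduce this kind of version gate outside of a service file, the check described above (do nothing on Linux >= 5.6) could be sketched in shell roughly as follows. This is only an illustration and not the content of the referenced commit; systemd itself offers ConditionKernelVersion= for such gates. The helper names version_lt and kernel_release are made up for this example, and the sketch assumes a sort that supports -V, as GNU coreutils does.

```shell
#!/bin/sh
# Hypothetical helpers, not part of haveged.

# Succeeds (exit 0) when version $1 is strictly older than version $2.
# Assumes `sort -V` (version sort) is available, as in GNU coreutils.
version_lt() {
    [ "$1" = "$2" ] && return 1
    lowest=$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)
    [ "$lowest" = "$1" ]
}

# Prints the numeric part of the running kernel release,
# e.g. "5.15.12" for "5.15.12-arch1-1".
kernel_release() {
    uname -r | sed 's/[^0-9.].*//'
}

if version_lt "$(kernel_release)" 5.6; then
    echo "kernel older than 5.6: the regular haveged service would run"
else
    echo "kernel 5.6 or newer: the regular haveged service would be skipped"
fi
```

Version sort matters here: a plain string comparison would place 5.10 before 5.6.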
So now, with haveged deemed "obsolete" and getting dropped all over the place because it doesn't do anything, how does your average Joe Linux hobbyist admin deal with crng init taking ages? That's still happening in recent kernels... and it was the prime reason I installed haveged pretty much everywhere, only it's a no-op now.
Bare metal (archlinux kernel 5.15):
dmesg | grep random
[ 0.287693] random: get_random_u64 called from __kmem_cache_create+0x2a/0x540 with crng_init=0
[ 6.158634] random: fast init done
[ 100.703821] random: crng init done
In a KVM virtual machine (archlinux kernel 5.10):
[ 0.063991] random: get_random_u64 called from __kmem_cache_create+0x2a/0x4c0 with crng_init=0
[ 1.558234] random: fast init done
[ 3.958456] random: cryptsetup: uninitialized urandom read (4 bytes read)
[ 36.354001] random: cryptsetup: uninitialized urandom read (32 bytes read)
[ 41.265375] random: cryptsetup: uninitialized urandom read (64 bytes read)
[ 41.265477] random: cryptsetup: uninitialized urandom read (64 bytes read)
[ 41.265481] random: cryptsetup: uninitialized urandom read (64 bytes read)
[ 50.726303] random: crng init done
[ 50.726321] random: 5 urandom warning(s) missed due to ratelimiting
Same virtual machine with manually tickling the random device early in initramfs:
dmesg | grep random
[ 0.046587] random: get_random_u64 called from __kmem_cache_create+0x2a/0x4c0 with crng_init=0
[ 1.377856] random: fast init done
[ 2.696946] random: crng init done
So massaging the random device early still has an influence, and I'd love to continue using haveged for the job, but for that to work it would actually have to do something, like unconditionally feed some randomness on startup, either as a once-off or periodically... as far as random sources go, it's the more the merrier, isn't it?
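To compare boots like the ones above without reading the logs by eye, the time of the "crng init done" message can be pulled out of dmesg-style output. The following is a minimal sketch; crng_init_time is a hypothetical helper name, and it assumes the bracketed timestamp format and the message text shown in the logs above, which may differ on other kernel versions.

```shell
#!/bin/sh
# Hypothetical helper: reads kernel log lines on stdin and prints the
# timestamp (seconds since boot) of the first "crng init done" message.
# Prints nothing if the message is absent.
crng_init_time() {
    sed -n 's/^\[ *\([0-9][0-9.]*\)] random: crng init done.*/\1/p' | head -n 1
}

# Example with two lines taken from the bare-metal log above.
printf '%s\n' \
    '[ 6.158634] random: fast init done' \
    '[ 100.703821] random: crng init done' | crng_init_time
```

On a live system this would be used as dmesg | crng_init_time (reading the kernel log may require root). Journal-formatted lines with a date and host prefix do not match this pattern.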
Hi Andreas,
I have updated haveged to feed entropy to the kernel on start and then every 60 seconds. Please give it a try and let me know if it works for you. See commits b0d1b0e82602401c51404b941133b993b8aa65e9 and c35c6f44aa01d0f6ddf2752e04b5ef763f4c61a2
I have tested it on x86_64 running kernel 5.15 (./haveged --Foreground) and it works fine there.
Thanks a lot Jirka
Now does it make sense to revert cef1d425b5431847b8c9ab5b00c3e6b82a32b4f2?
It seems to work fine for me.
Without haveged:
[ 0.074495] random: get_random_u64 called from __kmem_cache_create+0x2a/0x4c0 with crng_init=0
[ 1.353786] random: fast init done
[ 3.428981] random: cryptsetup: uninitialized urandom read (4 bytes read)
[ 27.874776] random: cryptsetup: uninitialized urandom read (32 bytes read)
[ 32.867232] random: cryptsetup: uninitialized urandom read (64 bytes read)
[ 32.867418] random: cryptsetup: uninitialized urandom read (64 bytes read)
[ 32.867425] random: cryptsetup: uninitialized urandom read (64 bytes read)
[ 42.252014] random: crng init done
[ 42.252027] random: 5 urandom warning(s) missed due to ratelimiting
With haveged:
[ 0.066190] random: get_random_u64 called from __kmem_cache_create+0x2a/0x4c0 with crng_init=0
[ 1.476520] random: crng init done
(random fast init message is mysteriously missing. haveged itself spawns around the [ 0.8xxx] ~ [ 0.9xxx] mark.)
Using a simple initcpio hook (archlinux specific):
/etc/initcpio/install/haveged:

    #!/bin/bash
    build() {
        add_binary "haveged"
        add_runscript
    }
    help() {
        cat <<HELPEOF
    Haveged for early randomness and fast crng initialization.
    HELPEOF
    }

/etc/initcpio/hooks/haveged:

    #!/usr/bin/ash
    run_earlyhook() {
        haveged
    }
    run_cleanuphook() {
        killall haveged
    }

So nothing fancy, it simply starts haveged early initramfs and kills it late initramfs. haveged works its magic in the meantime. haveged as a service to be spawned again later on by the real init system.
> (random fast init message is mysteriously missing. haveged itself spawns around the [ 0.8xxx] ~ [ 0.9xxx] mark.)
It is not mysterious. As enough entropy is available, fast init is skipped completely in favor of complete crng init.
> Using a simple initcpio hook (archlinux specific):
> [snipped initcpio hooks]
> So nothing fancy, it simply starts haveged early initramfs and kills it late initramfs. haveged works its magic in the meantime.
I could add something like this in the package... But I am wondering whether it makes sense to add a new switch, --once or --early. It could inject entropy once, then terminate itself.
> haveged as a service to be spawned again later on by the real init system.
With the switch from above it would be possible to keep the current service haveged.service as is, including the condition on kernel version. Adding a new service haveged-once.service using the new switch would make it possible to add a service to a systemd-enabled initramfs image.
I'm a bit puzzled why the jitter entropy in the kernel (merge commit, commit, LKML) isn't working. Is there anything exotic about your setup @frostschutz?
Edit: Is it blocking boot @frostschutz?
> it would be possible to keep the current service haveged.service as is, including the condition on kernel version.
:+1: I like it!
I have added a new switch --once and I have kept haveged.service unchanged. Could you please give it a try? If it works fine, I will release a new version.
Thanks a lot! Jirka
commit 98ead65f953a3431d53c5837eedd008100ce9ed7
> Is there anything exotic about your setup @frostschutz?
My desktop is a standard arch linux install with a custom encryption hook since I have more than just the one LUKS device. But in the end it still runs a standard cryptsetup open ... and waits for me to enter the passphrase. And that works fine, except crng init simply never happens until there is actually activity from my end, so the crng init after 100 seconds on bare metal is because I wasn't typing anything.
My virtual server is a little exotic in that it's encrypted and uses cryptsetup very early to check or change the passphrase. So it actually wants to use the random device very early, hence the warnings about cryptsetup reading random before crng is fully initialized.
It is not blocking in either case, so maybe this is for cosmetics only, it just doesn't give me a good feeling to see such warning messages or late initializations.
The kernel is very conservative about randomness/entropy (for good reasons, I'm sure) but even if the kernel 'fixes' it, I still want to keep using haveged... I also have other things in place like an early random seed (the systemd random seed service runs long after initramfs is done so maybe a little late) but I did not use those in the above tests.
Basically, however good the kernel's random implementation is or will be, I still feel that userspace should mess with it just a little regardless, and for that purpose I'd love to continue using haveged, both in initramfs as well as a service that just keeps running indefinitely.
So this is my personal feeling but I'd still love the kernel 5.6 condition to go away, after all I installed the thing and enabled the service because I want it to run and do something. ;-) I'm sure there will be people who don't need it but they simply won't install or activate it either?
I can patch the service file locally or just install my own, but I can't patch haveged itself, so @jirka-h thanks a lot for doing that especially between the years — I really appreciate it.
@frostschutz, do you have initramfs with or without systemd?
I don't use systemd in initramfs yet. Traditional busybox-based initcpio for me. Otherwise, the hook I posted above also would not work.
Ah, got it. Missed it was your post. 🙈
> Could you please give it a try? If it works fine, I will release a new version.
Would be nice to have haveged-once.service included in the release...
Ok, this is my log now... Does not look successful, though haveged has been started. Did it feed anything at all?
Dez 31 16:41:38 archlinux kernel: random: get_random_u64 called from __kmem_cache_create+0x2a/0x540 with crng_init=0
Dez 31 16:41:38 archlinux systemd[1]: Initializing machine ID from random generator.
Dez 31 16:41:38 archlinux haveged[148]: haveged starting up
Dez 31 16:41:38 archlinux haveged[148]: haveged: command socket is listening at fd 3
Dez 31 16:41:38 archlinux haveged[160]: haveged: ver: 1.9.16; arch: x86; vend: GenuineIntel; build: (gcc 11.1.0 ITV); collect: 128K
Dez 31 16:41:38 archlinux haveged[160]: haveged: cpu: (L4 VC); data: 32K (L4 V); inst: 32K (L4 V); idx: 23/40; sz: 31288/55167
Dez 31 16:41:38 archlinux haveged[160]: haveged: tot tests(BA8): A:1/1 B:1/1 continuous tests(B): last entropy estimate 7.99948
Dez 31 16:41:38 archlinux haveged[160]: haveged: fills: 0, generated: 0
Dez 31 16:41:38 archlinux haveged[160]: haveged: Stopping due to signal 15
Dez 31 16:41:38 archlinux systemd[1]: haveged-once.service: Deactivated successfully.
Dez 31 16:41:39 archlinux kernel: random: fast init done
[...]
Used this for haveged-once.service (derived from contrib/Fedora/haveged.service):
[Unit]
Description=Entropy Daemon based on the HAVEGE algorithm
Documentation=man:haveged(8) http://www.issihosts.com/haveged/
DefaultDependencies=no
[Service]
Type=oneshot
ExecStart=@SBIN_DIR@/haveged -w 1024 -v 1 --once
SuccessExitStatus=137 143
SecureBits=noroot-locked
CapabilityBoundingSet=CAP_SYS_ADMIN CAP_SYS_CHROOT
# We can *not* set PrivateTmp=true as it can cause an ordering cycle.
PrivateTmp=false
PrivateDevices=true
# We can *not* set PrivateNetwork=true to allow command mode (chroot when included in initramfs)
#PrivateNetwork=true
ProtectSystem=full
ProtectHome=true
ProtectHostname=true
ProtectKernelLogs=true
ProtectKernelModules=true
RestrictNamespaces=true
RestrictRealtime=true
LockPersonality=true
MemoryDenyWriteExecute=true
SystemCallArchitectures=native
SystemCallFilter=@system-service
SystemCallFilter=~@mount
SystemCallErrorNumber=EPERM
Possibly we could drop even more settings for initramfs... Not sure.
BTW, systemd reports:
systemd[1]: /usr/lib/systemd/system/haveged.service:32: Failed to parse system call, ignoring: newuname
... so I dropped it from my service.
@frostschutz, can you please test this package? haveged-1.9.15-2
Should contain everything needed...
Thanks a lot for the testing and packaging!
I have made a couple of changes:
ROOT$./haveged -w 1024 -v 1 --Foreground --once
haveged: command socket is listening at fd 3
haveged starting up
haveged: ver: 1.9.16; arch: x86; vend: GenuineIntel; build: (gcc 11.2.1 ITV); collect: 128K
haveged: cpu: (L4 VC); data: 32K (L4 V); inst: 32K (L4 V); idx: 24/40; sz: 32010/53875
haveged: tot tests(BA8): A:1/1 B:1/1 continuous tests(B): last entropy estimate 7.99914
haveged: fills: 0, generated: 0
haveged: Entropy refilled once (2048 bytes), exiting.
tot tests(BA8): A:1/1 B:1/1 continuous tests(B): last entropy estimate 7.99914
fills: 1, generated: 512 K bytes, RNDADDENTROPY: 2 K bytes
Added contrib/Fedora/haveged-once.service to the Git repo (1f6a41a112dc3a52792f8d981f0812c7bed0d5db). I took your version, but I have added --Foreground.
Could you please test the latest version?
Thanks a lot! Jirka
Thanks for the changes!
Away from keyboard till next year. 😳😆 I will test tomorrow.
> @frostschutz, can you please test this package? haveged-1.9.15-2
Tested it and the (non-systemd) hook works for me, haven't tested any of the systemd stuff though.
> Tested it and the (non-systemd) hook works for me, haven't tested any of the systemd stuff though.
Thanks a lot! (I do test the other part.)
This is with current git master (1f6a41a112dc3a52792f8d981f0812c7bed0d5db):
Jan 01 22:30:38 archlinux kernel: random: get_random_u64 called from __kmem_cache_create+0x2a/0x540 with crng_init=0
Jan 01 22:30:38 archlinux systemd[1]: Initializing machine ID from random generator.
Jan 01 22:30:38 archlinux haveged[147]: haveged: command socket is listening at fd 3
Jan 01 22:30:38 archlinux haveged[147]: haveged: ver: 1.9.16; arch: x86; vend: GenuineIntel; build: (gcc 11.1.0 ITV); collect: 128K
Jan 01 22:30:38 archlinux haveged[147]: haveged: cpu: (L4 VC); data: 32K (L4 V); inst: 32K (L4 V); idx: 23/40; sz: 31288/55167
Jan 01 22:30:38 archlinux haveged[147]: haveged: tot tests(BA8): A:1/1 B:1/1 continuous tests(B): last entropy estimate 8.00103
Jan 01 22:30:38 archlinux haveged[147]: haveged: fills: 0, generated: 0
Jan 01 22:30:38 archlinux haveged[147]: haveged: Entropy refilled once (2048 bytes), exiting.
Jan 01 22:30:38 archlinux haveged[147]: tot tests(BA8): A:1/1 B:1/1 continuous tests(B): last entropy estimate 8.00103
Jan 01 22:30:38 archlinux haveged[147]: fills: 1, generated: 512 K bytes, RNDADDENTROPY: 2 K bytes
Jan 01 22:30:38 archlinux haveged[147]: haveged starting up
Jan 01 22:30:38 archlinux systemd[1]: haveged-once.service: Main process exited, code=exited, status=1/FAILURE
Jan 01 22:30:38 archlinux systemd[1]: haveged-once.service: Failed with result 'exit-code'.
Jan 01 22:30:38 archlinux kernel: random: crng init done
It does work, but haveged returns with an exit code indicating an error.
I think I did not notice before because I dropped --Foreground from my service file. Perhaps --once should not fork at all.
Oh, and the man page is missing --once.
Thanks for the testing - good catch!
I have fixed the exit status when using --once and updated the man page. See 9e4a1f53cfbb2a7aa1f534861e6f03587e5d6f16
Could you please verify the fix?
Looks good now, thanks!
I think this could be used in haveged-dracut.module now, which would allow dropping haveged-switch-root.service.
> My desktop is a standard arch linux install with a custom encryption hook since I have more than just the one LUKS device. But in the end it still runs a standard cryptsetup open ... and waits for me to enter passphrase. And that works fine, except crng init simply never happens until there is actually activity from my end, so the crng init after 100 seconds on bare metal is because I wasn't typing anything.
The kernel's jitter entropy is only running if needed, so the crng initializing very late in this case is expected:
/*
 * Wait for the urandom pool to be seeded and thus guaranteed to supply
 * cryptographically secure random numbers. This applies to: the /dev/urandom
 * device, the get_random_bytes function, and the get_random_{u32,u64,int,long}
 * family of functions. Using any of these functions without first calling
 * this function forfeits the guarantee of security.
 *
 * Returns: 0 if the urandom pool has been seeded.
 *          -ERESTARTSYS if the function was interrupted by a signal.
 */
int wait_for_random_bytes(void)
> My virtual server is a little exotic in that its encrypted and uses cryptsetup very early to check or change the passphrase. So it actually wants to use the random device very early, hence the warnings about cryptsetup reading random before crng is fully initialized.
That is expected, as cryptsetup on arch uses /dev/urandom by default, which won't trigger the kernel's jitter entropy. If you switch to /dev/random (--use-random) it will trigger the kernel's jitter entropy and block until the crng is fully initialized (basically the only difference between random and urandom these days: commit, LWN) and I think it is a more sane choice for your use-case.
> It is not blocking in either case, so maybe this is for cosmetics only, it just doesn't give me a good feeling to see such warning messages or late initializations.
It sounds like mostly a cosmetic thing to me :) The warning caused by cryptsetup should indeed be fixed (ex: by using /dev/random) or you could trigger the kernel's jitter entropy from an initramfs script (ex: head -c16 /dev/random > /dev/null).
> Basically however good the kernels random implementation is or will be, I still feel that userspace should mess with it just a little regardless, and for that purpose I'd love to continue using haveged, both in initramfs as well as a service that just keeps running indefinitely.
> So this is my personal feeling but I'd still love the kernel 5.6 condition to go away, after all I installed the thing and enabled the service because I want it to run and do something. ;-) I'm sure there will be people who don't need it but they simply won't install or activate it either?
Makes sense. I just don't like users installing haveged unnecessarily because they read some 10-year-old guide, so IMO haveged still isn't needed on Linux >= 5.6 for the average user.
> Make sense, I just don't like users installing haveged unnecessary because they read some 10 years old guide, so IMO haveged still isn't needed on Linux =>5.6 for the average user.
:+1: I completely agree!
Thanks for the testing! I have finalized the changes and released v1.9.16
> The kernel's jitter entropy is only running if needed
> uses /dev/urandom by default, which won't trigger the kernel's jitter entropy.
> bascially the only difference between random and urandom these days
That's very unfortunate.
It means throwing nearly 10 years worth of re-education out the window ( Just use /dev/urandom! ).
I really don't want to use /dev/random anymore. Documentation states it blocks and has indeterminate delays, which is unacceptable when early booting (you want it to boot, not hang indefinitely). It also states that /dev/urandom is preferred and sufficient in all use cases (with the exception of early booting which just means the kernel leaves you hanging there, go figure).
If the random jitter works, then it seems to me like the kernel should be using it unconditionally, early, and for both random and urandom reads equally. High time to put the final nail in the coffin for the /dev/random legacy device. It's unfathomable to me why reading /dev/random over /dev/urandom should help at all. This kind of hoop jumping should be unnecessary.
I want a random device that "just works", no blocking, no shouting cryptic warnings at me, leaving me to figure things out on my own, which I'm ill equipped to do anyway, since I'm no cryptographer or kernel developer. I shouldn't even have to know or care about these implementation details. Which might just change again from one kernel version to the next.
But that's something for the kernel devs to figure out.
> I just don't like users installing haveged unnecessary because they read some 10 years old guide
There are plenty of guides out there that lead straight to data loss. haveged is a service that uses next to no resources, so worst case should be... still no harm done. If so, why bother so much.
If other ways to tweak the random device are still valid and necessary and even strongly recommended (like the random seed), then why not haveged too.
haveged has provided where kernel has left us hanging, for years. Otherwise it would not have been developed and not become remotely popular. Changing this mindset will take some time...
Hello Andreas,
I fully agree that /dev/random needs to be enhanced. Stephan Müller is trying to achieve this - he has proposed a new modern design, which addresses many problems of the current implementation. However, getting the changes into the mainline kernel is not easy - please check this email thread:
https://lkml.org/lkml/2021/11/21/143
I'm afraid that it will take a long time to improve the situation.
Jirka
So after all the above discussion is haveged still useful currently?
Hi Juha,
yes, it's still useful. It can provide entropy early in the boot when /dev/random is not fully utilized.
On a fully booted system, it can be still used as an additional entropy source. It will insert entropy into the kernel every 60 seconds, thus diversifying your entropy sources.
I hope this helps Jirka
Note that kernel 5.10 still needs haveged:
uname -r && cd /proc/sys/kernel/random/ && cat poolsize entropy_avail
5.10.102
4096
557
and /dev/urandom / GRND_INSECURE are about to be disabled so haveged is more relevant than ever.
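The two numbers in the output above come from files under /proc/sys/kernel/random. A small wrapper that labels them might look like the sketch below. entropy_status is a hypothetical helper name, and the optional directory argument exists only so the function can be exercised without a live /proc. The values are the kernel's own reporting; what a given value implies about the need for haveged is the poster's interpretation, not something this snippet decides.

```shell
#!/bin/sh
# Hypothetical helper: prints the pool size and available entropy that the
# kernel reports. The directory argument exists only to make testing easy.
entropy_status() {
    dir=${1:-/proc/sys/kernel/random}
    printf 'poolsize=%s entropy_avail=%s\n' \
        "$(cat "$dir/poolsize")" "$(cat "$dir/entropy_avail")"
}

# On Linux, report the live values; elsewhere the files do not exist.
if [ -r /proc/sys/kernel/random/entropy_avail ]; then
    entropy_status
fi
```

With the values quoted above, the function would print poolsize=4096 entropy_avail=557.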
Thanks for that link! I think it indeed makes haveged more relevant when you need randomness very early in the boot and the kernel's RNG takes a long time to initialize (on most platforms, the kernel RNG initializes pretty fast, though).
I'm running kernel 5.16 and I can confirm that on a fully booted system, /dev/random and /dev/urandom behave the same way (in fact, both devices are the same on a fully booted system with this kernel). Both devices are nonblocking and provide random data at a rate of around 200MiB/s on my laptop [1]. I'm getting similar results with kernel 5.14. This is with haveged disabled.
What haveged does in this case is provide an additional entropy source. It does not affect the speed but makes the kernel RNG more trusted.
[1] pv /dev/random > /dev/null
[ 242MiB/s]
Currently the first thing that README.md mentions is that haveged is obsolete with recent kernels. Maybe that could be improved, since haveged still has its uses, even with recent kernels.
Good point, Juha!
I have updated the https://github.com/jirka-h/haveged/blob/master/README.md - check this commit: https://github.com/jirka-h/haveged/commit/bfff89f0a8568fe1ce974261c0e706be141e175d
Hi
Sorry for the harsh question, but with the jitter entropy added in v5.4 (merge commit, commit, LKML) and the removal of the /dev/random blocking pool in v5.6 (commit, LWN), is haveged still useful/relevant? If yes, under what circumstances?
- Kristian