coreos / bugs

Issue tracker for CoreOS Container Linux
https://coreos.com/os/eol/

Consider applying CPU microcode updates at boot #1862

Closed bgilbert closed 6 years ago

bgilbert commented 7 years ago

Issue Report

Feature Request

Environment

What hardware/cloud provider/hypervisor is being used to run Container Linux? Bare metal

Desired Feature

Apply CPU microcode updates at boot.

Other Information

Microcode updates are not persistent; they need to be applied on every boot. The system firmware may apply an update, but it might not be the most recent version because the hardware OEM may not issue a firmware update for every microcode release and end users may not install it.

Opinions differ on how important this is. One source with knowledge of x86 CPU development says that CPU bugs fixable by microcode updates are not a major source of errors in production. The microcode-update documentation of various Linux distros claims that it's very important to run current microcode to improve system stability.

The non-deprecated mechanism for updating microcode is the early-microcode driver. It can be supplied with AMD microcode from the amd-ucode directory of linux-firmware and Intel microcode from sys-firmware/intel-microcode. Because bootengine.cpio is an uncompressed cpio archive and not an initrd, I think we can store both firmware blobs in it directly and do not need to prepend a separate cpio archive.
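As a rough illustration of the layout involved: the kernel's early loader looks for vendor blobs at `kernel/x86/microcode/GenuineIntel.bin` and `kernel/x86/microcode/AuthenticAMD.bin` inside an uncompressed cpio archive. The sketch below stages and packs such an archive. The function names and the assumption that each vendor's files have already been concatenated into a single blob are hypothetical, and whether the same layout works inside bootengine.cpio is exactly the open question raised above.

```shell
#!/bin/sh
# Sketch only: function names are hypothetical, not part of any CoreOS tooling.

# stage_early_ucode STAGE_DIR [INTEL_BLOB] [AMD_BLOB]
# Copies already-concatenated vendor blobs into the early-loader layout.
stage_early_ucode() {
    stage=$1
    intel=${2:-}
    amd=${3:-}
    dest="$stage/kernel/x86/microcode"
    mkdir -p "$dest" || return 1
    if [ -n "$intel" ]; then
        cp "$intel" "$dest/GenuineIntel.bin" || return 1
    fi
    if [ -n "$amd" ]; then
        cp "$amd" "$dest/AuthenticAMD.bin" || return 1
    fi
}

# pack_early_ucode STAGE_DIR OUT_IMG
# Writes an uncompressed "newc" cpio archive of the staged tree.
pack_early_ucode() {
    (cd "$1" && find kernel -print | cpio -o -H newc) > "$2"
}
```

Usage would look like `stage_early_ucode /tmp/stage intel.bin amd.bin && pack_early_ucode /tmp/stage ucode.img`, where `intel.bin` and `amd.bin` are placeholder file names.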

iucode_tool, the program that reformats Intel's microcode images for use by the kernel, includes documentation stating a) that Intel's release management is opaque, and b) that Intel drops firmware for old CPUs, potentially causing regressions on older hardware for distros that ship only the latest microcode bundle. iucode_tool provides an example mechanism for downstream release management to avoid this -- which assumes that Intel has not dropped any microcode newer than 2008.

mark-kubacki commented 7 years ago

dmesg | grep -F micro
[    3.240610] microcode: sig=0x406f0, pf=0x1, revision=0x14
[    3.246343] microcode: Microcode Update Driver: v2.2.

… shows that microcode can be (and indeed is) updated on every boot.

I guess you want to make a persistent change? If so, as far as I know the mainboard vendor supplies these with BIOS/UEFI updates. You might find that those are sometimes signed, and so don't allow any persistent modifications.

bgilbert commented 7 years ago

Those messages only show that the microcode driver is loaded. There'd be a "microcode updated early" message if an update had been applied.
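To make the distinction concrete, here is a minimal sketch that classifies kernel log text read from stdin. The function name is hypothetical, and the matched string is the one quoted in this thread; the exact wording may differ on other kernel versions.

```shell
#!/bin/sh
# Sketch only: ucode_early_status is a hypothetical helper name.
# Reads kernel log text on stdin and reports whether an early update was logged.
ucode_early_status() {
    if grep -q 'microcode updated early'; then
        echo applied
    else
        echo not-applied
    fi
}
```

On a live system this would be run as `dmesg | ucode_early_status`. The two driver lines posted earlier in this thread would yield `not-applied`.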

bgilbert commented 7 years ago

Note the --early-microcode option to dracut.

redbaron commented 6 years ago

Is there any workaround to have microcode installed early in the boot process?

redbaron commented 6 years ago

For reference, I have made it work with iPXE boot so far; I'm not sure whether this method is fully compatible with coreos-install.

  1. Get the "Linux processor microcode data file" from https://downloadcenter.intel.com/search?keyword=microcode+linux
  2. Get the iucode_tool utility from https://gitlab.com/iucode-tool/iucode-tool/
  3. Unpack the microcode archive, then run: iucode_tool --verbose -t d microcode.dat --write-earlyfw ucode.img
  4. Set up an iPXE script that passes ucode.img before the normal initrd. Here is my version for a UEFI system:
    #!ipxe
    kernel http://matchbox/assets/coreos/1520.8.0/coreos_production_pxe.vmlinuz coreos.config.url=http://matchbox/ignition?... initrd=ucode.img initrd=coreos_production_pxe_image.cpio.gz coreos.first_boot=yes
    initrd http://matchbox/assets/ucode.img
    initrd http://matchbox/assets/coreos/1520.8.0/coreos_production_pxe_image.cpio.gz
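The script in step 4 could also be generated rather than written by hand, which keeps the initrd ordering (ucode.img first) in one place. This is only a sketch: `gen_ipxe` and its three parameters are hypothetical, the config URL is passed in because the original elides its query string, and the output mirrors only the lines shown above.

```shell
#!/bin/sh
# Sketch only: gen_ipxe and its parameters are hypothetical.
# gen_ipxe ASSET_BASE VERSION CONFIG_URL
# Prints an iPXE script that loads ucode.img before the main initrd.
gen_ipxe() {
    base=$1
    ver=$2
    cfg=$3
    img=coreos_production_pxe_image.cpio.gz
    cat <<EOF
#!ipxe
kernel $base/coreos/$ver/coreos_production_pxe.vmlinuz coreos.config.url=$cfg initrd=ucode.img initrd=$img coreos.first_boot=yes
initrd $base/ucode.img
initrd $base/coreos/$ver/$img
EOF
}
```

For example, `gen_ipxe http://matchbox/assets 1520.8.0 "$CONFIG_URL"` reproduces the layout of the script above, with `$CONFIG_URL` standing in for the real Ignition URL.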

Verify that it works by checking dmesg | grep microcode. In my case, the very first line is:

[0.000000] microcode: microcode updated early to revision 0xb000021, date = 2017-03-01
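A complementary check, sketched here under the assumption of an x86 system whose /proc/cpuinfo carries a per-CPU `microcode` field, is to list the distinct revisions the CPUs currently report. The helper name is hypothetical.

```shell
#!/bin/sh
# Sketch only: ucode_revisions is a hypothetical helper name.
# Reads /proc/cpuinfo-style text on stdin and prints each distinct
# microcode revision once.
ucode_revisions() {
    awk -F: '$1 ~ /^microcode/ { sub(/^ +/, "", $2); print $2 }' | sort -u
}
```

Running `ucode_revisions < /proc/cpuinfo` should print a single revision if every core received the same update; it can then be compared with the revision in the dmesg line.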

hope it helps

mark-kubacki commented 6 years ago

FYI, it's not always desirable to run the latest microcode. For example, in early microcode the turbo-mode envelope had not been set for E5 v3 Xeons (primarily for models intended for OEMs, such as 2676, 2683, 2696…). At the expense of manageable bugs, servers can be run with better throughput.

redbaron commented 6 years ago

Here is how to make it work when installing with the coreos-install script:

#!/bin/bash -ex
curl --fail --retry 10 "http://matchbox/ignition?{{.request.raw_query}}&os=installed" -o ignition.json
coreos-install ....
udevadm settle

# configuring grub to load microcode initrd
mount /dev/disk/by-partlabel/OEM /mnt
curl --fail --retry 10 "http://matchbox/assets/ucode.img" -o /mnt/ucode.img
cat >>/mnt/grub.cfg <<"EOF"
set default="coreos-microcode"
set saved_oem="$oem"  #<<<< PITA
menuentry "CoreOS default with CPU microcode update" --id=coreos-microcode {
  gptprio
  linux$suf $gptprio_kernel $gptprio_cmdline initrd=ucode.img $linux_cmdline
  initrd$suf ($saved_oem)/ucode.img
}
EOF
umount /mnt

systemctl reboot

bgilbert commented 6 years ago

We'll need this for Spectre mitigation on machines without updated firmware.

megastallman commented 6 years ago

Hi everyone! There is much discussion about microcode bundling, but at the moment CoreOS has no microcode bundled at all. That is why I had to disable hyperthreading on our Dell Poweredge-430 servers, which had been rebooting once every couple of days, leaving our cluster unusable. The BIOS and iDRAC firmware are the latest available. CentOS 7, which includes the "intel microcode" package, has proven to work on these machines. I've checked my Kubuntu laptop and it also has this package installed, but CoreOS, being basically a Gentoo build, doesn't. There is already an ebuild for Gentoo, which I always included before I moved back to Kubuntu. So you could just do it this way: https://wiki.gentoo.org/wiki/Intel_microcode

The possible reason for that behavior is described here: https://lists.debian.org/debian-devel/2017/06/msg00308.html https://www.opennet.ru/opennews/art.shtml?num=46762 (actually the same here)

Thanks in advance!

ajeddeloh commented 6 years ago

We're currently working on adding microcode support and expect to ship it in the next alpha.

redbaron commented 6 years ago

@megastallman, you can add microcode support yourself with the instructions above.

megastallman commented 6 years ago

@ajeddeloh @redbaron Thanks!

ajeddeloh commented 6 years ago

Closed via https://github.com/coreos/coreos-overlay/pull/3010 and https://github.com/coreos/portage-stable/pull/634

We're not shipping the most recent Intel microcode due to reports of it causing instability. Once that is resolved, we will ship the most recent version.

lorenz commented 6 years ago

Intel have released stable microcode for Spectre mitigation (at least down to Sandy Bridge). The "newer version available" message is wrong; the linked download is the latest release.

dm0- commented 6 years ago

Thanks, this will be in the next alpha (planned for release on Monday).

mark-kubacki commented 6 years ago

FYI, grub 2.04 is expected to support early-initrd image loading, specifically for the use-case of microcode updates.

http://git.savannah.gnu.org/cgit/grub.git/commit/?id=a698240df0c43278b2d1d7259c8e7a6926c63112
via: https://bugs.gentoo.org/645088

Could be handy to roll out your grub binary with this.

bgilbert commented 6 years ago

@wmark Container Linux uses neither grub-mkconfig nor a separate initrd (ours is actually compiled into the kernel), so that patch isn't relevant to us. Thanks for the pointer, though!

lorenz commented 6 years ago

@dm0- One more thing, if you're intending to release 1688.x as stable on Monday too: it currently ships 0x23 for Haswell, which Intel later pulled for causing issues. So the current beta is still shipping buggy microcode (at least for some processors) and should probably not be released to stable.

dm0- commented 6 years ago

@lorenz We've backported the latest microcode to the beta branch so that it will be promoted to the next stable. Releases for all channels are scheduled to go out on Tuesday, so all channels should have it then.