QubesOS / qubes-issues

The Qubes OS Project issue tracker
https://www.qubes-os.org/doc/issue-tracking/

Support for EFI boot #794

Closed marmarek closed 8 years ago

marmarek commented 9 years ago

Reported by marmarek on 4 Feb 2014 02:53 UTC EFI is becoming more and more popular; soon there will be machines with EFI-only boot, so we badly need support for it. Adding such support may be non-trivial, because Xen 4.1 does not support direct EFI boot (AFAIR it is supported in >=4.2 or even >=4.3). But perhaps some workaround like grub-efi can be used.

Migrated-From: https://wiki.qubes-os.org/ticket/794

marmarek commented 9 years ago

Modified by joanna on 14 Mar 2014 12:51 UTC

marmarek commented 9 years ago

Modified by joanna on 20 Apr 2014 17:07 UTC

marmarek commented 9 years ago

Comment by anonymous on 22 Oct 2014 19:50 UTC Please mention this problem on the system requirements page. As of now, having as your only system a laptop that meets those requirements but lacks a UEFI legacy mode (like the ASUS S300C) may lead to a long trial-and-error quest without any possible resolution.

marmarek commented 9 years ago

Comment by anonymous on 3 Nov 2014 13:56 UTC Important as most of the new laptops and desktops use EFI

marmarek commented 9 years ago

There is progress here, some preliminary code here: https://github.com/marmarek/qubes-installer-qubes-os/tree/efi

Currently the problem is with the ISO9660 boot format ("El Torito"). It supports boot images up to 32MB; ours is 56MB (which includes 40MB of initrd.img).

How it works

  1. ISO9660 with the El Torito extension contains a "boot catalog" which points at the images used to boot. Our image contains three entries here:
    1. for BIOS (isolinux/isolinux.bin),
    2. for UEFI (images/efiboot.img, which is a FAT32 image of the EFI System Partition),
    3. for Macs (images/macboot.img, similar to efiboot.img but with some Mac-specific modifications).
  2. Additionally, our image has a partition table added by isohybrid, so that the same image can be used on USB sticks. This partition table is prepared in such a way that each partition points at the appropriate boot image (with the exception of BIOS, which just uses syslinux-provided code for the MBR). Details on osdev wiki and syslinux wiki.
  3. The EFI boot image contains shim.efi as BOOTX64.efi (not really useful until we get SecureBoot working), which loads grubx64.efi. It presents the user with a simple menu ("install" and "check media & install"), then loads xen.efi or xen-check.efi, which are just copies of the same file.
  4. xen.efi (or xen-check.efi) reads its configuration file (xen.cfg or xen-check.cfg), which contains the names of the kernel and initrd to load. It then loads those images using EFI services.
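
Step 4 can be illustrated with a small sketch. It assumes the INI-like layout described in Xen's EFI documentation (a `[global]` section with `default=`, and per-entry `kernel=`, `ramdisk=` and `options=` keys). The file contents below are made up for illustration and are not the actual xen.cfg shipped on the image.

```python
import configparser

# Hypothetical xen.cfg contents; the real file on the image may differ.
SAMPLE_XEN_CFG = """\
[global]
default=qubes

[qubes]
options=console=none
kernel=vmlinuz inst.stage2=hd:LABEL=EXAMPLE
ramdisk=initrd.img
"""


def images_to_load(cfg_text, section=None):
    """Return (kernel file, kernel arguments, ramdisk file) for one entry.

    If no section is given, the one named by default= in [global] is used.
    """
    parser = configparser.ConfigParser(interpolation=None, delimiters=("=",))
    parser.read_string(cfg_text)
    if section is None:
        section = parser["global"]["default"]
    kernel_line = parser[section]["kernel"]
    # The first word is the file name, the rest is the kernel command line.
    kernel_file, _, kernel_args = kernel_line.partition(" ")
    ramdisk_file = parser[section].get("ramdisk")
    return kernel_file, kernel_args, ramdisk_file


print(images_to_load(SAMPLE_XEN_CFG))
```

The point of the sketch is only that the loader learns file names from the config and then has to open those files itself, which is why they must be reachable through the EFI file services.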

The problem

The "El Torito" boot catalog structure has a 16-bit field for the image size (expressed in 512-byte sectors). This means the maximum image size is 32MB. If a bigger image is used, the field is silently truncated and only part of the image is loaded by the UEFI firmware. I found that out by looking at a hex dump of the boot catalog... isohybrid then also uses that truncated size for the partition size, so even when this structure isn't used directly, the image is still truncated.

Fedora images don't have this problem, because the kernel and initrd are loaded by grub, so they do not need to be in the EFI partition (grub has an ISO9660 filesystem driver). But xen.efi cannot be loaded as a multiboot binary yet, and the only way to provide the kernel and initrd is a config file with their names, which are then loaded using EFI services - so the images must live on the EFI partition.
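
The arithmetic behind the 32MB limit can be checked with a short sketch. It assumes that "MB" in this thread means MiB and that "silently truncated" means only the low 16 bits of the sector count are kept; neither assumption is confirmed by the thread.

```python
SECTOR_SIZE = 512   # El Torito counts the image size in 512-byte sectors
FIELD_BITS = 16     # width of the sector-count field
MIB = 1024 * 1024


def el_torito_visible_size(image_bytes):
    """Return the size in bytes that survives a 16-bit sector count."""
    sectors = -(-image_bytes // SECTOR_SIZE)        # round up to whole sectors
    truncated = sectors & ((1 << FIELD_BITS) - 1)   # keep only the low 16 bits
    return truncated * SECTOR_SIZE


max_size = ((1 << FIELD_BITS) - 1) * SECTOR_SIZE
print(max_size)                                   # 33553920, just under 32 MiB
print(el_torito_visible_size(56 * MIB) // MIB)    # 24
print(el_torito_visible_size(31 * MIB) // MIB)    # 31
```

Under these assumptions a 56 MiB efiboot.img would appear to the firmware as a 24 MiB image, while a 31 MiB one fits unchanged.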

Possible solutions

The most universal solution would be a smaller efiboot.img. I'm currently investigating this option, namely what we can rip out of initrd.img. Once this is done, both the DVD and USB versions would be bootable under EFI.

But if we fail at that, we can still have an EFI-compatible USB version. We simply need to fix the partition size to match the efiboot.img size. But this way, an image written to DVD would not work under EFI (even worse - it would look like an EFI-compatible one, but would crash during startup).
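
To see where that partition size lives, here is a minimal sketch that reads the four primary entries of a classic MBR partition table (the kind of table isohybrid adds). The example sector is synthetic and its values are hypothetical; the sketch does not reproduce what isohybrid itself writes.

```python
import struct

SECTOR_SIZE = 512
TABLE_OFFSET = 446            # the four primary entries start here in sector 0
ENTRY_FORMAT = "<B3sB3sII"    # status, CHS start, type, CHS end, LBA start, sector count
ENTRY_SIZE = struct.calcsize(ENTRY_FORMAT)  # 16 bytes


def read_mbr_partitions(sector0):
    """Parse the used primary partition entries from the first 512 bytes."""
    if len(sector0) < SECTOR_SIZE or sector0[510:512] != b"\x55\xaa":
        raise ValueError("no MBR boot signature found")
    partitions = []
    for index in range(4):
        offset = TABLE_OFFSET + index * ENTRY_SIZE
        _status, _chs_a, ptype, _chs_b, lba_start, count = struct.unpack_from(
            ENTRY_FORMAT, sector0, offset)
        if ptype == 0:
            continue  # unused slot
        partitions.append({
            "index": index + 1,
            "type": ptype,
            "start_lba": lba_start,
            "size_bytes": count * SECTOR_SIZE,
        })
    return partitions


def build_example_sector0():
    """Build a made-up sector 0 with one EFI System partition (type 0xEF)."""
    sector = bytearray(SECTOR_SIZE)
    # Hypothetical values: start at LBA 1000, 63488 sectors (31 MiB).
    struct.pack_into(ENTRY_FORMAT, sector, TABLE_OFFSET,
                     0x00, b"\0\0\0", 0xEF, b"\0\0\0", 1000, 63488)
    sector[510:512] = b"\x55\xaa"
    return bytes(sector)


for part in read_mbr_partitions(build_example_sector0()):
    print(part)
```

For a real image, the first 512 bytes can be passed in instead, for example `open(path, "rb").read(512)`, and the reported size compared with the size of efiboot.img.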

Further steps/problems

  1. When provided with a correct partition table, Xen loads correctly, but the Linux kernel hangs (the last message was something about EFI Variables Facility). I haven't investigated this; maybe the kernel image was still invalid, or some kernel configuration option is missing.
  2. The same problem applies to the Live edition, with the same possible solutions.

marmarek commented 8 years ago

Some progress has been made:

  1. I've managed to reduce efiboot.img to 31MB. The initrd is really limited there (no network support, dropped all SCSI drivers, dropped many unusual filesystem drivers etc).
  2. Found that apparently when a USB stick contains ISO9660, the UEFI firmware uses that information instead of the MBR or GPT partition tables there (both present). And then the "Simple Filesystem Protocol Interface" doesn't work. Breaking the El Torito Volume Descriptor "fixes" that problem (but probably makes the image unbootable when used on a real DVD). Maybe it's a bug in just my BIOS...
  3. The kernel still hangs at efivars initialization. But this time I'm pretty sure it's my buggy hardware, because I've seen exactly the same kernel working on another platform.

Daerdemandt commented 8 years ago

Maybe it's a bug in just my BIOS...

this time I'm pretty sure it's my buggy hardware

Would it be of any assistance if other people tried exactly the same image on their hardware to see what changes?

marmarek commented 8 years ago

I think so :) Here is the link: http://ftp.qubes-os.org/~marmarek/Qubes-20150901-x86_64-DVD.iso
sha256sum: e166d9537d720bc2d7d60ff33d208bb84c2fd0ea71aa13551511ffb4e5ec236f

The image should be treated as untrusted!

Test instructions 1:

  1. Write the image to some USB device using dd
  2. Boot the image in UEFI mode (it contains both BIOS and UEFI boot code)
  3. You should get GRUB menu with just two options, on black background
  4. Switch to GRUB cmdline (c), enter chainload / and press TAB twice

Possible results:

Tell me which one you see and continue to "Test instructions 2".

Test instructions 2 (continue from the point above):

  1. Return to the menu (ESC), try to launch setup (the option without checking the media).
  2. If that doesn't start and returns to the menu, go to "Test instructions 3"; otherwise wait for the system to load

Possible results:

If you get here, you can try to install the system somewhere (do not overwrite your primary system!). I haven't got that far; installation will probably fail. In that case, note the exact failure.

Test instructions 3:

  1. Break the ISO9660 boot descriptor (make backup of the image first):

    echo -en '\x03' | dd of=Qubes-20150901-x86_64-DVD.iso seek=16 bs=2048 conv=notrunc
  2. Write the image to some USB device using dd
  3. Continue "Test instructions 1" starting from point 2.
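
What the dd command in step 1 changes can be shown on a synthetic image. ISO9660 volume descriptors start at sector 16 (2048-byte sectors), and each begins with a type byte followed by the identifier `CD001`. The sketch assumes the usual layout in which the first descriptor is the primary volume descriptor; the image below is built in memory and is not the Qubes ISO.

```python
ISO_SECTOR = 2048
FIRST_DESCRIPTOR_SECTOR = 16
DESCRIPTOR_NAMES = {
    0: "boot record (El Torito)",
    1: "primary volume",
    2: "supplementary volume",
    3: "volume partition",
    255: "set terminator",
}


def list_volume_descriptors(image):
    """Return (sector, type byte, name) for each volume descriptor found."""
    found = []
    sector = FIRST_DESCRIPTOR_SECTOR
    while True:
        block = image[sector * ISO_SECTOR:(sector + 1) * ISO_SECTOR]
        if len(block) < 7 or block[1:6] != b"CD001":
            break  # ran off the end of the descriptor set
        vtype = block[0]
        found.append((sector, vtype, DESCRIPTOR_NAMES.get(vtype, "unknown")))
        if vtype == 255:
            break
        sector += 1
    return found


def patch_first_descriptor(image):
    """Same effect as the dd command: write 0x03 at byte offset 16 * 2048."""
    patched = bytearray(image)
    patched[FIRST_DESCRIPTOR_SECTOR * ISO_SECTOR] = 0x03
    return bytes(patched)


def make_descriptor(vtype):
    """Build an otherwise empty descriptor of the given type (illustration only)."""
    block = bytearray(ISO_SECTOR)
    block[0] = vtype
    block[1:6] = b"CD001"
    block[6] = 1  # descriptor version
    return bytes(block)


# Synthetic image: 16 empty sectors, then three descriptors.
image = (bytes(FIRST_DESCRIPTOR_SECTOR * ISO_SECTOR)
         + make_descriptor(1) + make_descriptor(0) + make_descriptor(255))

print([vtype for _, vtype, _ in list_volume_descriptors(image)])
print([vtype for _, vtype, _ in list_volume_descriptors(patch_first_descriptor(image))])
```

To inspect a real image without modifying it, reading the first 64 KiB is enough for a typical descriptor set, for example `list_volume_descriptors(open(path, "rb").read(64 * 1024))`.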

Test instructions 4:

Perform test instructions 1-2 using DVD instead of USB stick.

Daerdemandt commented 8 years ago

The image should be treated as untrusted!

Untrusted as in 'never boot from untrusted device' or is it just poor wording?

marmarek commented 8 years ago

On Mon, Sep 14, 2015 at 04:25:29PM -0700, Daerdemandt wrote:

The image should be treated as untrusted!

Untrusted as in 'never boot from untrusted device' or is it just poor wording?

Untrusted as it may steal your data, infect everything it touches and kill your cat. Build environment used to produce that image is far from clean and may be compromised. So at least I would disconnect disks with valuable data first.

Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?

marmarek commented 8 years ago

Regarding that ISO9660/grub/xen.efi problem with loading vmlinuz and initrd - using rEFInd instead of grub seems to solve the problem. No need to have separate images for DVD and USB :) Pity that Fedora has no package for it, so we need to maintain one in our repo...

marmarek commented 8 years ago

Regarding the hang at efivars initialization, this also happens on a ThinkPad T430. Relevant xen-devel thread:

This is a firmware bug.

+1 (and I'm surprised how common this is)

The bug is present in the reference implementation code, which means it is present in a lot of real firmware. We have kit from 3 different vendors which are affected, including latest available firmware.

There are also patches attached. I'll check whether they are available in any stable Xen version, or whether a backport is feasible.

marmarek commented 8 years ago

Regarding rEFInd/Grub - finally managed to get that working with Grub. When 'root' variable points at EFI partition on the image (not the whole device), it simply works.

New images, this time built in more trustworthy environment:
http://ftp.qubes-os.org/~marmarek/Qubes-20150930-x86_64-DVD.iso
http://ftp.qubes-os.org/~marmarek/Qubes-20150930-x86_64-DVD.iso.asc
SHA256: 77845d9265579c30947641500f4a4b587fe77b3cdc58f79171cc2275c2000d3d

And bonus - EFI-enabled LIVE image:
http://ftp.qubes-os.org/~marmarek/Qubes-20150930-x86_64-LIVE.iso
http://ftp.qubes-os.org/~marmarek/Qubes-20150930-x86_64-LIVE.iso.asc
SHA256: d73417c1911bf2ab904f287f2a1be1dc65d59f085d341b2745449bf5d00b0dab

Any feedback welcome :)
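
For testers who want to compare a download against the SHA256 values above, a minimal sketch follows. The function names are illustrative. A matching hash only shows that the download is intact; it does not replace checking the detached `.asc` signature.

```python
import hashlib


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so a multi-gigabyte ISO need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path, published_hex):
    """Compare the file's digest with a published hex string, ignoring case."""
    return sha256_of_file(path) == published_hex.strip().lower()
```

Usage would look like `matches_published("Qubes-20150930-x86_64-DVD.iso", "77845d92...")` with the full value copied from the post.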

rootkovska commented 8 years ago

Problems with the latest EFI Live image:

  1. Plymouth runs in text mode only on the machine I tested this
  2. Default appmenus don't have Firefox (but they do have "Help", heh ;)
  3. Networking doesn't work on the machine I tested (and it worked with the previous iteration), the reason is: no firmware for iwlwifi-6000

So, I would say a few regressions from the previous (August) image.

marmarek commented 8 years ago

On Sat, Oct 03, 2015 at 01:18:42AM -0700, Joanna Rutkowska wrote:

Problems with the latest EFI Live image:

  1. Plymouth runs in text mode only on the machine I tested this

Also in legacy mode? In EFI mode initramfs is really limited (32MB limit...), so it may be hard to include plymouth theme (I'll try).

  1. Default appmenus don't have Firefox (but they do have "Help", heh ;)
  2. Networking doesn't work on the machine I tested (and it worked with the previous iteration), the reason is: no firmware for iwlwifi-6000

Strange, template is unchanged... I'll look into it.

rootkovska commented 8 years ago

On Sat, Oct 03, 2015 at 03:06:22AM -0700, Marek Marczykowski-Górecki wrote:

On Sat, Oct 03, 2015 at 01:18:42AM -0700, Joanna Rutkowska wrote:

Problems with the latest EFI Live image:

  1. Plymouth runs in text mode only on the machine I tested this

Also in legacy mode? In EFI mode initramfs is really limited (32MB limit...), so it may be hard to include plymouth theme (I'll try).

Yes, both in legacy and uefi (on uefi I tried only 3.1 actually, and the live edition I tried in legacy only).

marmarek commented 8 years ago

On Sat, Oct 03, 2015 at 12:06:12PM +0200, Marek Marczykowski-Górecki wrote:

On Sat, Oct 03, 2015 at 01:18:42AM -0700, Joanna Rutkowska wrote:

  1. Default appmenus don't have Firefox (but they do have "Help", heh ;)
  2. Networking doesn't work on the machine I tested (and it worked with the previous iteration), the reason is: no firmware for iwlwifi-6000

Strange, template is unchanged... I'll look into it.

Ah, I see. It happens that the default template is the Debian one. There is indeed no firmware there (since that is a "non-free" package). And Firefox is named "iceweasel", which is the reason why it isn't in the menu.

dizmoduck commented 8 years ago

On an Asus UX31A I can now boot, but Qubes OS stops just before getting to GUI mode.

Is there a newer image than 20150930 that I can test?

marmarek commented 8 years ago

Did you try the installation image or the live one?

dizmoduck commented 8 years ago

Sorry for the low level of info. I'm trying to figure out how to copy the error log file to the USB stick, but couldn't find out how to mount it from /dev/.

I'm trying the live one first, before I install, to test how it works:

http://ftp.qubes-os.org/~marmarek/Qubes-20150930-x86_64-LIVE.iso

marmarek commented 8 years ago

I try to figure out how to copy the error log file to the usb stick but couldn’t find out how to mount from the /dev/

The live image has ISO9660, so you can't use the same stick to save data there. But if you plug in another one, it should be visible in the system.

For the live image, there is slightly newer one - 20151003, but I don't think it will make a difference in X drivers.

dizmoduck commented 8 years ago

I can't attach the file to this comment. Where can I send the rdsosreport.txt file? GitHub says: "Attaching documents requires write permission to this repository. Try again with a PNG, GIF, or JPG."

marmarek commented 8 years ago

Try gist.github.com.

dizmoduck commented 8 years ago

Thanks, I'm a newbie on GitHub. Here is the rdsosreport.txt: https://gist.github.com/dizmoduck/68bd69cd5e00d46f53b9/archive/4676e5ed222f645ff89f64a6559e878de5dffc18.zip

kramse commented 8 years ago

I have almost the same laptop, an Asus Zenbook UX32A. If it would help, I could ship it and lend it out for some months.

gnustomp commented 8 years ago

Using the 20151003 live image, UEFI boot fails on my ThinkPad T450s at "EFI Variables Facility", but succeeds on an ASUS P8Z77-V LX motherboard and an ASUS Zenbook UX32VD.

However, I can't install because the installer reports that it needs around 200GB of disk space. I'll try with the DVD and report back. The DVD image is able to install normally.

dizmoduck commented 8 years ago

Is there any newer ISO we can test than http://ftp.qubes-os.org/~marmarek/Qubes-20151003-x86_64-LIVE.iso ?

dizmoduck commented 8 years ago

Just tested the 20151003 image and it didn't work either; same error as with the 20150930 image.

ManoftheSea commented 8 years ago

I have tested with a Dell Latitude E6220, with an Intel Pro 2500 SSD. With the 20150930 image: Xen loads fine, and I even get a console from dom0 Linux with 4 penguins (2 cores * HT = 4 images). And it's frozen from there. Is there anything additional I can do to help without deep technical knowledge of Qubes? My normal boot on this system uses GRUB as the UEFI app which loads Linux (normally) or Xen and Linux (have seen it work).

marmarek commented 8 years ago

Do you see normal kernel messages there? I guess the last one is about EFI variables - if so, it is an EFI firmware bug, known on other systems as well. There is a workaround in Xen for that (which sometimes works, but on my Latitude E6420 it doesn't). You can enable it by editing the default Grub entry and adding the following options to the chainloader line: /mapbs /noexit (you can try either of them, or both).

ManoftheSea commented 8 years ago

Negative, absolutely no text output at all, the system is just frozen after the four penguins.

Is there a way to get Qubes installed without going through the installer? I've done debootstrap installs before, is it just a matter of pointing at the correct repositories and picking the right packages?

marmarek commented 8 years ago

Negative, absolutely no text output at all, the system is just frozen after the four penguins.

IMHO it is still worth trying that workaround. If it still doesn't work, the easiest way would be switching to legacy mode...

Is there a way to get Qubes installed without going through the installer? I've done debootstrap installs before, is it just a matter of pointing at the correct repositories and picking the right packages?

I guess yes, but it will be non-trivial and isn't supported in any way. Anyway, there is a tool called febootstrap (I haven't tried it), and you could use the installation image as the package source (baseurl=file:///mnt/...).

fabian-z commented 8 years ago

Trying to install Qubes with the latest 3.1-rc1 ISO on a USB stick, I have problems booting via UEFI instead of legacy BIOS boot. The legacy boot works fine. If I try to boot via UEFI, I get the error message

\EndEntire
file path: /ACPI(a0341d0,0)/PCI(0,1d)/USB(1,0)/USB(3,0)/HD(1,3e4,fde4,2a6be34c00000000,0,0)/File(\EFI\BOOT)/File(xen.efi)/EndEntire
Xen 4.6.0 (c/s ) EFI loader
Unsupported device path component

directly after pressing Enter in the Grub boot screen. After a brief moment it returns to the Grub selection again. In fact this message is gone so quickly, I had to screencap it out of a recorded video. As a workaround, I succeeded in booting xen.efi directly using an UEFI v1 shell (assuming fs0 is the Qubes image):

Shell> fs0:
FS0:\> cd efi
FS0:\EFI\> cd boot
FS0:\EFI\BOOT\> xen.efi placeholder qubes-check

The installation is running, but I can't yet comment on whether the resulting system is bootable out of the box.

Update: Anaconda seems to set up everything EFI-related correctly for me. The resulting system is properly bootable without any changes.

marmarek commented 8 years ago

Shell> fs0:
FS0:\> cd efi
FS0:\efi\> cd boot
FS0:\efi\boot\> xen.efi placeholder qubes-check

That's strange - it should be exactly the same as grub entry...

ideologysec commented 8 years ago

I also had the same experience as fabian-z. I have a MacBook Pro 11,5 (15" retinaMBP from 2015), and used dd to copy the Qubes 3.1 image to a flash drive. When booting, I get the exact same error message, except this being a Mac, I can't drop to an EFI shell to continue booting, and there's no legacy BIOS emulation option to fall back on (unlike older Macs).

I'll try installing rEFInd and manually booting the Qubes image and report back.

Is this an issue with UEFI 1? (Since Apple's EFI was forked back when it was still v1, and fabian mentioned a UEFI v1 shell..)

ideologysec commented 8 years ago

Installed rEFInd, which found no less than four bootable items on the created flash drive installer. I chose "Boot EFI\BOOT\xen.efi from Anaconda" and the installer loaded properly and allowed me to set a destination volume. I installed to a 32gb USB flash drive, which went off without a hitch.

I'm unable to boot from that installed drive, no matter which of the five options I select (fallback, xen, fedora efi, gdc I think, and vmlinuz) - each option brings me only to a grub command line, except for vmlinuz, which hangs.

That seems to be a problem for a different day (and likely related to the Mac's EFI, not the Qubes installer), though if there is a place I should be looking for answers, please let me know. I'm also happy to test further and report back as much as I can.

marmarek commented 8 years ago

Both reports - from @fabian-z and @Aktariel - suggest that grub2-efi is messing something up when launching an EFI application (the chainloader command). Not sure why only sometimes, but it probably depends on the EFI implementation (device path or so). Since rEFInd seems not to have this problem, I will try that approach. It is already mostly implemented (from before I found out how to deal with Grub2): https://github.com/marmarek/qubes-installer-qubes-os/commit/f6828ff16a784f161c8a5e38b3b32c417053ae79

infinitesnow commented 8 years ago

Hello, I have an older MacBook Pro (Late 2010) and I can confirm the problem. I'm trying to run the 3.1rc1 install DVD from a pen drive I created with dd. Also, using rEFInd, I'm unable to start either legacy mode (it complains "No Bootable device found, insert boot disk and press any key") or EFI mode (xen.efi from Anaconda) with the "placeholder" option, which returns "No dom0 kernel specified". The pen drive is detected by the native Apple boot manager too. If I try to run Qubes from the shell or from the menu entry, both ways give me the error stated above:

\EndEntire file path: /ACPI(a0341d0,0)/PCI(0,1d)/USB(1,0)/USB(3,0)/HD(1,3e4,fde4,2a6be34c00000000,0,0)/File(\EFI\BOOT)/File(xen.efi)/EndEntire

(device names are probably different, but I don't get the last error line, it just doesn't boot).

I'm available for further tests.

blobbelen commented 8 years ago

Negative, absolutely no text output at all, the system is just frozen after the four penguins.

Did you solve that problem, ManoftheSea? I seem to have the same issue.

The kernel just shows the penguins. When I try UEFI mode I get the message "Ignoring BGRT: invalid status 0 (expected 1)", which seems to be a harmless warning. And when I try legacy boot mode there are just the penguins and nothing happens. I don't know much about all the new UEFI boot stuff, but I suspect we are already past booting and my problem comes from something completely different and unrelated to the original thread. But since ManoftheSea had similar issues, I hope it is okay that I post here first.

infinitesnow commented 8 years ago

I don't know if it is related to this issue, but I have found that there seems to be a problem with the GPT partition - for example, the name of the partition is a string of Chinese characters, and distros that run correctly on my Mac (e.g. Debian netinstall) don't show weird behavior like this. The exact string is mentioned here and here. On the syslinux mailing list, they mention a truncation problem which looks similar to the one mentioned above.

I'm trying to be helpful but I'm not really able to do much. Let me know if I can help.

marmarek commented 8 years ago

On Sun, Dec 20, 2015 at 05:33:17AM -0800, infinitesnow wrote:

I don't know if it is related to this issue, but I have found out that there seems to be a problem with the GPT partition - for example, the name of the partition is a string of chinese characters, and distros that run correctly with my Mac (e.g. Debian netinstall) don't show weird behaviors like this. The exact string is mentioned here and here. In the syslinux mailing list, they mention a truncation problem which looks similar to the one mentioned above.

For this specific reason we keep that partition smaller than 32MB. The Chinese characters in the partition name are, AFAIR, a harmless side effect of isohybrid.

(xen.efi from Anaconda) with the "placeholder" option, which returns "No dom0 kernel specified".

Try booting it without any parameter. That "placeholder" is workaround for grub2 behaviour.

vantonio commented 8 years ago

I'm having the exact same problem as @ManoftheSea did, on a Gateway NV47H which used to dual boot Windows and grub in EFI mode. Booted from a USB disk in text mode and it froze at the screen with four penguins. Is there any way to work around this, or should I wait for the next RC?

marmarek commented 8 years ago

On Sun, Dec 20, 2015 at 08:43:41PM -0800, vantonio wrote:

I'm having the exact same problem as @ManoftheSea did on a Gateway NV47H, which used to dual boot Windows and grub in EFI mode. Booted from USB disk in text mode and frozen in the screen with four penguins. Anyway to workaround this or should I wait for next RC?

Try adding "/mapbs" parameter to xen.efi (you can edit entry in grub menu).

vantonio commented 8 years ago

@marmarek thanks for the tip, unfortunately adding /mapbs at the end of the chainloader line did not help. Any other ways to try or how could I be of help to debug this?

marmarek commented 8 years ago

Try also "/noexit" (alone or together with "/mapbs").

infinitesnow commented 8 years ago

(xen.efi from Anaconda) with the "placeholder" option, which returns "No dom0 kernel specified".

Try booting it without any parameter. That "placeholder" is workaround for grub2 behaviour.

This returns "Warning: could not query variable store: 0x800..0019". Then it displays a messed-up screen with what looks like the Linux logo, scrambled across the whole width of the screen, seems to display a further line (not readable), then hangs.

vantonio commented 8 years ago

Try also "/noexit" (alone or together with "/mapbs").

Still no luck, just a frozen screen.

ideologysec commented 8 years ago

So, I've taken the plunge, made a backup, and installed Qubes R3.1, in the process reclaiming all space on the internal drive (deleting all internal partitions). The install went off without a hitch. I had to use an external USB to boot rEFInd to chain to Qubes, but it worked.

Ran into errors booting after the fact, though - dracut is complaining "FATAL: No or empty root argument", and no matter which option I select in rEFInd, it either gives me the same issue or kicks me over to a grub command prompt. I've attached shots of the errors. The first one is attempting to boot vmlinuz-3.19.8, and the second is attempting to boot vmlinuz-4.1.13.6-pvops. (I'm still having to use rEFInd to get boot options for Qubes - using the Apple Boot Chooser just gives me an "EFI Boot" option that sends me straight to a grub shell).

img_1249 img_1252

img_1247 img_1256

I also looked at the partition table in a Fedora live install, and have attached that image as well - looks like Qubes preserved the 200MB HFS+ EFI partition at the root of the drive.

img_1257

Strangely also (and perhaps I'm looking in the wrong place), the /boot folder on the LUKS Qubes volume is completely empty.

I realize I'm a bit off the reservation here since Macs are not very well supported, but I'd really like to get this booting. Should I post to the Google Group?

marmarek commented 8 years ago

You should boot xen.efi from rEFInd, not vmlinuz. Not sure if it is listed automatically.

noseshimself commented 8 years ago

I tried UEFI booting on a Lenovo W540 and hit the same problems: 7 or 8 fat birds, but after that I'm getting the same results as @vantonio. The W540 is obviously even worse, as I can't get 3.1 RC1 installed via BIOS boot anymore.