Dasharo / dasharo-issues

The Dasharo issue tracker
https://dasharo.com/

Add ESP partition scanning to look for grubx64.efi, shimx64.efi or Windows bootmgr #94

Closed miczyg1 closed 10 months ago

miczyg1 commented 2 years ago

The problem you're addressing (if any) Some Linux distros do not leave a BOOTX64.EFI or BOOTIA32.EFI file on the ESP, which causes the drive to go undetected by the UEFI payload. The UEFI specification defines that there must be a \EFI\BOOT\BOOTX64.EFI or \EFI\BOOT\BOOTIA32.EFI file present on the ESP partition. Otherwise the media is not detected as bootable by EDK2 UEFIPayload. Ubuntu, on the other hand, leaves a BOOTX64.EFI file, so the drive is always detected, and when the file is executed, a new boot menu entry is automatically added for Ubuntu. This is how they solved the problem.

Describe the solution you'd like Dasharo UEFIPayload could scan for grubx64.efi or shimx64.efi on the ESP and automatically add a boot entry, so that the system installed on a drive shows up in the boot menu list. The entry can be missing when the UEFI variables are wiped out or when migrating from MSI firmware to Dasharo. A similar solution can be applied to Windows Boot Manager.

Requirements:

  1. Firmware shall check the presence of the following files and create boot menu entries with names given after the dash - (<disk_name> is a human readable disk model which normally would be shown in boot menu when \EFI\BOOT\BOOTX64.EFI is present):
\EFI\Microsoft\Boot\bootmgfw.efi - Windows Boot Manager (on <disk_name>)
\EFI\Suse\elilo.efi - Suse Boot Manager (on <disk_name>)
\EFI\Ubuntu\grubx64.efi - Ubuntu (on <disk_name>)
\EFI\Ubuntu\shimx64.efi - Ubuntu (on <disk_name>)
\EFI\Redhat\elilo.efi - RedHat Boot Manager (on <disk_name>)
\EFI\Redhat\grubx64.efi - RedHat (on <disk_name>)
\EFI\Redhat\shimx64.efi - RedHat (on <disk_name>)
\EFI\Fedora\shimx64.efi - Fedora (on <disk_name>)
\EFI\Fedora\grubx64.efi - Fedora (on <disk_name>)
\EFI\Centos\shimx64.efi - CentOS (on <disk_name>)
\EFI\Centos\grubx64.efi - CentOS (on <disk_name>)
\EFI\opensuse\grubx64.efi - OpenSuse (on <disk_name>)
\EFI\debian\grubx64.efi - Debian (on <disk_name>)

The above files may need verification if they are correct. All OSes listed above shall be tested.

  2. The above files shall only be checked on non-removable media (the driver needs to check if the device is removable, e.g. USB has such checks). Optionally, a boot menu option can be added to DasharoModulePkg to control adding OS boot options from removable media on demand.
  3. The driver must prioritize the check for the shimx64.efi file.
  4. The driver cannot add duplicate entries, i.e. if a boot option has been created for the OS using the shimx64.efi file, the driver should skip the grubx64.efi check for the same OS in the same directory on the given disk.
  5. The boot options should not be persistent, because someone may remove a disk while the boot option remains, which may cause confusion. The boot options should be refreshable using the standard EfiBootManagerRefreshAllBootOption call and should have mBmAutoCreateBootOptionGuid in their BootOption->OptionalData field.
  6. SPECIAL CASE: detect DTS and populate a boot menu entry with Dasharo Tools Suite on <disk_name>, where the disk name would be the standard EDK2 boot option name created without the changes introduced by this feature. To be clarified how DTS can be detected.
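
The lookup order and de-duplication rules in the requirements above can be sketched in plain C, outside of EDK2. This is only an illustration under stated assumptions: the `Candidate` type, `select_boot_entries` and the `exists_in_list` stand-in are hypothetical names, the table holds just a subset of the listed files, and a real implementation would probe the ESP through the firmware's file protocols and compare paths case-insensitively.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* One candidate loader: directory on the ESP, file name, boot entry label. */
typedef struct {
  const char *dir;
  const char *file;
  const char *label;
} Candidate;

/* Subset of the list in the issue. Within a directory shimx64.efi is
 * listed first, so it is checked first and wins over grubx64.efi. */
static const Candidate kCandidates[] = {
  { "\\EFI\\Microsoft\\Boot", "bootmgfw.efi", "Windows Boot Manager" },
  { "\\EFI\\Ubuntu",          "shimx64.efi",  "Ubuntu" },
  { "\\EFI\\Ubuntu",          "grubx64.efi",  "Ubuntu" },
  { "\\EFI\\Fedora",          "shimx64.efi",  "Fedora" },
  { "\\EFI\\Fedora",          "grubx64.efi",  "Fedora" },
  { "\\EFI\\debian",          "grubx64.efi",  "Debian" },
};

#define CANDIDATE_COUNT (sizeof kCandidates / sizeof kCandidates[0])

/* Hypothetical probe: returns nonzero if dir\file exists on the ESP. */
typedef int (*FileExistsFn)(const char *dir, const char *file, void *ctx);

/* Stores up to max selected candidates in out[] and returns how many.
 * A directory that already produced an entry is not checked again, so
 * one OS on one disk never yields two entries. */
static size_t select_boot_entries(FileExistsFn exists, void *ctx,
                                  const Candidate **out, size_t max)
{
  size_t count = 0;

  assert(exists != NULL && out != NULL);
  for (size_t i = 0; i < CANDIDATE_COUNT && count < max; i++) {
    const Candidate *c = &kCandidates[i];
    int seen = 0;

    for (size_t j = 0; j < count; j++) {
      if (strcmp(out[j]->dir, c->dir) == 0) {
        seen = 1;
        break;
      }
    }
    if (!seen && exists(c->dir, c->file, ctx)) {
      out[count++] = c;
    }
  }
  return count;
}

/* Stand-in for a real file probe: ctx is a NULL-terminated array of
 * full paths that are considered present (case-sensitive match). */
static int exists_in_list(const char *dir, const char *file, void *ctx)
{
  const char **paths = ctx;
  char full[128];

  snprintf(full, sizeof full, "%s\\%s", dir, file);
  for (; *paths != NULL; paths++) {
    if (strcmp(*paths, full) == 0) {
      return 1;
    }
  }
  return 0;
}
```

Calling `select_boot_entries(exists_in_list, paths, out, max)` with a list that contains both Ubuntu loaders yields a single Ubuntu entry that points at shimx64.efi.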

One may simply extend the MdeModulePkg/Library/UefiBootManagerLib to perform the above tasks or create a new Library/driver.

Where is the value to a user, and who might that user be? Reduce the confusion of the users or the impression that something doesn't work as expected.

Describe alternatives you've considered None

Additional context None

pietrushnic commented 2 years ago

@miczyg1 this looks like a very nice enhancement which deserves a documentation entry and a release note bullet once it is implemented.

handmeatowel commented 2 years ago

Suggesting to add those workarounds for at least the major established distros (if required). I'd be happy to quickly test against some of them if needed.

Here's a list of major distros according to distrowatch.com:

miczyg1 commented 2 years ago

I have envisioned it in the following way:

Typically Linux distros create the following structure, e.g. Ubuntu: /boot/efi/boot/ubuntu/ where one may find grubx64.efi or shimx64.efi. /boot is typically only the mountpoint, while /efi/boot is the directory structure required by the UEFI spec. So each distro will create a different directory under /efi/boot. It will be ubuntu/fedora/etc. I would take this string if grubx64.efi or shimx64.efi is found under this directory and populate a boot menu entry with the name of this directory. But if it is not created by the distro, then the question is what should happen.
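
A minimal sketch of that naming idea in plain C (not EDK2 code): given the path of a loader that was found, take the name of the directory that contains it as the boot entry name. The function name `entry_name_from_path` is hypothetical, and the sketch assumes backslash-separated ESP paths such as those in the requirement list.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* If path ends in \grubx64.efi or \shimx64.efi, copies the name of the
 * containing directory into name and returns 1. Returns 0 for any other
 * file, for a loader in the root directory, or if name is too small. */
static int entry_name_from_path(const char *path, char *name, size_t name_size)
{
  const char *file;
  const char *dir;
  size_t len;

  assert(path != NULL && name != NULL);

  /* Last separator: everything after it is the file name. */
  file = strrchr(path, '\\');
  if (file == NULL || file == path) {
    return 0;
  }
  if (strcmp(file + 1, "grubx64.efi") != 0 &&
      strcmp(file + 1, "shimx64.efi") != 0) {
    return 0;
  }

  /* Walk back to the start of the parent directory component. */
  dir = file;
  while (dir > path && dir[-1] != '\\') {
    dir--;
  }

  len = (size_t)(file - dir);
  if (len == 0 || len >= name_size) {
    return 0;
  }
  memcpy(name, dir, len);
  name[len] = '\0';
  return 1;
}
```

With this approach the entry for \EFI\ubuntu\shimx64.efi would simply be named after the directory, so no per-distro table is needed, at the cost of showing whatever directory name the distro chose.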

handmeatowel commented 2 years ago

Not sure if this is planned for all Dasharo targets. If it isn't, the msi_ms7d25 label is missing here.

miczyg1 commented 2 years ago

This is a generic improvement for all Dasharo targets.

macpijan commented 2 years ago

@miczyg1 Can we confirm which OS triggered this problem? This was Debian, IIRC.

zirblazer commented 2 years ago

I had a few grandiose ideas about how to implement an overengineered Firmware-side Boot Manager, but never wrote them down in a way that makes sense :D For one, the greatest issues are:

1 - Whether you want a unified menu where you can have both BIOS and UEFI boot targets simultaneously. As far as I know, the payload is either SeaBIOS or TianoCore, so you just need one of those; TianoCore with SeaBIOS as CSM is possible but is quite an untouched code path.

2 - Whether you want to ignore standard UEFI Boot Entries procedures. For example, you may want to have an unmodifiable boot default and not want any OS-side tools like efibootmgr to modify the UEFI Boot Entries to make it boot something else, yet still want these modifications to reach UEFI NVRAM to be standard compliant, just not applied without user consent.

3 - Whether you want to support specific OSes or not. For example, for compatibility purposes, both Windows and Linux distributions seem to always hijack the \EFI\BOOT\BOOTX64.EFI file with a copy of their own, then point it to some other Boot Manager. I would say that if you are doing UEFI Boot and find an ESP, blindly loading \EFI\BOOT\BOOTX64.EFI is equivalent to BIOS-MBR boot, where you just blindly load the first sector of a disk and then let the Boot Manager/Boot Loader be handled entirely disk side. This would be my preferred, most generic way to do UEFI Boot.

4 - How to present multiple disks of the same model. Attached is a screenshot of MSI Firmware with a Crucial SSD, two identical Gigabyte SSDs, and two identical Kingston DataTraveler USB Flash Drives (screenshot: msi_bootmanager). Do we need the SATA/USB Port? Serial Number? How to know which is which? Also note in that photo that there is no generic \EFI\BOOT\BOOTX64.EFI entry, which would make me more confident than "Windows Boot Manager" or "UEFI OS".

5 - If you want a Firmware-side File Explorer and a proper way to make custom UEFI Boot Entries without needing to boot an OS to use efibootmgr. Some proprietary firmwares do expose means of manually adding or deleting Boot Entries and even Drivers: https://images_bios.pugetsystems.com/810640.jpg But that is a Supermicro server board; on consumer boards that doesn't seem that common.

miczyg1 commented 2 years ago

Ad. 1. Yes, we have just UEFI entries, no CSM yet, and I doubt we will need it. Plus Intel is ditching legacy boot.

Ad. 2. I don't understand this point at all.

Ad. 3. \EFI\BOOT\BOOTX64.EFI is the standard file to boot from removable media according to the UEFI spec. But the UEFI spec doesn't say what to do if the boot option created by the OS after installation disappears (variables got cleaned), or the disk is removed, or another disk is installed. This method is UEFI standard and will not be modified.

Ad. 4. That is a problem indeed. So how do you expect to solve it? A typical user will not distinguish which physical controller port the given disk is installed in. Nor will they distinguish the disks if we put the disk's serial number into the option name. I guess we will have to live with duplicate entries. But EDK2 has an advantage over the above screenshot. When you enter setup and then One Time Boot, the highlighted boot option is expanded in the help window on the right side, i.e. the full UEFI-compliant device path is printed, so one may see which SATA or PCIe port is used for the disk, but one still needs to know which physical SATA or PCIe port the option corresponds to. One idea would be to add some mapping of the UEFI-compliant device paths to human-readable names, i.e. if the path is PCI 1c.4 it means it is NVMe on PCIe Port #4, for example; in the case of SATA it could be SATA Port #1, for example.

Ad. 5. We have had that for a long time. We have driver and boot option menus in the Boot Maintenance Manager (accessible from the main setup page).
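
The mapping idea from Ad. 4 could be sketched as a small per-board lookup table. This is a plain C illustration, not EDK2 code: `PortName` and `port_name_for` are hypothetical names, the first table entry is the example from the comment, and the second entry is made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Maps a PCI device.function, as it appears in the device path, to a
 * name the user can match against the board. */
typedef struct {
  const char *pci_address;
  const char *port_name;
} PortName;

/* Board-specific table. Only the first entry comes from the discussion;
 * the second one is a hypothetical example. */
static const PortName kPortNames[] = {
  { "1c.4", "NVMe on PCIe Port #4" },
  { "17.0", "SATA Port #1" },
};

/* Returns the human-readable name for a PCI address, or NULL when the
 * address is not in the table and the raw device path must be shown. */
static const char *port_name_for(const char *pci_address)
{
  assert(pci_address != NULL);
  for (size_t i = 0; i < sizeof kPortNames / sizeof kPortNames[0]; i++) {
    if (strcmp(kPortNames[i].pci_address, pci_address) == 0) {
      return kPortNames[i].port_name;
    }
  }
  return NULL;
}
```

Such a table would have to be maintained per board, and it does not cover cases where several disks sit behind one PCI function.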

zirblazer commented 2 years ago

1 - If you're intending to do something modular that can be used by other Dasharo ports, the older platforms seem to favour BIOS, so thinking about whether you can make a unified list isn't a bad idea.

2 - The idea is that the Firmware shouldn't be 100% dependent on the standard way that the UEFI specification operates regarding the Boot Entries. The case I'm thinking of is if you want the Firmware to have a fixed way of booting manually set in-Firmware, yet allow OSes to use efibootmgr or similar to create/delete UEFI Boot Entries from their side, with these being reflected in Firmware since it acknowledges the changes and saves them to NVRAM instead of ignoring them (reporting a successful write but doing nothing), but without them having any effect unless configured to do so. Think of it as two different ways to boot: you follow the standard UEFI specification, or you ignore it and use an alternate, fixed boot arrangement. If using the latter, Firmware sticks to it even if there are added/removed/changed UEFI Boot Entries. Sort of like pre-UEFI, where you couldn't change the HD boot order from within the OS. I'm not fond of the OS being able to modify what is going to boot next, yet an NVRAM read-only mode may not be ideal because OS installers may panic when trying to create their own UEFI Boot Entries.

3 - What I mean is that I'm assuming that on any ESP with something installed, there will be a \EFI\BOOT\BOOTX64.EFI (which is the most generic, blind way to do UEFI Boot), but specific OSes also have their own Boot Loader, like Ubuntu installed in UEFI has \EFI\UBUNTU\SHIMX64.EFI and \EFI\UBUNTU\GRUBX64.EFI. My understanding is that you want to add logic to scan for these files and automatically create a boot entry for Ubuntu or other OSes when matching paths are found, instead of blindly loading the main \EFI\BOOT\BOOTX64.EFI, which may point to any other place depending on which OS replaced it with its own.

4 - Some Firmwares seem to include Motherboard photos with named Ports to identify them and whether there is something plugged in there, so you could potentially have a picture of the MSI rear panel with an overlay that gives Ports a name, and perhaps some color to denote that they're populated with something bootable, as a graphical alternative to pure text. Making a short name of the full path for a Port to identify where a drive is plugged in could also work. Note that I don't believe that users will not remember a few numbers and letters of a Serial Number if they use that system often, more so if the Firmware easily shows it so I know what drive I'm actually booting from. The point is, the way MSI currently does it is unviable because there is nothing that helps to identify what is where, even though I do know where the stuff is physically plugged in. Note that any method will be complicated when you get to identifying PCIe Addresses and such, since you could add something like a PCIe-to-multiple-M.2 adapter card and have two NVMe SSDs in the same PCIe Slot or so. Same with the two M.2 Slots that can be either PCIe or SATA, SATA Port Multipliers, or any other weird thing like extra bootable HBAs. Making a "universal" menu that contemplates edge cases is hard.

miczyg1 commented 2 years ago

Ad. 2. Now I don't understand it at all and miss the point.

Ad. 3. Yes, that's the goal of this feature request.

Ad. 4. Now we jump to some corner cases. We can think of infinite combinations which will make this feature request completely undoable. Can we stop the scope creep? The feature must deliver what has been defined in point 3 of your above comment and in the feature description. The end. If you need yet another change to how the boot device selection should work, please create another issue.

pietrushnic commented 2 years ago

I guess it would be great to have a guide for developing this feature using QEMU. I guess anyone who finished OST2 Arch4021 should be able to complete this task, because Arch4021 explains how to set up the build environment and make basic modifications of EDKII.

miczyg1 commented 2 years ago

@pietrushnic the guide will be the same for QEMU. There is no particular difference whether it is QEMU or real hardware, except that QEMU won't be able to emulate NVMe or eMMC, for example, to have full test coverage. QEMU is able to attach disk images like live ISOs for installation, so it can as well attach disk images with a preinstalled OS. That way one may test if the feature has been implemented correctly.

zirblazer commented 2 years ago

QEMU does emulate NVMe and eMMC, but you would need to test if it does what you want:

NVMe via -device nvme https://qemu-project.gitlab.io/qemu/system/devices/nvme.html

SD Controller emulation via -device sdhci-pci and -device sd-card https://stackoverflow.com/questions/61453355/qemu-to-emulate-sd-bus-and-card

miczyg1 commented 2 years ago

@zirblazer ohh I didn't know that. Even better...

pietrushnic commented 1 year ago

@maheshtammisetti the first step should be preparing documentation on how to run Dasharo in QEMU - the assumption here is that Dasharo in QEMU would be used for automated validation and development of features which are hardware independent.

@miczyg1 what would be useful is to decide what branch/code base of Dasharo should be used for that effort. We should use the most feature-rich version that can work on QEMU.

miczyg1 commented 1 year ago

> @miczyg1 what would be useful is to decide what branch/code base of Dasharo should be used for that effort. We should use the most feature-rich version that can work on QEMU.

For this particular feature request, we should use:

  1. https://github.com/Dasharo/coreboot/tree/common-base-rebased
  2. https://github.com/Dasharo/edk2/tree/dasharo
  3. https://github.com/Dasharo/edk2-platforms/tree/master

maheshtammisetti commented 1 year ago

Worked on understanding the UEFI Driver model, Firmware volumes, and how the UEFI Device path works, along with how the boot sequence is initiated; the control flow of the boot sequence from DXE to BDS was noted. Worked on understanding the UEFI protocols (DEVICE_PATH_PROTOCOL) and understood how UEFI detects the boot manager in order to boot the given boot option, and, if that fails, moves to the legacy boot option (BOOTX64.EFI), and, if that fails too, moves on to check for boot options on the removable media.

Challenges - I don't understand how to extend the modularity for detecting the ESP. Conceptually I understand that I have to work on the file BmBoot.c and on the function EfiBootManagerBoot, which attempts to boot the legacy boot options and, if the legacy boot fails, boots using the removable media, and that is where I have to implement the detection process. But I find it confusing how to move forward: do I have to write a new function, or do I have to utilize the existing EfiBootManagerBoot function and write a condition to detect the ESP partitions? Also, upon detection, I am confused about the generation of the boot options, and if they are generated, do they have to show up in the Setup Menu? I am trying my best to learn everything related to the booting process and how the device path works (attached notes). Please advise.

BDS.txt Driver Writing Note.txt BmBoot Notes.txt

miczyg1 commented 1 year ago

@maheshtammisetti it looks like you are going too deep with the analysis. Digesting more and more information without any practice is getting you confused over and over again. You have QEMU OVMF, try to modify some code as an exercise, see what changes and how certain functions impact the boot options. That way it should be easier to know what pieces of code are responsible for what outcome in the boot options etc.

I am pretty busy this week, so I will be able to look at it next week.

pietrushnic commented 1 year ago

@miczyg1, please consider the scope of Arch4021. IMHO, most of the above was covered there. @maheshtammisetti, we need an efficiently delivered solution, not a research project. I'm very concerned about the performance in this task. It takes way too much of your and @miczyg1 time.

maheshtammisetti commented 1 year ago

@pietrushnic @miczyg1 then I will go the other way: I will try to modify the pieces of code and look at what affects the boot options and what the outcome looks like. I am trying my best to move this task forward as fast as I can; it's just that digging deeper got me a bit confused.

miczyg1 commented 1 year ago

@maheshtammisetti I have looked at the notes you have made and I must say that nearly half of it is unhelpful for this task:

BDS.txt - BdsAttributes, DXE Calls BDS, Console Devices, HII and VFR are irrelevant for this task.

Driver.Writing.Note.txt - that's basically the UEFI driver model, which is described in the UEFI specification. I hope you didn't figure it out from the code... It could be easily read from the spec.

BmBoot Notes.txt - that's essentially what you need (plus the notes about boot options from BDS.txt) to get the task done. What I don't understand is why you analyze console and video controllers while the task focuses on storage and boot options.

This is going nowhere if you don't direct the focus to the parts which are relevant.

By quickly looking at the code in BmBoot.c and BmBootDescription.c:

  1. BmEnumerateBootOptions is where all boot options are created; this is the starting point. The boot options are created for media which have BLOCK_IO, SIMPLE_FILE_SYSTEM (but not BLOCK_IO) or LOAD_FILE protocols installed on them.
  2. BmGetBootDescription is the function that creates the boot option name. If we do it well, the correct name will be already made by BmGetLoadFileDescription.
  3. What creates the boot options is EfiBootManagerInitializeLoadOption, which takes the Description from BmGetBootDescription which can take the description based on the handle being used as an argument:
  BmGetUsbDescription,
  BmGetDescriptionFromDiskInfo,
  BmGetNetworkDescription,
  BmGetLoadFileDescription,
  BmGetNvmeDescription,
  BmGetMiscDescription

So in short, what has to be done is to loop through all media with BLOCK_IO and gEfiPartitionInfoProtocolGuid to check if it is an EFI System Partition (if you locate the protocol, it can be checked with PartitionInfo->System == 1). The boot media will also have the gEfiSimpleFileSystemProtocolGuid, which can be used to obtain an EFI_FILE_PROTOCOL instance for the given filesystem. EFI_FILE_PROTOCOL has a function to check the files, so you could use it to check for the presence of the files listed in this issue. Then all you need to do is to extract the hard drive device path. The DevPathToTextHardDrive function may be helpful to translate it to human-readable text. It is not directly usable, but it can be used via ConvertDeviceNodeToText, for example. To get the correct path for the bootable file, you would need to take the last node from the device path which starts with HD(...

This is a fragment which can take the description from a hard drive:

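    // HardDriveNode is a hypothetical local (EFI_DEVICE_PATH_PROTOCOL *) that
    // receives the HD(...) node; it stays NULL if the path has no such node.
    HardDriveNode  = NULL;
    DevicePathNode = FilePath;
    while (!IsDevicePathEnd (DevicePathNode)) {
      if ((DevicePathNode->Type == MEDIA_DEVICE_PATH) && (DevicePathNode->SubType == MEDIA_HARDDRIVE_DP)) {
        HardDriveNode = DevicePathNode;
        break;
      }

      DevicePathNode = NextDevicePathNode (DevicePathNode);
    }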

Then you need to append the file path to the description so it would look like this: HD(1, GPT, ...)/\EFI\Ubuntu\shimx64.efi

Then all you need is the boot option name from the issue (you may additionally need the disk name) and then call

EfiBootManagerInitializeLoadOption (
             &LoadOption,
             LoadOptionNumberUnassigned,
             LoadOptionTypeBoot,
             LOAD_OPTION_ACTIVE,
             Description,
             FilePath,
             NULL,
             0
             );
EfiBootManagerAddLoadOptionVariable (&LoadOption, 1);

Where the description would be Ubuntu (on <disk_name>). disk_name could be obtained with BmGetBootDescription based on the boot media device path.
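
As a rough illustration of how that description string could be composed, here is a plain C sketch using narrow strings. It is not EDK2 code: `format_boot_description` is a hypothetical helper, and in firmware the same step would operate on CHAR16 strings with the EDK2 print library rather than snprintf.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Writes "<os_label> (on <disk_name>)" into out. Returns 1 on success
 * and 0 if the result did not fit into out_size bytes. */
static int format_boot_description(char *out, size_t out_size,
                                   const char *os_label, const char *disk_name)
{
  int written;

  assert(out != NULL && os_label != NULL && disk_name != NULL);
  written = snprintf(out, out_size, "%s (on %s)", os_label, disk_name);
  return written >= 0 && (size_t)written < out_size;
}
```

The disk name passed in would be whatever the existing description code returns for the boot media, so the new entries stay consistent with the names users already see.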

So just start writing some code and testing it, otherwise, you will be stuck in theorizing what function does what and never accomplish the task. Mount an Ubuntu installer for example as a USB drive to QEMU. Mount an empty file (say 16GB) as a SATA HDD to be the target installation media to QEMU. Perform the Ubuntu installation. You have OVMF, modify it, build it and run with QEMU with the freshly installed Ubuntu on the 16GB file until you get the target result.

maheshtammisetti commented 1 year ago

@miczyg1 Thanks for the pointers. So far I was able to understand the BmEnumerateBootOptions and BmGetBootDescription functionality and how the code works along with EfiBootManagerInitializeLoadOption. I looked over the functionality of the handles and locating the protocol in order to check the EFI System Partition. I find ConvertDeviceNodeToText a bit harder to understand, but I will be looking at the spec and code in order to understand the context and will get back to you if there are any further issues.

maheshtammisetti commented 1 year ago

@miczyg1 Please check the latest PR. I also have blockers on the last part of the code: the fragment of the description that has to be taken from the hard drive, and how to look for multiple files, i.e. all of the below:

\EFI\Microsoft\Boot\bootmgfw.efi - Windows Boot Manager (on <disk_name>)
\EFI\Suse\elilo.efi - Suse Boot Manager (on <disk_name>)
\EFI\Ubuntu\grubx64.efi - Ubuntu (on <disk_name>)
\EFI\Ubuntu\shimx64.efi - Ubuntu (on <disk_name>)
\EFI\Redhat\elilo.efi - RedHat Boot Manager (on <disk_name>)
\EFI\Redhat\grubx64.efi - RedHat (on <disk_name>)
\EFI\Redhat\shimx64.efi - RedHat (on <disk_name>)
\EFI\Fedora\shimx64.efi - Fedora (on <disk_name>)
\EFI\Fedora\grubx64.efi - Fedora (on <disk_name>)
\EFI\Centos\shimx64.efi - CentOS (on <disk_name>)
\EFI\Centos\grubx64.efi - CentOS (on <disk_name>)
\EFI\opensuse\grubx64.efi - OpenSuse (on <disk_name>)
\EFI\debian\grubx64.efi - Debian (on <disk_name>)

Also, extracting the Ubuntu - <disk_name> description from BmGetBootDescription is a bit vague to me, but I tried my best to implement it in code. Please review.

pietrushnic commented 1 year ago

@macpijan any chance we can automatically validate that using OSFV?

miczyg1 commented 1 year ago

<disk_name> is the disk name that would normally be created when /efi/boot/bootx64.efi was present on the disk.

For the same entries, prioritize shimx64.efi; if it is not found, then use grubx64.efi. If neither is found, do not create an entry.

miczyg1 commented 1 year ago

https://github.com/Dasharo/edk2/pull/70

BeataZdunczyk commented 10 months ago

Closing this issue as the changes have been merged. We are currently completing testing for the release, and further updates will be provided on testing issues https://github.com/Dasharo/dasharo-issues/issues/612 and https://github.com/Dasharo/dasharo-issues/issues/613.