rhboot / shim

UEFI shim loader

Strange calls during network boot #649

Open raddirad opened 6 months ago

raddirad commented 6 months ago

Hi

I have a problem with shim 15.8 on a Dell Latitude 5300 2-in-1 notebook. The notebook runs the latest firmware, 1.29, and is connected to Ethernet via a Thunderbolt USB-C docking station.

We do a lot of netbooting with a current shim 15.8. This shim is signed by Microsoft, although the problem isn't related to Secure Boot.

When netbooting this specific machine, we get a strange request via TFTP:

RRQ from 192.168.16.97 filename loader/shimx64.efi.signed
tftp: client does not accept options
RRQ from 192.168.16.97 filename loader/shimx64.efi.signed
RRQ from 192.168.16.97 filename oader/revocations.efi
sending NAK (1, File not found) to 192.168.16.97
RRQ from 192.168.16.97 filename loader/?USB

Then the system fails and boots in a SupportAssist mode by Dell.

To verify it's not related to our shim, I took the latest 15.8 shim from Canonical, with the same result.

Other systems, like the Dell 5430 or vSphere or Proxmox VMs, aren't affected. So far this is the only system I know of that has this issue.

Other systems request the grub binary as expected after revocations.efi is not found.

olifre commented 5 months ago

I observe something similar with the Dell Latitude 3590, OptiPlex 3040 and others. Checking with tcpdump, I see:

1318    08:31:26,348827        TFTP    84    Read Request, File: grub2/revocations.efi, Transfer type: octet, blksize=512
1319    08:31:26,352904        TFTP    61    Error Code, Code: File not found, Message: File not found
1320    08:31:28,673738        TFTP    77    Read Request, File: grub2/�Onboard, Transfer type: octet, blksize=512
1321    08:31:28,680057        UDP    61    45932 → caci-lm(1554) Len=19

The last packet seems to be parsed incorrectly by Wireshark; it also contains the message "File not found" in the raw part.

It feels like some kind of bad memory access: "Onboard" is one of the EFI boot options on my end, and probably the same holds true for "USB" in @raddirad's case. The strange character is 0xc2 in my case.

olifre commented 5 months ago

I've made some progress trying to understand the changes between 15.6 and 15.8.

Adding:

return EFI_SUCCESS;

right here: https://github.com/rhboot/shim/blob/14d63398298c8de23036a4cf61594108b7345863/load-options.c#L415 (i.e. after the special case handling several devices), things work again with my affected systems. Of course, that's not a real solution, but it highlights how the bad loader name appears.

So it seems that the secondary_loader, which is learnt from the load_options, contains some garbage on Dell systems (likely just the human-readable name of the network boot option instead of the actual loader).

Since the garbage does not start with \0, it is not ignored. It worked in the past because shim had hardcoded the default loader to be used, i.e. grubx64.efi, which was fixed here: https://github.com/rhboot/shim/commit/a23e2f0de7a61b6e895a915676eba3a1fda2cd78 As a result, the bad value is now used instead of being ignored. It's not yet fully clear to me how this bad character enters the options (UEFI bug?) and what would be the best way to ignore it (ignore if non-printable characters are seen?).

olifre commented 5 months ago

After enabling debug = 1 and rebuilding shim, I could grab this: [image: shim_UEFI_PXE] This appears to be the load_options that one of our Dell systems provides. It does not contain a file name but the name of the boot option ("Onboard NIC (IPV4)"), which is not really useful as secondary_loader. Furthermore, it is prefixed with a strange 0xc2 character.

Since I am not an expert in guessing which other things may break, I'm not sure about the best approach to fix this (ignore loaders starting with non-ASCII characters, for example?).

If a patch is developed (or there is consensus on how this should be handled), I can test it in my environment.
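
The "ignore if non-printable characters are seen" idea could be expressed as a small predicate over the UCS-2 string. This is only a sketch under stated assumptions: loader_name_is_plausible is a hypothetical helper and not part of shim, plain uint16_t stands in for shim's CHAR16, and "printable ASCII only" is an assumed policy that may be too strict for some setups.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not shim code): return true only if the
 * NUL-terminated UCS-2 string `name` is non-empty and consists
 * entirely of printable ASCII characters (0x20..0x7e).
 */
static bool loader_name_is_plausible(const uint16_t *name)
{
        if (name == NULL || name[0] == 0)
                return false;

        for (size_t i = 0; name[i] != 0; i++) {
                if (name[i] < 0x20 || name[i] > 0x7e)
                        return false;
        }
        return true;
}
```

With this policy a name such as grubx64.efi passes, while a string whose first character is 0x00c2 is rejected, so the caller could fall back to the default loader.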

raddirad commented 5 months ago

maybe @julian-klode or @vathpela could take a look at this?

Thanks in advance

vathpela commented 5 months ago

That looks like a Boot#### variable that efibootmgr would display like this:

* Onboard NIC(IPV4) PciRoot(0x0)/Pci(0x1c,0x0)/Pci(0x0,0x0)/MAC(d09466f5ac05,0)/IPv4(0.0.0.0,0,DHCP,0.0.0.0,0.0.0.0,0.0.0.0)..BO

I have absolutely no idea why this is being passed in the load options, but "is this a fully formed boot variable" is a thing that certainly could be tested for and ignored.

vathpela commented 5 months ago

Okay, to be fair, it's a slightly weird boot variable. The structure is like this pseudocode:

struct efi_load_option_s {
        uint32_t attributes;
        uint16_t file_path_list_length;  // size of file_path_list in bytes
        uint16_t description[];          // NUL-terminated UCS-2 string
        uint8_t file_path_list[];        // file_path_list_length bytes of device paths
        uint8_t optional_data[];         // everything after file_path_list
};

So that's:

01 00 00 00 - attributes (EFI_VARIABLE_NON_VOLATILE)
c2 00 - File path length (0xc2)
4f 00 6e 00 62 00 6f 00 61 00 72 00 64 00 20 00 4e 00 49 00 43 00 28 00 49 00 50 00 56 00 34 00 29 00 00 00 - description "Onboard NIC(IPV4)"

Then 0x2a through 0xeb are the file path string, which is formed suspiciously. It starts with the device path in my previous comment, which goes from 0x3c to the 7f ff 04 00 at 0x82 which is the "end entire device path" marker. You would expect that would be the end except of course we've still got a lot of bytes left in our 0xc2 bytes of device path, and lo and behold there's just another device path there. It's a vendor specific message path, and it starts with some gibberish we can't decode, then another UCS-2 string that looks like a description of the ethernet port and the familiar 7f ff 04 00. No idea what the second device path is for at all.

And then it ends in 00 00 42 4f, which is the (nonstandard) marker the boot services on this machine have crammed into the "optional data" to mark that it was created by the firmware.
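
The layout described above can be walked with a few lines of C. This is a minimal sketch, not shim's parser: struct load_option_view and parse_load_option are hypothetical names, and the input is assumed to be the raw load option exactly as dumped above, starting at the attributes field.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Where each part of a raw load option lives (hypothetical type). */
struct load_option_view {
        uint32_t attributes;
        uint16_t file_path_list_length;
        size_t description_chars;       /* UCS-2 characters, excluding NUL */
        size_t file_path_list_offset;
        size_t optional_data_offset;
        size_t optional_data_size;
};

/* Read a little-endian 16-bit value without alignment assumptions. */
static uint16_t read_le16(const uint8_t *p)
{
        return (uint16_t)(p[0] | (p[1] << 8));
}

/* Returns false if the buffer cannot be a well-formed load option. */
static bool parse_load_option(const uint8_t *buf, size_t size,
                              struct load_option_view *out)
{
        size_t pos = 6;         /* description starts after the header */

        if (size < 6)
                return false;

        out->attributes = (uint32_t)buf[0] | ((uint32_t)buf[1] << 8) |
                          ((uint32_t)buf[2] << 16) | ((uint32_t)buf[3] << 24);
        out->file_path_list_length = read_le16(buf + 4);

        /* Scan for the UCS-2 NUL that terminates the description. */
        while (pos + 2 <= size && read_le16(buf + pos) != 0)
                pos += 2;
        if (pos + 2 > size)
                return false;   /* ran off the end without a terminator */

        out->description_chars = (pos - 6) / 2;
        pos += 2;               /* skip the terminator */

        if (out->file_path_list_length > size - pos)
                return false;   /* file path list would overrun the buffer */

        out->file_path_list_offset = pos;
        out->optional_data_offset = pos + out->file_path_list_length;
        out->optional_data_size = size - out->optional_data_offset;
        return true;
}
```

For the description "Onboard NIC(IPV4)" (17 characters plus the terminator, 36 bytes) this puts the file path list at offset 6 + 36 = 0x2a, and 0x2a + 0xc2 = 0xec, which agrees with the "0x2a through 0xeb" range quoted above.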

vathpela commented 5 months ago

So in summary: 1) I have no idea why there's a boot variable hanging out here, 2) I have no idea why the device path list in the boot variable has this weird vendor device path, but 3) it is basically a reasonably well formed boot variable, and we could probably test for that, but I'd rather know why Dell is doing this, because it doesn't really seem like they should be.
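
One ingredient of a "does this look like a boot variable" test would be checking that the file path list is a clean sequence of device paths. The sketch below is an assumption about how such a check might look, not code from shim; count_device_paths is a hypothetical name. It relies only on the generic device path node header (type, subtype, 16-bit little-endian length that includes the header) and on the 7f ff 04 00 end marker mentioned above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DP_TYPE_END        0x7f
#define DP_SUBTYPE_END_ALL 0xff

/*
 * Walk `len` bytes of device path nodes. Each node has a 4-byte header:
 * type, subtype, and a little-endian length that includes the header.
 * Returns the number of complete device paths (each closed by an
 * "end entire device path" node), or -1 if the list is malformed.
 */
static int count_device_paths(const uint8_t *list, size_t len)
{
        size_t pos = 0;
        int paths = 0;
        bool in_path = false;

        while (pos < len) {
                size_t node_len;

                if (len - pos < 4)
                        return -1;      /* truncated node header */
                node_len = (size_t)list[pos + 2] |
                           ((size_t)list[pos + 3] << 8);
                if (node_len < 4 || node_len > len - pos)
                        return -1;      /* bogus or overrunning length */

                if (list[pos] == DP_TYPE_END &&
                    list[pos + 1] == DP_SUBTYPE_END_ALL) {
                        paths++;
                        in_path = false;
                } else {
                        in_path = true;
                }
                pos += node_len;
        }
        return in_path ? -1 : paths;    /* last path must be terminated */
}
```

By my reading of the Boot0012 dump posted later in this thread, the two device paths there are 0x5c and 0x66 bytes long, which adds up to the 0xc2 length field, so a walker like this would report two paths for that entry.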

pjwelsh commented 4 months ago

Was there a path chosen to help with this issue on the shim side? I have all Dell systems with this issue. My only choice at this time seems to be to downgrade shim-x64 to a 15.6 version.

pjwelsh commented 4 months ago

Also, I know it affects at least the Dell Optiplex 5040, 7040, 3060 and 5060 and Latitude 5400. For me, it's any Dell desktop or laptop I've needed to PXE install to so far.

nathan-omeara commented 4 months ago

> Was there a path chosen to help with this issue on the shim side? I have all Dell systems with this issue. My only choice at this time seems to be to downgrade shim-x64 to a 15.6 version.

Have you tried going into the UEFI settings and in the 'boot sequence' section, unchecking the 'onboard nic' and 'usb' choices? You can still use f12 to choose a single-boot target of usb or network boot, but if you are permanently netbooting systems that won't work, obviously.

pjwelsh commented 4 months ago

I've not tried that path yet. Part of our use case is to also utilize the "wake on lan + PXE" BIOS option to auto install from an offline condition at a remote location. I'll need to check if that is still possible. Sadly the older Optiplex (older than 60) are not WMI capable and will physically need to have options changed.

The newer ones I can change BIOS via the SYS file system (/sys/devices/virtual/firmware-attributes/dell-wmi-sysman/) from the command line. I will not be at a location to check/test until next Wednesday, however. PJ


raddirad commented 4 months ago

> > Was there a path chosen to help with this issue on the shim side? I have all Dell systems with this issue. My only choice at this time seems to be to downgrade shim-x64 to a 15.6 version.

> Have you tried going into the UEFI settings and in the 'boot sequence' section, unchecking the 'onboard nic' and 'usb' choices? You can still use f12 to choose a single-boot target of usb or network boot, but if you are permanently netbooting systems that won't work, obviously.

We are installing the OS via PXE/UEFI Netboot, and thus disabling boot choices is not an option. In addition the shim doesn't even try to load the grub via PXE/UEFI Netboot and hangs at the error described in OP. Last working version was 15.7

nathan-omeara commented 4 months ago

> > > Was there a path chosen to help with this issue on the shim side? I have all Dell systems with this issue. My only choice at this time seems to be to downgrade shim-x64 to a 15.6 version.

> > Have you tried going into the UEFI settings and in the 'boot sequence' section, unchecking the 'onboard nic' and 'usb' choices? You can still use f12 to choose a single-boot target of usb or network boot, but if you are permanently netbooting systems that won't work, obviously.

> We are installing the OS via PXE/UEFI Netboot, and thus disabling boot choices is not an option. In addition the shim doesn't even try to load the grub via PXE/UEFI Netboot and hangs at the error described in OP. Last working version was 15.7

Yes, if you are only installing the OS, you can press f12 to do a one-time boot to PXE, even when PXE is not in the 'boot sequence' list. That is how I have been able to work around this bug to install the OS via network boot.

raddirad commented 4 months ago

In our case we are loading the shim via PXE and this bug happens before the shim chainloads the grub via PXE.

nathan-omeara commented 4 months ago

> In our case we are loading the shim via PXE and this bug happens before the shim chainloads the grub via PXE.

Yes, that is how this bug is occurring. I would still suggest you try the workaround. It isn't a great solution, but it seems to work, and still allows you to interactively network boot for OS install.

raddirad commented 4 months ago

Ok, now I get it. Yeah, for me personally this is doable, but I can't tell our customers to do these things if they have a lot of affected devices. This should be addressed by the shim team.

olifre commented 4 months ago

Indeed, thanks for the proposed workaround, in fact in our case we reinstall nodes without user interaction (i.e. by triggering a PXE boot remotely, by adding it to the boot order temporarily, then rebooting), so this does not help with the many distributed desktop machines we operate.

nathan-omeara commented 4 months ago

https://github.com/rhboot/shim/commit/a23e2f0de7a61b6e895a915676eba3a1fda2cd78

This is the commit that introduces this issue. If I revert it, I can boot my dell (that I finally got hands-on with) with the Onboard devices still in the boot sequence.

So, I'm guessing this is getting confused by the weird Dell boot entries, and screwing up the load path for grubx64.efi

nathan-omeara commented 4 months ago

Possible fix: https://github.com/rhboot/shim/blob/0287c6b14c77eeb3e3c61996330850d43d937a2b/shim.c#L1262-L1263

Add TFTP_ERROR here:

        if (!use_fb && (efi_status == EFI_INVALID_PARAMETER ||
                        efi_status == EFI_NOT_FOUND ||
                        efi_status == EFI_TFTP_ERROR)) {

In my testing, this gets it booting over network again.

pjwelsh commented 4 months ago

Any guess as to how long a change like this may take to make it into an updated release package?

raddirad commented 4 months ago

maybe @vathpela @jsetje or @julian-klode could say more on if this might get upstream

jsetje commented 3 months ago

Thank you for getting my attention. Just testing for the extra error is probably reasonable, but I'm also curious why we get a variable that looks like that. Since I exposed this, I'll certainly help get a fix in.

pjwelsh commented 3 months ago

All of your help is much appreciated! Thank you for helping to resolve this issue. PJ


jsetje commented 3 months ago

I started asking around to see if I could find a system to test this with, which made me wonder about a 2-in-1 with a built in NIC. So I looked at the original report again. I bet that this all has something to do with how the docking station brings the NIC in.

nathan-omeara commented 3 months ago

It definitely isn't specific to docking stations.

I have one dell on-hand with the issue, a Latitude 5300 (built-in NIC, no external NIC). But I also have one other dell, one HP, and one MS surface that do not show this issue, all 3 of those using USB NICs only.

It's worth noting that USB boot technically has the same basic issue, but the fallback code kicks in on USB boot, because the error handling I pointed out handles the error when it's on a filesystem, it just doesn't handle it when it's on TFTP.

I also wonder if HTTP(s) boot would be another error code that would need to be added there, but my only on-hand device with HTTP(s) boot support is the dell that doesn't demonstrate this issue.

If you pay attention on USB boot, you can see the same error, followed by the message here: https://github.com/rhboot/shim/blob/0287c6b14c77eeb3e3c61996330850d43d937a2b/shim.c#L1265

This is what led me to try adding EFI_TFTP_ERROR to that statement.

nathan-omeara commented 3 months ago

Hmm, yeah, forced a (similar?) error by renaming grubx64.efi on my http boot server and booting my other dell. start_image() returned 00000023

I'm guessing because 0x23 (35?) is relatively new, and I'm using fedora's shipping version of shim 15.8 which was probably compiled with an earlier version of gnu_efi.

So I'd suggest adding EFI_TFTP_ERROR and EFI_HTTP_ERROR to that fallback logic.

I certainly wouldn't object to fixing the parsing of the weird values (if there is an actual issue, and it isn't just Dell and Lenovo (and maybe others) doing something that breaks the standard) but harmonizing the fallback behavior between local filesystems and network boot makes sense to me.
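
The suggested widening of the fallback condition can be written as a standalone predicate. This is a sketch, not the actual patch: should_retry_default_loader and efi_status_t are hypothetical names, the EFIERR macro assumes a 64-bit build where UEFI error codes have the top bit set, and the numeric values are the ones I understand the UEFI specification to assign (EFI_HTTP_ERROR being 35, i.e. the 0x23 seen above).

```c
#include <stdbool.h>
#include <stdint.h>

/* UEFI error codes have the high bit set; shown here for a 64-bit build. */
#define EFIERR(n)             (0x8000000000000000ULL | (n))
#define EFI_INVALID_PARAMETER EFIERR(2)
#define EFI_NOT_FOUND         EFIERR(14)
#define EFI_TFTP_ERROR        EFIERR(21)
#define EFI_HTTP_ERROR        EFIERR(35)        /* 0x23 */

typedef uint64_t efi_status_t;

/*
 * Hypothetical helper: decide whether a failure to start the requested
 * second stage should trigger a retry with the default loader. Network
 * fetch errors are treated like a missing file on a local filesystem.
 */
static bool should_retry_default_loader(efi_status_t status, bool use_fb)
{
        if (use_fb)
                return false;

        return status == EFI_INVALID_PARAMETER ||
               status == EFI_NOT_FOUND ||
               status == EFI_TFTP_ERROR ||
               status == EFI_HTTP_ERROR;
}
```

Keeping the list explicit means other failures, security-related ones for example, still stop the boot instead of silently falling back.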

jsetje commented 3 months ago

FWIW, we'll have to fix this forward. In addition to the patch that exposed this, we'll need non-hardcoded paths and names for UKIs. Hopefully I can get my hands on a setup that exposes this, but I'm also not opposed to keep trying unless we get a very specific error.

raddirad commented 3 months ago

> I started asking around to see if I could find a system to test this with, which made me wonder about a 2-in-1 with a built in NIC. So I looked at the original report again. I bet that this all has something to do with how the docking station brings the NIC in.

This is not related to the Dock. I tested the 2-in-1 and a working Dell device. The 2-in-1 failed, the other one succeeded.

@olifre mentioned other devices that show the same behaviour ("Dell Latitude 3590, OptiPlex 3040 and others") Maybe @olifre can post the other ones and you might get access to one of those

raddirad commented 3 months ago

> FWIW, we'll have to fix this forward. In addition to the patch that exposed this, we'll need non-hardcoded paths and names for UKIs. Hopefully I can get my hands on a setup that exposes this, but I'm also not opposed to keep trying unless we get a very specific error.

If you want anything tested, I have access to the 2-in-1 convertible I mention in OP. I can test new code

olifre commented 3 months ago

> Maybe @olifre can post the other ones and you might get access to one of those

I can immediately add to the list:

After that, I stopped doing systematic testing, as testing other models (we have an assortment of Dell OptiPlex systems, but no other Latitudes at hand) would mean temporarily stealing them from active users to test them in our test network.

I can certainly try to grab a specific model if you know you can get a hand on any OptiPlex, check it and report back here.

Combining my list with the information provided by @pjwelsh above, I think the full known Dell list is:

From those numbers, it seems quite likely all the OptiPlex _020 to _080 are affected (at least).

nathan-omeara commented 3 months ago

I am also able to test proposed patches. I even set up an additional signing key on the latitude 5300 so I can sign my builds and test with secure boot on.

nathan-omeara commented 3 months ago

I was going to submit a PR with the changes I recommended above, but it won't compile with EFI_HTTP_ERROR without updating the submodule branch for gnu-efi, and it looks like that's more complicated than I had assumed.

pjwelsh commented 3 months ago

Any progress on the PR submission?

nathan-omeara commented 3 months ago

I could submit it without EFI_HTTP_ERROR until gnu-efi is updated. I'm not sure what you guys need to do to pull in a newer version of gnu-efi.

pjwelsh commented 3 months ago

Not even sure about gnu-efi... I haven't found it as an RPM package or a file so far.


nathan-omeara commented 3 months ago

It's a submodule, and it looks like each release of shim has a specific gnu-efi revision/branch tagged: https://github.com/rhboot/shim/blob/0287c6b14c77eeb3e3c61996330850d43d937a2b/.gitmodules#L1-L4

That gnu-efi shim-15.8 branch does not contain EFI_HTTP_ERROR yet, but the main branch has that.

olifre commented 3 months ago

For reference, I can confirm that PR #666 indeed fixes the issue for our machines when I recompile shim with that patch (we are using TFTP boot). Many thanks!

raddirad commented 3 months ago

I can also confirm that adding efi_status == EFI_TFTP_ERROR fixed the error on the Dell 5300 2-in-1 I mentioned in the OP.

nathan-omeara commented 3 months ago

@jsetje @vathpela - Any thoughts on #666

The fix works in my test system, and the two commenters above.

raddirad commented 2 months ago

ping @jsetje @vathpela

is there any update on https://github.com/rhboot/shim/pull/666 ?

dbnicholson commented 3 weeks ago

@vathpela So, if I interpret this correctly, it's really an issue with parse_load_options setting second_stage to a garbage value, right? If it had been left alone, you'd still have grub as the second stage.

However, even if the second stage had been set correctly, you might get an error from the server if it doesn't have the file. Perhaps the right thing to do is to translate the fetch errors to EFI_NOT_FOUND so the normal fallback to the default second stage happens. Something like:

diff --git a/shim.c b/shim.c
index 87202f7..b889439 100644
--- a/shim.c
+++ b/shim.c
@@ -1132,6 +1137,10 @@ EFI_STATUS read_image(EFI_HANDLE image_handle, CHAR16 *ImagePath,
        if (EFI_ERROR(efi_status)) {
            perror(L"Unable to fetch TFTP image: %r\n",
                   efi_status);
+           // Treat errors returned by the TFTP server like
+           // a missing file.
+           if (efi_status == EFI_TFTP_ERROR)
+               efi_status = EFI_NOT_FOUND;
            return efi_status;
        }
        *data = sourcebuffer;
@@ -1145,6 +1154,10 @@ EFI_STATUS read_image(EFI_HANDLE image_handle, CHAR16 *ImagePath,
        if (EFI_ERROR(efi_status)) {
            perror(L"Unable to fetch HTTP image %a: %r\n",
                   netbootname, efi_status);
+           // Treat errors returned by the HTTP server like
+           // a missing file.
+           if (efi_status == EFI_HTTP_ERROR)
+               efi_status = EFI_NOT_FOUND;
            return efi_status;
        }
        *data = sourcebuffer;

I'm not in this scenario, but that seems like where things go wrong. I think you'd still want to improve the load option parsing, but failure to fetch the image from a server should be treated the same way as failure to open a file on disk. WDYT?

MarkusSpier commented 6 days ago

We also hit the same problem: Dell systems won't boot with shim 15.8.

But we found a stupid workaround. Just copy the Grub.efi file that should be loaded and rename the copy to êonboard, and the system will load your grub!

I think this is a reasonable stopgap until a new shim with a fix is available.

raddirad commented 6 days ago

It's not stupid if it works. However I have seen devices requesting different names �USB for example.

MarkusSpier commented 6 days ago

For our two Dell test clients (both Optiplex systems, one is a touch all-in-one) it works. Maybe you can also use the filename êusb for the USB scenario.

dbnicholson commented 6 days ago

Could either of you dump out the raw boot option data and attach it here? I'd like to poke at it in code instead of trying to interpret the hexdump in my head. You can just copy the appropriate /sys/firmware/efi/efivars/BootXXXX-8be4df61-93ca-11d2-aa0d-00e098032b8c file corresponding to the right boot option. Look at the output of efibootmgr to see which one it is. You could also base64 encode it like this and upload that:

base64 /sys/firmware/efi/efivars/BootXXXX-8be4df61-93ca-11d2-aa0d-00e098032b8c > bootopt.b64

raddirad commented 16 hours ago

So I did this on an OptiPlex 3050

efibootmgr -v
BootCurrent: 0012
Timeout: 2 seconds
BootOrder: 0013,0014,0015,0016,0017,0012,000A,000A,0012,0019
Boot0000* Windows Boot Manager  HD(1,GPT,cc9ed1a0-b28c-4713-9cbf-a3af67ae85d0,0x800,0xfa000)/File(\EFI\Microsoft\Boot\bootmgfw.efi)WINDOWS.........x...B.C.D.O.B.J.E.C.T.=.{.9.d.e.a.8.6.2.c.-.5.c.d.d.-.4.e.7.0.-.a.c.c.1.-.f.3.2.b.3.4.4.d.4.7.9.5.}....................
Boot000A  Windows Boot Manager  VenHw(99e275e7-75a0-4b37-a2e6-c5385e6c00cb)
Boot000B  Diskette Drive    BBS(Floppy,Diskette Drive,0x0)..BO
Boot000D  USB Storage Device    BBS(USB,USB Storage Device,0x0)..BO
Boot000E  CD/DVD/CD-RW Drive    BBS(CDROM,CD/DVD/CD-RW Drive,0x0)..BO
Boot000F* Onboard NIC   BBS(Network,Realtek PXE B01 D00,0x0)..BO
Boot0010* P0: Samsung SSD 850 PRO 128GB BBS(HD,P0: Samsung SSD 850 PRO 128GB,0x0)..BO
Boot0012* Onboard NIC(IPV4) PciRoot(0x0)/Pci(0x1c,0x0)/Pci(0x0,0x0)/MAC(d89ef37f465b,0)/IPv4(0.0.0.00.0.0.0,0,0)..BO
Boot0013* Diskette Drive    BBS(Floppy,Diskette Drive,0x0)..BO
Boot0014* Internal HDD  BBS(HD,Internal HDD,0x0)..BO
Boot0015* USB Storage Device    BBS(USB,USB Storage Device,0x0)..BO
Boot0016* CD/DVD/CD-RW Drive    BBS(CDROM,CD/DVD/CD-RW Drive,0x0)..BO
Boot0017* Onboard NIC   BBS(Network,Realtek PXE B01 D00,0x0)..BO
Boot0019* Onboard NIC(IPV6) PciRoot(0x0)/Pci(0x1c,0x0)/Pci(0x0,0x0)/MAC(d89ef37f465b,0)/IPv6([::]:<->[::]:,0,0)..BO

There are several devices with an "Onboard" prefix, so here are all of them:

Boot000F* Onboard NIC

base64 /sys/firmware/efi/efivars/Boot000F-8be4df61-93ca-11d2-aa0d-00e098032b8c 
BwAAAAEAAAB+AE8AbgBiAG8AYQByAGQAIABOAEkAQwAAAAUBHAAGAAAAUmVhbHRlayBQWEUgQjAx
IEQwMAB//wQAAQQaAK6EsR31gXJOhUQrqwwsrFwBAAACAAB//wQAAQQ8AO9HZC3JO6BBrBlNUdAb
TOZSAGUAYQBsAHQAZQBrACAAUABYAEUAIABCADAAMQAgAEQAMAAwAAAAf/8EAAAAQk8=

Boot0012* Onboard NIC(IPV4)

base64 /sys/firmware/efi/efivars/Boot0012-8be4df61-93ca-11d2-aa0d-00e098032b8c 
BwAAAAEAAADCAE8AbgBiAG8AYQByAGQAIABOAEkAQwAoAEkAUABWADQAKQAAAAIBDADQQQMKAAAA
AAEBBgAAHAEBBgAAAAMLJQDYnvN/RlsAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAADDBsAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAB//wQAAQRiAO9HZC3JO6BBrBlNUdAbTOZJAFAANAAgAFIAZQBh
AGwAdABlAGsAIABQAEMASQBlACAARwBCAEUAIABGAGEAbQBpAGwAeQAgAEMAbwBuAHQAcgBvAGwA
bABlAHIAAAB//wQAAABCTw==

Boot0017* Onboard NIC

base64 /sys/firmware/efi/efivars/Boot0017-8be4df61-93ca-11d2-aa0d-00e098032b8c 
BwAAAAEAAAB+AE8AbgBiAG8AYQByAGQAIABOAEkAQwAAAAUBHAAGAAAAUmVhbHRlayBQWEUgQjAx
IEQwMAB//wQAAQQaAK6EsR31gXJOhUQrqwwsrFwBAAACAAB//wQAAQQ8AO9HZC3JO6BBrBlNUdAb
TOZSAGUAYQBsAHQAZQBrACAAUABYAEUAIABCADAAMQAgAEQAMAAwAAAAf/8EAAAAQk8=

Boot0019* Onboard NIC(IPV6)

base64 /sys/firmware/efi/efivars/Boot0019-8be4df61-93ca-11d2-aa0d-00e098032b8c 
BwAAAAEAAADjAE8AbgBiAG8AYQByAGQAIABOAEkAQwAoAEkAUABWADYAKQAAAAIBDADQQQMKAAAA
AAEBBgAAHAEBBgAAAAMLJQDYnvN/RlsAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAADDTwAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAQAAAAAAAAAAAAAAAAAAAAAB//wQA
AQRiAO9HZC3JO6BBrBlNUdAbTOZJAFAANgAgAFIAZQBhAGwAdABlAGsAIABQAEMASQBlACAARwBC
AEUAIABGAGEAbQBpAGwAeQAgAEMAbwBuAHQAcgBvAGwAbABlAHIAAAB//wQAAABCTw==
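
For anyone who wants to read these dumps in code, the first few fields can be picked out as sketched below. The helper names (b64_value, b64_decode, read_boot_var_head) and struct boot_var_head are hypothetical. One fact worth keeping in mind is that files under /sys/firmware/efi/efivars start with a 4-byte variable attributes word, so the load option itself only begins at byte 4 of the decoded data.

```c
#include <stddef.h>
#include <stdint.h>

/* Map one base64 character to its 6-bit value, or -1 if it is not one. */
static int b64_value(char c)
{
        if (c >= 'A' && c <= 'Z') return c - 'A';
        if (c >= 'a' && c <= 'z') return c - 'a' + 26;
        if (c >= '0' && c <= '9') return c - '0' + 52;
        if (c == '+') return 62;
        if (c == '/') return 63;
        return -1;
}

/*
 * Decode at most `out_size` bytes of base64 text into `out`, skipping
 * line breaks and stopping at '=' padding. Returns the bytes written.
 */
static size_t b64_decode(const char *in, uint8_t *out, size_t out_size)
{
        uint32_t acc = 0;
        int bits = 0;
        size_t n = 0;

        for (; *in != '\0' && *in != '=' && n < out_size; in++) {
                int v = b64_value(*in);

                if (v < 0)
                        continue;       /* not a base64 character */
                acc = (acc << 6) | (uint32_t)v;
                bits += 6;
                if (bits >= 8) {
                        bits -= 8;
                        out[n++] = (uint8_t)(acc >> bits);
                }
        }
        return n;
}

/* Fields at the head of a base64-encoded efivarfs Boot#### file. */
struct boot_var_head {
        uint32_t variable_attributes;   /* 4-byte efivarfs prefix */
        uint32_t option_attributes;     /* load option attributes */
        uint16_t file_path_list_length;
        uint16_t first_description_char;
};

/* Returns 0 on success, -1 if fewer than 12 bytes could be decoded. */
static int read_boot_var_head(const char *b64, struct boot_var_head *h)
{
        uint8_t raw[12];

        if (b64_decode(b64, raw, sizeof(raw)) != sizeof(raw))
                return -1;

        h->variable_attributes = (uint32_t)raw[0] | ((uint32_t)raw[1] << 8) |
                                 ((uint32_t)raw[2] << 16) | ((uint32_t)raw[3] << 24);
        h->option_attributes = (uint32_t)raw[4] | ((uint32_t)raw[5] << 8) |
                               ((uint32_t)raw[6] << 16) | ((uint32_t)raw[7] << 24);
        h->file_path_list_length = (uint16_t)(raw[8] | (raw[9] << 8));
        h->first_description_char = (uint16_t)(raw[10] | (raw[11] << 8));
        return 0;
}
```

Feeding it the start of the Boot0012 dump (BwAAAAEAAADCAE8A...) gives a file path list length of 0xc2 followed by 'O', the same byte that shows up in front of "Onboard" in the TFTP request.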

nathan-omeara commented 11 hours ago

And to add some data points, here are the two entries matching "Onboard" on my Latitude 5300:

Boot0003* Onboard NIC(IPV4)

BwAAAAEAAADQAE8AbgBiAG8AYQByAGQAIABOAEkAQwAoAEkAUABWADQAKQAAAAIBDADQQQMKAAAA
AAEBBgAGHwMLJQAs6n8Kn2kAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAADDBsAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAB//wQAAQR2AO9HZC3JO6BBrBlNUdAbTOZQAFgARQAgAEkAUAA0ACAASQBu
AHQAZQBsACgAUgApACAARQB0AGgAZQByAG4AZQB0ACAAQwBvAG4AbgBlAGMAdABpAG8AbgAgACgA
NgApACAASQAyADEAOQAtAEwATQAAAH//BAAAAEJP

Boot0004* Onboard NIC(IPV6)

BwAAAAEAAADxAE8AbgBiAG8AYQByAGQAIABOAEkAQwAoAEkAUABWADYAKQAAAAIBDADQQQMKAAAA
AAEBBgAGHwMLJQAs6n8Kn2kAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAADDTwAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAQAAAAAAAAAAAAAAAAAAAAAB//wQAAQR2AO9H
ZC3JO6BBrBlNUdAbTOZQAFgARQAgAEkAUAA2ACAASQBuAHQAZQBsACgAUgApACAARQB0AGgAZQBy
AG4AZQB0ACAAQwBvAG4AbgBlAGMAdABpAG8AbgAgACgANgApACAASQAyADEAOQAtAEwATQAAAH//
BAAAAEJP

When this device is encountering this error, the boot filename sent in the TFTP request (that should be grubx64.efi) is ÐOnboard (where 0xd0 precedes "Onboard"), and I notice that in the hex representation of that var, on the 5300, 0xd000 precedes the utf-16-le encoding of "Onboard NIC(IPV4)".

This is readily visible in a packet capture of the TFTP download request.

That is slightly different from the Optiplex example above, so I wonder if the TFTP file request in @raddirad's example would have 0xc2 in front of "Onboard"?

nathan-omeara commented 11 hours ago

Per this comment https://github.com/rhboot/shim/issues/649#issuecomment-2093666262, it seems that the d0/c2/etc are the length of the boot option's file_path_list[] entry.

It's possible that the length in my example just happens to match what is sent in the filename.. so I would be curious to see if it's different with different lengths.
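
One plausible explanation for the odd filename, consistent with the captures in this thread but not verified against shim's source here, is that the raw load option is read as a UCS-2 command line and split on NULs and spaces. The sketch below is an illustration only: second_token and its helpers are hypothetical, and the splitting rule is an assumption. The attributes bytes 01 00 00 00 then form a one-character first token, and the second token starts with the length byte and runs up to the first space.

```c
#include <stddef.h>
#include <stdint.h>

/* Fetch the i-th little-endian UCS-2 character from a byte buffer. */
static uint16_t ucs2_at(const uint8_t *buf, size_t i)
{
        return (uint16_t)(buf[2 * i] | (buf[2 * i + 1] << 8));
}

/* Assumed splitting rule: tokens end at a NUL or a space. */
static int is_separator(uint16_t c)
{
        return c == 0 || c == ' ';
}

/*
 * Illustration only (not shim's parser): skip the first token in `buf`
 * and copy the second one into `out`, keeping the low byte of each
 * character. Returns the number of bytes written before the final NUL.
 */
static size_t second_token(const uint8_t *buf, size_t size,
                           unsigned char *out, size_t out_size)
{
        size_t chars = size / 2;
        size_t i = 0, n = 0;

        if (out_size == 0)
                return 0;

        while (i < chars && !is_separator(ucs2_at(buf, i)))
                i++;            /* first token */
        while (i < chars && is_separator(ucs2_at(buf, i)))
                i++;            /* separators between the tokens */
        while (i < chars && !is_separator(ucs2_at(buf, i)) &&
               n + 1 < out_size)
                out[n++] = (unsigned char)ucs2_at(buf, i++);
        out[n] = '\0';
        return n;
}
```

If that reading is right, the leading byte is not a coincidence: a different file path list length would change the first character of the requested filename accordingly (0xc2 on the OptiPlex 3050, 0xd0 on the Latitude 5300).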

raddirad commented 10 hours ago

@nathan-omeara could you tell me how to get those hex values? I would like to provide more info.

nathan-omeara commented 10 hours ago

[image: Wireshark screenshot]

Adding a small example of what I see in wireshark.

I just run wireshark on my TFTP server, and filter the displayed packets to tftp (I also include dhcp just to help with some troubleshooting):

[image: Wireshark screenshot]

Edit: And to be clear, all you have to do is click on the 'source file' in the packet dissector pane to highlight the exact bytes in the hex dump pane.

If you instead wanted to capture using TCPDump and provide the raw dump file I or someone else could load it into wireshark and look, but there's a risk of capturing other sensitive data from your network that way. (Though you could probably lower the chances of that by only capturing packets with a destination of udp port 69)