pbatard / EfiFs

EFI FileSystem drivers
https://efi.akeo.ie
GNU General Public License v3.0

Unable to change to filesystem under recent EFI Shell version #15

Closed elFarto closed 4 years ago

elFarto commented 4 years ago

I seem to have an odd issue where recent versions of the EFI Shell are unable to switch to an ext4 filesystem:

[Screenshot: Screenshot_lfs-vm_2019-12-07_09:41:12]

The logging above the "fs0:" command is the output of a "dir fs0:\" command, so the file system is working correctly.

With an earlier version of the shell (under VirtualBox, rather than KVM/Qemu) it can switch to the file system with no issue. I've also run the new version of shell under VirtualBox, and it exhibits the same issue. However, only the new version prints out those GetInfo lines, older versions print nothing when switching file systems, so it's clearly looking for something and not finding it.

I've tried tracing through the EFI Shell source code, but I can't seem to find any obvious cause of this issue (I'm not familiar with the EFI source code).

pbatard commented 4 years ago

Can you please at least provide the exact version of UEFI Shell where you see that issue occurring?

With regards to investigating this, I guess I'm going to have to wait until there are versions of OVMF that can be used with QEMU, that use a version of the UEFI Shell equal or above the one you seem to have an issue with, so that I can try to replicate the problem (I'm afraid that, even if it looks like an interesting exercise, I just don't have time to build my own version of OVMF at the moment). If this is really an issue with recent UEFI Shells, it should be easy to replicate in the QEMU tests we have, provided we have an up to date OVMF firmware...

elFarto commented 4 years ago

Sure, the ver command reports:

UEFI Interactive Shell v2.2
EDK II
UEFI v2.70 (EDK II, 0x00010000)

I believe the Shell is from the edk2-stable201908 tag on the tianocore/edk2 GitHub project, but it's been a while.

I just retested with the latest Shell build (edk2-stable201911), and it still has the same issue: it can't switch to the filesystem, but it can list its contents with dir fs0:.

pbatard commented 4 years ago

Thanks. I think I'll try to replace the Shell module from the OVMF I have with a recent version to see what happens. Not sure when I'll get a chance to do that though...

no92 commented 4 years ago

This affects me as well, with the same UEFI Shell version. My OVMF is the latest from https://www.kraxel.org/repos

pbatard commented 4 years ago

I've been able to replicate the issue using usr\share\edk2.git\ovmf-x64\OVMF_CODE-pure-efi.fd from https://www.kraxel.org/repos/jenkins/edk2/edk2.git-ovmf-x64-0-20191118.1335.g1d3215fd24.noarch.rpm

Here is what happens with Shell 2.2 / UEFI 2.70:

BdsDxe: failed to load Boot0001 "UEFI QEMU HARDDISK QM00001 " from PciRoot(0x0)/Pci(0x1,0x1)/Ata(Primary,Master,0x0): Not Found
BdsDxe: failed to load Boot0002 "UEFI QEMU HARDDISK QM00002 " from PciRoot(0x0)/Pci(0x1,0x1)/Ata(Primary,Slave,0x0): Not Found
BdsDxe: loading Boot0003 "EFI Internal Shell" from Fv(7CB8BDC9-F8EB-4F34-AAEA-3EE4AF6516A1)/FvFile(7C04A583-9E3E-4F1C-AD65-E05268D0B4D1)
BdsDxe: starting Boot0003 "EFI Internal Shell" from Fv(7CB8BDC9-F8EB-4F34-AAEA-3EE4AF6516A1)/FvFile(7C04A583-9E3E-4F1C-AD65-E05268D0B4D1)
UEFI Interactive Shell v2.2
EDK II
UEFI v2.70 (EDK II, 0x00010000)
Mapping table
      FS0: Alias(s):HD0a1:;BLK1:
          PciRoot(0x0)/Pci(0x1,0x1)/Ata(0x0)/HD(1,MBR,0xBE1AFDFA,0x3F,0xFBFC1)
     BLK0: Alias(s):
          PciRoot(0x0)/Pci(0x1,0x1)/Ata(0x0)
     BLK2: Alias(s):
          PciRoot(0x0)/Pci(0x1,0x1)/Ata(0x0)
Press ESC in 5 seconds to skip startup.nsh or any other key to continue.
Shell> set FS_LOGGING 4
Shell> load fs0:\ext2_x64.efi
FS driver installed.
Image 'FS0:\ext2_x64.efi' loaded at 683C000 - Success
FSBindingSupported
FSBindingStart
FSInstall: PciRoot(0x0)/Pci(0x1,0x1)/Ata(Primary,Slave,0x0)
Shell> map -r
Mapping table
      FS0: Alias(s):HD0a1:;BLK1:
          PciRoot(0x0)/Pci(0x1,0x1)/Ata(0x0)/HD(1,MBR,0xBE1AFDFA,0x3F,0xFBFC1)
      FS1: Alias(s):F0b:;BLK2:
          PciRoot(0x0)/Pci(0x1,0x1)/Ata(0x0)
     BLK0: Alias(s):
          PciRoot(0x0)/Pci(0x1,0x1)/Ata(0x0)
Shell> dir fs1:
OpenVolume
Open(6B1CA98 <ROOT>, "")
  Reopening <ROOT>
  RET: 6B1CA98
Close(6B1CA98|'/') <ROOT>
GetInfo(6B1CA98|'/', 0) <DIR>
GetInfo(6B1CA98|'/', 600) <DIR>
Close(6B1CA98|'/') <ROOT>
OpenVolume
Open(6B1CA98 <ROOT>, ".")
  Reopening <ROOT>
  RET: 6B1CA98
GetInfo(6B1CA98|'/', 0) <DIR>
GetInfo(6B1CA98|'/', 600) <DIR>
Open(6B1CA98 <ROOT>, "..")
Trying to open <ROOT>'s parent
Close(6B1CA98|'/') <ROOT>
GetInfo(6B1CA98|'/', 0) <DIR>
GetInfo(6B1CA98|'/', 600) <DIR>
SetPosition(6B1CA98|'/', 0) <DIR>
Read(6B1CA98|'/', 602) <DIR>
Read(6B1CA98|'/', 602) <DIR>
Read(6B1CA98|'/', 602) <DIR>
Close(6B1CA98|'/') <ROOT>
OpenVolume
Open(6B1CA98 <ROOT>, "\lost+found")
  RET: 6898D18
Close(6B1CA98|'/') <ROOT>
OpenVolume
Open(6B1CA98 <ROOT>, "\EFI")
  RET: 6898298
Close(6B1CA98|'/') <ROOT>
Directory of: fs1:\
12/12/2014  17:42 <DIR> r           0  lost+found
12/12/2014  17:43 <DIR> r           0  EFI
          0 File(s)           0 bytes
          2 Dir(s)
Close(6898D18|'/lost+found') 
Close(6898298|'/EFI') 
Shell> fs1:
OpenVolume
GetInfo(6B1C598|'/', 0) <DIR>
GetInfo(6B1C598|'/', 600) <DIR>
Close(6B1C598|'/') <ROOT>
Shell> 

And this is what happens with Shell 2.0 / UEFI 2.40:

Boot Failed. EFI Floppy
Boot Failed. EFI Floppy 1
Boot Failed. EFI Hard Drive
Boot Failed. EFI Hard Drive 1
UEFI Interactive Shell v2.0
EDK II
UEFI v2.40 (EDK II, 0x00010000)
Mapping table
      FS0: Alias(s):HD7a1:;BLK3:
          PciRoot(0x0)/Pci(0x1,0x1)/Ata(0x0)/HD(1,MBR,0xBE1AFDFA,0x3F,0xFBFC1)
     BLK2: Alias(s):
          PciRoot(0x0)/Pci(0x1,0x1)/Ata(0x0)
     BLK4: Alias(s):
          PciRoot(0x0)/Pci(0x1,0x1)/Ata(0x0)
     BLK0: Alias(s):
          PciRoot(0x0)/Pci(0x1,0x0)/Floppy(0x0)
     BLK1: Alias(s):
          PciRoot(0x0)/Pci(0x1,0x0)/Floppy(0x1)
Press ESC in 5 seconds to skip startup.nsh or any other key to continue.
Shell> set FS_LOGGING 4
Shell> load fs0:\ext2_x64.efi
FS driver installed.
Image 'FS0:\ext2_x64.efi' loaded at 66B6000 - Success
FSBindingSupported
FSBindingStart
Could not read block at address 00000002: [12] No Media
error: Could not read block.
FSBindingSupported
FSBindingStart
Could not read block at address 00000002: [12] No Media
error: Could not read block.
FSBindingSupported
FSBindingStart
FSInstall: PciRoot(0x0)/Pci(0x1,0x1)/Ata(Primary,Slave,0x0)
FSBindingSupported
FSBindingStart
Could not read block at address 00000002: [12] No Media
error: Could not read block.
FSBindingSupported
FSBindingStart
Could not read block at address 00000002: [12] No Media
error: Could not read block.
FSBindingSupported
FSBindingStart
Could not read block at address 00000002: [12] No Media
error: Could not read block.
FSBindingSupported
FSBindingStart
Could not read block at address 00000002: [12] No Media
error: Could not read block.
Shell> map -r
Mapping table
      FS0: Alias(s):HD7a1:;BLK3:
          PciRoot(0x0)/Pci(0x1,0x1)/Ata(0x0)/HD(1,MBR,0xBE1AFDFA,0x3F,0xFBFC1)
      FS1: Alias(s):F7b:;BLK4:
          PciRoot(0x0)/Pci(0x1,0x1)/Ata(0x0)
     BLK2: Alias(s):
          PciRoot(0x0)/Pci(0x1,0x1)/Ata(0x0)
     BLK0: Alias(s):
          PciRoot(0x0)/Pci(0x1,0x0)/Floppy(0x0)
     BLK1: Alias(s):
          PciRoot(0x0)/Pci(0x1,0x0)/Floppy(0x1)
Shell> dir fs1:
OpenVolume
Open(6EB9F18 <ROOT>, "")
  Reopening <ROOT>
  RET: 6EB9F18
Close(6EB9F18|'/') <ROOT>
GetInfo(6EB9F18|'/', 0) <DIR>
GetInfo(6EB9F18|'/', 600) <DIR>
Close(6EB9F18|'/') <ROOT>
Directory of: fs1:\
OpenVolume
Open(6EB9F18 <ROOT>, ".")
  Reopening <ROOT>
  RET: 6EB9F18
GetInfo(6EB9F18|'/', 0) <DIR>
GetInfo(6EB9F18|'/', 600) <DIR>
Close(6EB9F18|'/') <ROOT>
GetInfo(6EB9F18|'/', 0) <DIR>
GetInfo(6EB9F18|'/', 600) <DIR>
SetPosition(6EB9F18|'/', 0) <DIR>
Read(6EB9F18|'/', 602) <DIR>
Read(6EB9F18|'/', 602) <DIR>
Read(6EB9F18|'/', 602) <DIR>
OpenVolume
Open(6EB9F18 <ROOT>, "\lost+found")
  RET: 6EB9918
Close(6EB9F18|'/') <ROOT>
OpenVolume
Open(6EB9F18 <ROOT>, "\EFI")
  RET: 6EB9518
Close(6EB9F18|'/') <ROOT>
12/12/2014  17:42 <DIR> r           0  lost+found
12/12/2014  17:43 <DIR> r           0  EFI
          0 File(s)           0 bytes
          2 Dir(s)
Close(6EB9918|'/lost+found') 
Close(6EB9518|'/EFI') 
Shell> fs1:
FS1:\> 

Apart from the irrelevant entries related to poking non-existent floppy drives (it would be nice if QEMU's -nodefaults actually removed the floppy controllers as advertised, because old UEFI certainly doesn't seem to see it that way), things are pretty similar, and we do see the relevant BLK device with the ext4 file system being remapped as FS1:.

The one weird thing that I'm seeing in Shell 2.2 is that, when listing the content of FS1:, there is a call to look at the parent directory for /:

Open(6B1CA98 <ROOT>, "..")
Trying to open <ROOT>'s parent

But that doesn't interfere with anything, and since we don't see that call when trying to cd to the drive, I doubt this is relevant. It's just an extra call that was added between Shell 2.0 and Shell 2.2. And of course, you can still access the files as normal by using their full path instead of cd'ing to the drive...

So it's really the cding to a drive that seems to have changed between 2.0 and 2.2 (since we are getting debug messages that we weren't seeing before), and that introduced some kind of incompatibility.

Because efifs is not a priority (sorry) and you can still use the filesystem as long as you don't try to cd to it, it'll probably be a while (read: months) before I start to look into this in earnest...

pbatard commented 4 years ago

One thing I will point out however is that this issue seems to happen regardless of the file system being used. In other words, it's not tied to using ext4 as the file system. For instance, I got the same thing when testing Shell 2.2/UEFI 2.70 against NTFS and exFAT.

pbatard commented 4 years ago

Okay, I think I'm finally starting to see a bit clearer about this.

The problem seems to be the consequence of this code that has been added to recent versions of the Shell and that calls ShellFileExists() on the drive (e.g. ShellFileExists("FS1:")) before setting the drive in the shell.

That code is the reason we are seeing additional output with OpenVolume and stuff in DEBUG mode, because ShellFileExists() is now trying to access the root of the volume to validate that it exists before allowing to switch to the drive.

From what I can see however, this ShellFileExists() is returning Error: Not Found when querying whether a root drive like FS1: exists, and that is why the prompt is never set to FS1: by the Shell.

The thing however is that the calls the Shell makes into GetInfo() do return EFI_SUCCESS, so it looks like the Shell is not finding what it's looking for in the data we return, and I need to investigate a bit further into the code for ShellOpenFileMetaArg() (which is what ShellFileExists() calls behind the scenes). Maybe the issue is something as stupid as the Shell wanting to see a / for the root directory filename returned by GetInfo(), whereas we are returning an empty string or something...

pbatard commented 4 years ago

After extensive jumps through the UEFI Shell rabbit holes, I finally got to the root cause of the issue.

It all boils down to a missing initialization of the size of the Info struct after we zeroed it in FileGetInfo().

Because of this, Info->Size is effectively 0 when we compute tmpLen, resulting in an erroneously large value of tmpLen.

Now, outside of the buffer overflow potential (which is unlikely to be triggered in most cases since we force callers to always allocate a buffer that can hold a path as large as 256 chars), this first bug does not usually result in a major problem because tmpLen gets adjusted in the next call to Utf8ToUtf16NoAllocUpdateLen(), so eventually, we end up populating Info->Size to the value we want... But of course, when the code is buggy in the first place, you can't expect to get lucky all the time: In the case where the string we convert is the empty string (""), which is what we are dealing with when processing the root directory of FS1:, then because there was a second bug in Utf8ToUtf16NoAllocUpdateLen(), we actually end up not updating the length to the one expected for the empty string...

Thus, in that case, the end result is that Info->Size will be set to (unsigned)-1 before we exit, which of course doesn't sit too well with the Shell.

Once you fix the 2 bugs above, then the UEFI Shell happily switches to `FS1:` as expected...
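
As a rough illustration of how the two bugs combine, here is a minimal standalone C model. It is not the EfiFs code: `file_info`, `get_info()`, `ascii_to_utf16_update_len()` and the `fixed` switch are hypothetical names, the converter handles ASCII only, and the exact expressions are assumptions chosen to reproduce the kind of wrap-around described above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NAME_CAPACITY 256 /* hypothetical maximum name length, in UTF-16 units */

/* Hypothetical, simplified stand-in for a file info record. */
typedef struct {
    uint64_t size;                /* total record size in bytes: header + name */
    uint16_t name[NAME_CAPACITY]; /* NUL-terminated UTF-16 name */
} file_info;

#define HEADER_SIZE ((uint64_t)offsetof(file_info, name))

/*
 * ASCII-only stand-in for a "no alloc, update length" converter.
 * In:  *len = bytes available for characters (terminator excluded).
 * Out: *len = bytes written, terminator included.
 * With fixed == 0, the empty string leaves *len untouched (modelled bug).
 */
static int ascii_to_utf16_update_len(const char *src, uint16_t *dst,
                                     uint64_t *len, int fixed)
{
    size_t n = strlen(src);

    /* Independent guard so the demo never writes past name[]. */
    if (n >= NAME_CAPACITY || (uint64_t)n * sizeof(uint16_t) > *len)
        return -1;
    for (size_t i = 0; i < n; i++)
        dst[i] = (unsigned char)src[i];
    dst[n] = 0;
    if (n == 0 && !fixed)
        return 0; /* modelled second bug: length not updated for "" */
    *len = (uint64_t)(n + 1) * sizeof(uint16_t);
    return 0;
}

/* Fills *info for the given name and returns the reported size (0 on error). */
static uint64_t get_info(file_info *info, const char *name, int fixed)
{
    uint64_t len;

    assert(info != NULL && name != NULL);
    memset(info, 0, sizeof(*info));
    if (fixed)
        info->size = sizeof(*info); /* the initialization that was missing */

    /* With size still 0, this unsigned subtraction wraps to a huge value. */
    len = info->size - HEADER_SIZE - sizeof(uint16_t);
    if (ascii_to_utf16_update_len(name, info->name, &len, fixed) != 0)
        return 0;
    info->size = HEADER_SIZE + len;
    return info->size;
}
```

In this model, with `fixed` set to 0, a non-empty name such as "EFI" still reports the expected size because the converter overwrites the bogus length, whereas the empty root name reports a wrapped-around value near the top of the 64-bit range.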

Oh and while I was at it, I also found that we didn't actually allocate the returned string when converting the empty string in the Utf#ToUtf#Alloc() calls in utf8.c, which isn't too great either. Should have spent a bit more time validating that corner cases worked.
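
For the allocation corner case, a minimal sketch of the expected behaviour, assuming a simplified ASCII-only converter (`ascii_to_utf16_alloc()` is a hypothetical name, not the function from utf8.c): the empty string should still yield an allocated buffer that holds just the terminator.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical ASCII-only stand-in for an allocating UTF-8 -> UTF-16
 * converter. Returns a newly allocated, NUL-terminated buffer that the
 * caller must free(), or NULL on allocation failure or non-ASCII input.
 */
static uint16_t *ascii_to_utf16_alloc(const char *src)
{
    size_t n;
    uint16_t *dst;

    assert(src != NULL);
    n = strlen(src);

    /* Always allocate room for the terminator, even when n == 0. */
    dst = malloc((n + 1) * sizeof(*dst));
    if (dst == NULL)
        return NULL;

    for (size_t i = 0; i < n; i++) {
        unsigned char c = (unsigned char)src[i];
        if (c >= 0x80) { /* multi-byte UTF-8 is out of scope for this sketch */
            free(dst);
            return NULL;
        }
        dst[i] = c;
    }
    dst[n] = 0;
    return dst;
}
```

The point of the sketch is only that the empty input takes the same allocation path as any other string, so callers never receive an unallocated pointer for "".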

So that's 3 relatively major bugs identified as the result of this report, which is a very good thing, so I can't thank you enough for reporting it!

I need to clean my patches a little bit, then I will push a commit that automatically closes this issue.

After that, I'll have to see how soon I can publish v1.4 of the EfiFs drivers, that includes this fix.

elFarto commented 4 years ago

Wow, that's excellent work! Glad you managed to get to the bottom of it.

pbatard commented 4 years ago

I've just updated the drivers to v1.4 on the official website, so you should be able to download and test that the issue is fixed.

Please visit https://efi.akeo.ie/ for the downloads.

Real-Jogyi commented 4 years ago

Got the same issue using a Hybrid-boot iso image to boot a VM on VirtualBox v6.1.6r137129 (this is UEFI Shell 2.0 / EDK II / UEFI v2.70):

BDsDxe: failed to load Boot0001 "UEFI VBOX CDROM VB0 1a2b3c4d " from PciRoot(0x0)(Pci(0x1F,0x2)/Sata(0x0,0xFFFF,0x0): Not found

Unfortunately, this is not resolved by loading the drivers iso9660_x64.efi/jfs_x64 from https://efi.akeo.ie: the CDROM FSx: entry is still missing after "map -r". I'm not yet very familiar with UEFI usage. Do you have any idea?

pbatard commented 4 years ago

the CDROM FSx: entry is still missing after "map -r"

Most likely because this has nothing to do with this issue (which I am positive it doesn't, because the cause of this specific issue was identified and fixed) and you are simply not using the right file system driver to access your CD-ROM device.

Can you tell what file system is actually being used by that CD/DVD image? Please be aware that ISO9660 is not the only file system that can be used for an optical disc (for instance your disc might be UDF only), so if you load the wrong file system driver, of course map -r will fail. And since you are using VirtualBox, are you sure the disc image is properly mounted? Finally, the fact that you mention that it's an ISOHybrid leads me to think that it could have been improperly mastered. Did you create that ISOHybrid yourself? If not, where can it be downloaded?
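
One way to answer the file system question from the host side is to look at the descriptors that start at sector 16 of the image. The following is a minimal sketch under stated assumptions: 2048-byte sectors, an image already loaded in memory, and hypothetical names (`detect_optical_fs()`, the `FS_*` flags, `MAX_DESCRIPTORS`). It only checks the standard identifiers ("CD001" for ISO9660, "NSR02"/"NSR03" for UDF) and does not validate anything else.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SECTOR_SIZE 2048u
#define FIRST_DESCRIPTOR_SECTOR 16u
#define MAX_DESCRIPTORS 64u /* arbitrary scan limit chosen for this sketch */

enum { FS_NONE = 0, FS_ISO9660 = 1, FS_UDF = 2 };

/*
 * Scans the descriptors starting at sector 16 and returns a bitmask of
 * the file systems whose standard identifiers were seen. Each descriptor
 * carries a 5-byte identifier at byte offset 1 of its sector.
 */
static unsigned detect_optical_fs(const unsigned char *image, size_t size)
{
    unsigned found = FS_NONE;

    assert(image != NULL);
    for (size_t i = 0; i < MAX_DESCRIPTORS; i++) {
        size_t off = (FIRST_DESCRIPTOR_SECTOR + i) * (size_t)SECTOR_SIZE;
        const unsigned char *id;

        /* Stop when the next full sector would fall outside the image. */
        if (size < SECTOR_SIZE || off > size - SECTOR_SIZE)
            break;
        id = image + off + 1;

        if (memcmp(id, "CD001", 5) == 0)
            found |= FS_ISO9660;
        else if (memcmp(id, "NSR02", 5) == 0 || memcmp(id, "NSR03", 5) == 0)
            found |= FS_UDF;
        else if (memcmp(id, "BEA01", 5) != 0 && memcmp(id, "TEA01", 5) != 0)
            break; /* unknown identifier: end of the descriptor sequence */
    }
    return found;
}
```

A result of `FS_UDF` alone would suggest that an ISO9660 driver has nothing to mount, while `FS_NONE` would point at a mastering or mounting problem rather than a driver one.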

Real-Jogyi commented 4 years ago

Filesystem is a Paragon Disk Manager v17 recovery medium with iso9660 level 4 + Joliet fs, dual boot (BIOS+UEFI). Remastered after customizing with mkisofs v3.02a09 and the following parameters:

mkisofs -r -V "PARAGON17" -cache-inodes -J -l -iso-level 4 \
  -b boot/x86_64/loader/isolinux.bin -no-emul-boot -boot-load-size 4 \
  -c boot/x86_64/loader/boot.cat -boot-info-table \
  -eltorito-alt-boot \
  -eltorito-platform "efi" -b EFI/BOOT/BOOTx64.EFI -no-emul-boot \
  -o "$DEST/$DISO" .

I have no idea what could be wrong there, because 1. the pen stick generated from that iso boots correct and 2. I couldn't find out either the parameters or the software that Paragon media builder uses during its remastering. With regard to the mounting in VB, I think it's done properly (because switching off the EFI support leads to a correct BIOS boot).

pbatard commented 4 years ago

Remastered after customizing

What happens with the non customized version?

the pen stick generated from that iso boots correct

The pen stick will not use ISO9660 to boot and you are playing with ISOhybrids, which are dual filesystem media, where it's easy to screw up one file system.

Unless you can come up with a pure ISO9660 ISO that can't be mounted, I'm just going to dismiss that issue as a mastering problem.

Real-Jogyi commented 4 years ago

  1. The non customized version boots without problems
  2. I am really grateful for your help, but I tend to disagree, because
    • I "burned" (some days ago) the .iso without the Joliet extension and with isolevel 1 --> same error using EFI boot
    • both this pure iso version and the remastered hybrid boot without problems using BIOS.

What can go wrong with an .iso image so that EFI-only is not able to process it? Maybe a bug in the recent mkisofs release used? I'll try again with an older version.

Real-Jogyi commented 4 years ago

Supplement: Neither Debian-based systems nor Cygwin64 ship a native mkisofs. An .iso generated with genisoimage v1.11 triggers the same known error (the "-eltorito-platform" parameter is not implemented in genisoimage).