andreiw / RaspberryPiPkg

DEPRECATED - DO NOT USE | Go here instead ->
https://github.com/tianocore/edk2-platforms/tree/master/Platform/RaspberryPi/RPi3

[RFC] Submitting RaspberryPiPkg to edk2-platforms #88

Closed pbatard closed 5 years ago

pbatard commented 5 years ago

Hi Andrei,

I am planning to submit your RaspberryPiPkg for integration into edk2-platforms, since I think it's gotten stable enough to try to officialize things and I am also hoping that having an inexpensive and widely popular ARM platform there will help getting more people interested in EDK2/UEFI development.

Before submitting your code for integration however, I want to give everyone interested an opportunity to comment on the proposal, a version of which can be found at https://github.com/pbatard/RaspberryPiPkg

Now, to try to keep things short, the points of interest I would like to bring up on that proposal are the following:

At this stage, depending on the feedback provided here, I am aiming to start submitting patches for edk2-platforms integration around the middle of next week.

I'll conclude by saying a massive THANKS to Andrei and everybody who contributed to making a usable UEFI firmware for the Raspberry Pi a reality, which is something I have been wanting to see for a long time. Hopefully, we can now push it to the next step and make its formal integration into EDK2 a reality...

andreiw commented 5 years ago

Hi Pete,

Sounds good to me. I defer to you and the edk2-platforms developers as to the right way to upstream. Also, thanks for the suggestions/fixes mentioned. As far as DSDT.offset.h, it's not used today, but will be once the ACPI code becomes a bit more complex.

Pretty much everything in edk2Patches is cosmetic, and to a degree, is probably a hack that can/should be done better.

Let me know when, I'll reference the upstreamed port, guiding people to use that instead when it is ready. I then intend on maintaining this repo as the more "experimental" tree (hopefully rebased to the upstream soon ;-)).

Very excited to see this happen.

A

pbatard commented 5 years ago

Great! Glad to see you onboard. 😄

Since this will be my first submission to edk2-platforms (I've only submitted to edk2 before), and I'm still very much learning about all the various elements that make up the RaspberryPiPkg, I expect there will be some back and forth before it gets integrated. So I may have to rely on you for help.

I'll definitely be posting here, with links to the relevant edk2 mailing list threads, once I get the patches submitted.

Also, one thing I'm planning to do before sending patches is drop the leap422.dtb and use the official .dtb's provided from here under DeviceTree/. The reasoning behind that is that the firmware should probably be as generic as possible, and also, if specific distros want to see a .dtb altered, they'll probably send a request against the official raspberrypi/firmware repo, so we're probably better off using that content as base. Now, as far as I can see, the license under which these files are placed means that DeviceTree/ will need to move to edk2-non-osi, but that's no big deal.

Right now, my plan is to use bcm2710-rpi-3-b.dtb as default, and make sure to document in the Readme how users can override it with another .dtb, such as bcm2710-rpi-3-b-plus.dtb, through the device_tree=... option in the config file.

Of course, before I carry out these changes, I'm planning to validate that using bcm2710-rpi-3-b.dtb does indeed work as expected, so I'll try to confirm that with an Ubuntu installation, which means it may be a few more days before I start submitting patches.

andreiw commented 5 years ago

Sounds good to me.

Googulator commented 5 years ago

While you are replacing leap422.dtb, would it be possible to provide the option to supply no DTB at all, to force ACPI? This is not currently supported in mainline Linux, but I intend to work on making ACPI boot possible in Linux (which requires passing no DTB at all, as Linux never uses ACPI if a DTB is present).

pbatard commented 5 years ago

I'd rather go with the principle of least astonishment, and provide a default Device Tree, since this is what most modern distros seem to use and expect, and I suspect not providing one by default will go against the majority's expectation. The way I see it, once/if the proposal is accepted, people will most likely just pick the default binary .fd, compiled with the vanilla Pkg options, and fully expect it to include basic features such as ATF + a default Device Tree (which is also why I am not planning to remove ATF from the proposal either, even if I could do so).

Now, if you have data that seems to indicate that the majority of people who are going to install Linux on Pi through UEFI will not want to have a Device Tree provided by default (i.e. without having to go through a separate file), I may reconsider. But that's certainly not the picture I have at the moment, especially after having tested various distros.

One thing that can probably be done however, to help people who don't want a DT, is to have a Pkg compilation option to remove it, but that would not be enabled by default. I may even merge that with an option I'm considering, that would select which of the 3B or 3B+ .dtb should be integrated at compilation time (through something like -D ADD_DTB=[3B|3B_PLUS|NONE]).

Now, this option is something I am planning to look into after I have submitted a first proposal, since I expect the integration process to be slow and I therefore want to initiate it ASAP, with what I see as a minimal viable starting point.

Speaking of which, even as I think I have now sorted out the use of the default official Device Trees (after some patching of the USB section, without which USB keyboards didn't seem to work at all during the Ubuntu installation process), I very much still have a major showstopper, in that the current firmware I generate seems to produce Synchronous Exception at 0x000000003A71#### and freeze when rebooting from Linux, whereas the one recompiled from Andrei's tree on the same platform (with a reverted + patched EDK2) doesn't.

So until I have sorted this, I cannot submit the proposal to EDK2, since a firmware that freezes on reboot is pretty much useless...

Of course, I'm still looking into this last issue, and I'm hoping to have a fix for it soon enough, one way or another, especially as, if I compile my repo after applying Andrei's patches against a reverted EDK2, I can confirm that the problem does go away. So my guess is that, unlike what I thought, one of the EDK2 patches might be needed after all (and, from what I can tell right now, it doesn't seem to be 0002 or 0003, but I still need to look into it further)...

pbatard commented 5 years ago

Shoot, I was hoping this was due to the patches, but it's with the edk2 itself...

As long as I'm compiling with edk2@989f7a2c... (with or without the patches applied), everything is fine, but with latest edk2 (with or without the patches) I get the Synchronous Exception on reboot.

I'm going to have to bisect the whole slew of EDK2 changes that intervened between 2018.05.11 and now, to try to figure out where they introduced this breaking change, so this might take a while...

And unfortunately, I can't seem to be able to use the DEBUG version of the firmware to get a better idea of what triggers the exception, as the DEBUG version freezes the Pi even earlier, when the Linux reboot process is about to reset the CPU (and you don't even get an exception report over serial then).

pbatard commented 5 years ago

Oh man, what a wild ride! It turns out the issue had mostly to do with using a GCC 6+ AARCH64 toolchain, which is the default I was using since it's the one that Debian 9.5 will install (but I also confirmed the same with Linaro's 7.x as well).

I won't go into the details here, because I'm not sure I understand half the reasons behind the various weird behaviours I observed.

However, what I can state is that, when using Linaro's GCC 5.5 toolchain, with the current repo from pbatard/RaspberryPiPkg, along with an unpatched latest edk2, then the firmware does behave as expected (i.e. in the same manner as the Oct 1st binary from Andrei), and, even as I tried my hardest, I have not seen a single freeze on Linux reboot.

So I guess that means I should now be able to prepare a set of patches to submit to the EDK2, which I'll probably do at the beginning of next week.

pbatard commented 5 years ago

Okay, after some more testing, it turns out that patch 0005 (Ax88772b-not-a-runtime-driver) is needed after all, to prevent another potential firmware freeze. And you also don't want to go overboard with the common-page-size= options in the .dsc. I have updated my repo (which I also rebased against Andrei's latest changes) to reflect that.

pbatard commented 5 years ago

Change of plan.

I said I wouldn't touch ATF, but after some consideration, I reckon that, since rpi3 has been integrated as a platform in the official ATF repo, the EDK2 people are likely going to require the ATF we use to be current. Besides, since the goal is to provide an inexpensive UEFI platform for people to experiment with, and even if we know that ATF is kinda useless on the Pi in terms of providing much trust, it does make sense to ensure that people can play with the latest ATF if they want.

Which means I have spent the last few days trying to sort out using the very latest ATF with the UEFI firmware. As far as Linux is concerned, I think I have squared things up. But I think I'm going to need some help with Windows, as ATF 2.0 doesn't seem to enable boot (and trying to update the most likely needed patch didn't produce much of anything).

Now, to cut a long story short, I'll paste my notes regarding getting ATF to work with the UEFI firmware, and especially with regards to the various commits that broke (or fixed) things:

As a result of the latest ATF mappings, the UEFI proposal sets the following regions:

0x00000000 +-----------------+
           |       BL1       |
0x00010000 +-----------------+
           |       DTB       | (if provided by config.txt)
0x00020000 +-----------------+
           |       FIP       |
0x00030000 +-----------------+
           |                 |
           |       UEFI      |
           |                 |
           |       ...       |

These changes are mostly the result of trying to stay close to the default mapping used by the ATF rpi3 implementation, but I also think having 64 KB breathing room for each section may help, especially as the default device tree binaries are already flirting with the 32 KB...

For details on how to build the ATF from latest, for use with the UEFI firmware, see https://github.com/pbatard/RaspberryPiPkg/tree/master/Binary.

Now of course, the one major drawback of using the latest ATF is that Windows on ARM doesn't boot. Or rather, it does start to boot but freezes/goes into an endless loop somewhere during that process. If you feel inclined to help, I could probably use some, as it looks like figuring out what Windows is really unhappy about is going to be tricky to crack. But if we can do that, I will certainly try to submit the relevant patches to ATF, so that we can then use the vanilla version of ATF all the way.

But that also means that, until we get Windows + latest ATF sorted out, I'm going to delay submitting anything to edk2-platforms...

andreiw commented 5 years ago

Can you talk more about what you see with booting Windows on your ATF? Screenshots, output from HypDxe, anything...

I'm assuming PSCI CPU_ON actually works now on upstream ATF and you've looked at https://github.com/andreiw/raspberry-pi3-atf/commit/3565e7390b6e67d8e589a8586f9bf527c980d214?

pbatard commented 5 years ago

Can you talk more about what you see with booting Windows on your ATF? Screenshots, output from HypDxe, anything...

I haven't instrumented anything yet (still sorting out the ATF serial issue), so all I've seen so far is a black screen.

I'm assuming PSCI CPU_ON actually works now on upstream ATF

No idea. Currently, I'm assuming it doesn't.

and you've looked at andreiw/raspberry-pi3-atf@3565e73?

Yeah, that's the "most likely needed patch" I mentioned, which I've tried to apply in 2 different ways (since they've added a bunch of stuff in cm_prepare_el3_exit()), but that didn't seem to change anything. Right now, I strongly suspect something else is needed...

pbatard commented 5 years ago

Windows boot failing (ATF 2.0):

Probing for DBGKD_GET_VERSION64...
Searching for DBGKD_GET_VERSION64...
Detected arm64fre build 17134, NT base = 0xFFFFF80039B80000
Matched DBGKD_GET_VERSION64 @ FFFFF80039EA5F40 relative to FFFFF80039EA5F80
0xFFFFF8003A480048: Forwarding SMC(0) 84000000 0 0 0

Windows boot working (ATF 1.3):

Probing for DBGKD_GET_VERSION64...
Searching for DBGKD_GET_VERSION64...
Detected arm64fre build 17134, NT base = 0xFFFFF803C7DF0000
Matched DBGKD_GET_VERSION64 @ FFFFF803C8115F40 relative to FFFFF803C8115F80
0xFFFFF803C7A8F048: Forwarding SMC(0) 84000000 0 0 0
0xFFFFF803C7A8F048: Forwarding SMC(0) 8400000A C4000001 0 0
0xFFFFF803C7A8F048: Forwarding SMC(0) 8400000A 8400000B 0 0
0xFFFFF803C7A8F048: Forwarding SMC(0) 8400000A C400000C 0 0
0xFFFFF803C7A8F048: Forwarding SMC(0) 8400000A 8400000F 0 0
0xFFFFF803C7A8F048: Forwarding SMC(0) 8400000A C4000010 0 0
0xFFFFF803C7A8F048: Forwarding SMC(0) 8400000A C4000011 0 0
CPU_ON for core 1 0x39E76010(0x40C000)
Remaining pages: 8
4096 of EL2 stack at 33D0D000
CPU_ON for core 2 0x39E76010(0x40D000)
Remaining pages: 7
4096 of EL2 stack at 33D0E000
CPU_ON for core 3 0x39E76010(0x40E000)
Remaining pages: 6
4096 of EL2 stack at 33D0F000

By the way, looks like I spoke too soon when I said that I had a workaround for 0002-BaseTools-tools_def-support-ASLC-files-on-AArch64.patch. Looks like this fix is still needed, but only for DEBUG and not RELEASE. So that's one more thing I'll need to properly sort out before submitting patches to the EDK2.

andreiw commented 5 years ago

Can you instrument HypSMCProcess and see what is being returned from the forwarded calls?

A

(Sent by email in reply to Pete Batard's comment of 13 Nov 2018, 11:28.)

pbatard commented 5 years ago

Not much luck so far. I instrumented both HypExceptionHandler() and HypSMCProcess() to dump the SystemContext data, but it looks like we're only calling into HypSMCProcess() once:

FSOpen: Open '\EFI\BOOT\BOOTAA64.EFI' Success
[Bds] Expand VenHw(58ABD787-F64D-4CA2-A034-B9AC2D5AD0CF) -> VenHw(58ABD787-F64D-4CA2-A034-B9AC2D5AD0CF)/HD(1,MBR,0x0281D524,0x800,0x40000)/\EFI\BOOT\BOOTAA64.EFI
[Security] 3rd party image[0] can be loaded after EndOfDxe: VenHw(58ABD787-F64D-4CA2-A034-B9AC2D5AD0CF)/HD(1,MBR,0x0281D524,0x800,0x40000)/\EFI\BOOT\BOOTAA64.EFI.
InstallProtocolInterface: 5B1B31A1-9562-11D2-8E3F-00A0C969723B 365A53C0
add-symbol-file bootmgfw.pdb 0x10000400
Loading driver at 0x00010000000 EntryPoint=0x0001001B3F0 bootmgfw.efi
FSOpen: Open 'RPI_EFI.FD' Success
Variables dumped!
InstallProtocolInterface: BC62157E-3E33-4FEC-9920-2D3B36D750DF 365A6698
ProtectUefiImageCommon - 0x365A53C0
  - 0x0000000010000000 - 0x0000000000177000
ConvertPages: Incompatible memory types, the pages to allocate have been allocated
SetUefiImageMemoryAttributes - 0x0000000037260000 - 0x0000000000040000 (0x0000000000000008)
SetUefiImageMemoryAttributes - 0x0000000033EF0000 - 0x0000000000040000 (0x0000000000000008)
SetUefiImageMemoryAttributes - 0x0000000033EA0000 - 0x0000000000040000 (0x0000000000000008)
SetUefiImageMemoryAttributes - 0x0000000033E30000 - 0x0000000000040000 (0x0000000000000008)
SetUefiImageMemoryAttributes - 0x0000000033D90000 - 0x0000000000040000 (0x0000000000000008)
SetUefiImageMemoryAttributes - 0x0000000033CE0000 - 0x0000000000050000 (0x0000000000000008)
SetUefiImageMemoryAttributes - 0x0000000033BB0000 - 0x0000000000040000 (0x0000000000000008)
SetUefiImageMemoryAttributes - 0x0000000033B10000 - 0x0000000000040000 (0x0000000000000008)
HypExceptionHandler: ELR, ESR = (372706A0, 93C0804F), X0-X3 = (FFFFF80081464BE7 FFFFF8004D770000 33CF4000 33CF4BE7)
HypExceptionHandler: ELR, ESR = (372706A0, 93C0804F), X0-X3 = (FFFFF80081464BEF FFFFF8004D770000 33CF4000 33CF4BEF)
HypExceptionHandler: ELR, ESR = (372706A0, 93C0804F), X0-X3 = (FFFFF80081464C05 FFFFF8004D770000 33CF4000 33CF4C05)
HypExceptionHandler: ELR, ESR = (372706A0, 93C0804F), X0-X3 = (FFFFF80081464C1C FFFFF8004D770000 33CF4000 33CF4C1C)
HypExceptionHandler: ELR, ESR = (372706A0, 93C0804F), X0-X3 = (FFFFF80081464C32 FFFFF8004D770000 33CF4000 33CF4C32)
HypExceptionHandler: ELR, ESR = (372706A0, 93C0804F), X0-X3 = (FFFFF80081464C4B FFFFF8004D770000 33CF4000 33CF4C4B)
HypExceptionHandler: ELR, ESR = (372706A0, 93C0804F), X0-X3 = (FFFFF80081464C5E FFFFF8004D770000 33CF4000 33CF4C5E)
HypExceptionHandler: ELR, ESR = (372706A0, 93C0804F), X0-X3 = (FFFFF80081464C69 FFFFF8004D770000 33CF4000 33CF4C69)
HypExceptionHandler: ELR, ESR = (372706A0, 93C0804F), X0-X3 = (FFFFF80081464C7B FFFFF8004D770000 33CF4000 33CF4C7B)
HypExceptionHandler: ELR, ESR = (372706A0, 93C0804F), X0-X3 = (FFFFF80081464C87 FFFFF8004D770000 33CF4000 33CF4C87)
HypExceptionHandler: ELR, ESR = (372706A0, 93C0804F), X0-X3 = (FFFFF80081464C3A FFFFF8004D770000 33CF4000 33CF4C3A)
HypExceptionHandler: ELR, ESR = (372706A0, 93C0804F), X0-X3 = (FFFFF80081464C97 FFFFF8004D770000 33CF4000 33CF4C97)
HypExceptionHandler: ELR, ESR = (372706A0, 93C0804F), X0-X3 = (FFFFF80081464CA1 FFFFF8004D770000 33CF4000 33CF4CA1)
HypExceptionHandler: ELR, ESR = (372706A0, 93C0804F), X0-X3 = (FFFFF80081464CAE FFFFF8004D770000 33CF4000 33CF4CAE)
HypExceptionHandler: ELR, ESR = (372706A0, 93C0804F), X0-X3 = (FFFFF80081464CBE FFFFF8004D770000 33CF4000 33CF4CBE)
HypExceptionHandler: ELR, ESR = (372706A0, 93C0804F), X0-X3 = (FFFFF80081464CCF FFFFF8004D770000 33CF4000 33CF4CCF)
HypExceptionHandler: ELR, ESR = (372706A0, 93C0804F), X0-X3 = (FFFFF80081464CDE FFFFF8004D770000 33CF4000 33CF4CDE)
HypExceptionHandler: ELR, ESR = (372706A0, 93C0804F), X0-X3 = (FFFFF80081464CEA FFFFF8004D770000 33CF4000 33CF4CEA)
HypExceptionHandler: ELR, ESR = (372706A0, 93C0804F), X0-X3 = (FFFFF80081464CF3 FFFFF8004D770000 33CF4000 33CF4CF3)
HypExceptionHandler: ELR, ESR = (372706A0, 93C0804F), X0-X3 = (FFFFF80081464D01 FFFFF8004D770000 33CF4000 33CF4D01)
HypExceptionHandler: ELR, ESR = (372706A0, 93C0804F), X0-X3 = (FFFFF80081464D0B FFFFF8004D770000 33CF4000 33CF4D0B)
HypExceptionHandler: ELR, ESR = (372706A0, 93C0804F), X0-X3 = (FFFFF80081464D19 FFFFF8004D770000 33CF4000 33CF4D19)
HypExceptionHandler: ELR, ESR = (372706A0, 93C0804F), X0-X3 = (FFFFF80081464D25 FFFFF8004D770000 33CF4000 33CF4D25)
HypExceptionHandler: ELR, ESR = (372706A0, 93C0804F), X0-X3 = (FFFFF80081464D30 FFFFF8004D770000 33CF4000 33CF4D30)
HypExceptionHandler: ELR, ESR = (372706A0, 93C0804F), X0-X3 = (FFFFF80081464D39 FFFFF8004D770000 33CF4000 33CF4D39)
HypExceptionHandler: ELR, ESR = (372706A0, 93C0804F), X0-X3 = (FFFFF80081464D45 FFFFF8004D770000 33CF4000 33CF4D45)
HypExceptionHandler: ELR, ESR = (372706A0, 93C0804F), X0-X3 = (FFFFF80081464D55 FFFFF8004D770000 33CF4000 33CF4D55)
HypExceptionHandler: ELR, ESR = (372706A0, 93C0804F), X0-X3 = (FFFFF80081464D5D FFFFF8004D770000 33CF4000 33CF4D5D)
HypExceptionHandler: ELR, ESR = (372706A0, 93C0804F), X0-X3 = (FFFFF80081464D68 FFFFF8004D770000 33CF4000 33CF4D68)
HypExceptionHandler: ELR, ESR = (372706A0, 93C0804F), X0-X3 = (FFFFF80081464D73 FFFFF8004D770000 33CF4000 33CF4D73)
HypExceptionHandler: ELR, ESR = (372706A0, 93C0804F), X0-X3 = (FFFFF80081464D82 FFFFF8004D770000 33CF4000 33CF4D82)
HypExceptionHandler: ELR, ESR = (372706A0, 93C0804F), X0-X3 = (FFFFF80081464D97 FFFFF8004D770000 33CF4000 33CF4D97)
HypExceptionHandler: ELR, ESR = (372706A0, 93C0804F), X0-X3 = (FFFFF80081464DAA FFFFF8004D770000 33CF4000 33CF4DAA)
HypExceptionHandler: ELR, ESR = (372706A0, 93C0804F), X0-X3 = (FFFFF80081464DB4 FFFFF8004D770000 33CF4000 33CF4DB4)
HypExceptionHandler: ELR, ESR = (372706A0, 93C0804F), X0-X3 = (FFFFF80081464DC1 FFFFF8004D770000 33CF4000 33CF4DC1)
HypExceptionHandler: ELR, ESR = (372706A0, 93C0804F), X0-X3 = (FFFFF80081464DCF FFFFF8004D770000 33CF4000 33CF4DCF)
HypExceptionHandler: ELR, ESR = (372706A0, 93C0804F), X0-X3 = (FFFFF80081464DDD FFFFF8004D770000 33CF4000 33CF4DDD)
HypExceptionHandler: ELR, ESR = (372706A0, 93C0804F), X0-X3 = (FFFFF80081464DE9 FFFFF8004D770000 33CF4000 33CF4DE9)
HypExceptionHandler: ELR, ESR = (372706A0, 93C0804F), X0-X3 = (FFFFF80081464DFA FFFFF8004D770000 33CF4000 33CF4DFA)
HypExceptionHandler: ELR, ESR = (FFFFF8007F3A0048, 5E000000), X0-X3 = (84000000 0 0 0)
Probing for DBGKD_GET_VERSION64...
Searching for DBGKD_GET_VERSION64...
Detected arm64fre build 17134, NT base = 0xFFFFF8007EAA0000
Matched DBGKD_GET_VERSION64 @ FFFFF8007EDC5F40 relative to FFFFF8007EDC5F80
HypSMCProcess: ELR, ESR = (FFFFF8007F3A0048, 5E000000), X0-X3 = (84000000 0 0 0)
0xFFFFF8007F3A0048: Forwarding SMC(0) 84000000 0 0 0

By the way, if you want the DEBUG version to compile without getting the "AARCH64 small code model requires identical ELF and PE/COFF section offsets" error, and without patching EDK2, you can simply add GCC:DEBUG_*_AARCH64_CC_FLAGS = -mcmodel=tiny in AcpiTables/AcpiTables.inf.

pbatard commented 5 years ago

Some more information that may be relevant:

So I guess there is something in what Ard did that has not been carried into the official rpi3 platform support...

andreiw commented 5 years ago

Another dumb question... does it return back to HypSMCProcess from the ATF PSCI handler? What is the return value compared to the ATF that works?

thchi12 commented 5 years ago

Could you try RS5 builds higher than 17723 and boot in el2? These builds should have official support for RPi3 boot.

andreiw commented 5 years ago

@thchi12 I don't think that's what it is... if anything, HypDxe lets us monitor what's going on between Win<->ATF without breaking out a J-Link...

pbatard commented 5 years ago

I think I have sorted it out now. Seems it all had to do with using a memory mapping that Windows did not like for the ATF (whereas Linux doesn't seem to care that much).

With the following in platform_def.h, the very latest ATF from git, as well as earlier versions, seem to keep Windows happy 😄:

#define SEC_ROM_BASE            ULL(0x00000000)
#define SEC_ROM_SIZE            ULL(0x00010000)

#define PLAT_RPI3_FIP_BASE      ULL(0x00020000)
#define PLAT_RPI3_FIP_MAX_SIZE  ULL(0x00010000)

#define NS_DRAM0_BASE           ULL(0x00400000)
#define NS_DRAM0_SIZE           ULL(0x00c00000)

#define SEC_SRAM_BASE           ULL(0x00200000)
#define SEC_SRAM_SIZE           ULL(0x00050000)

#define SEC_DRAM_BASE           ULL(0x00250000)
#define SEC_DRAM_SIZE           ULL(0x001b0000)

I guess I should have tried to stay closer to Andrei's settings, as it appears I got into some kind of hybrid memory mapping, from trying to keep as much of the ATF defaults as I could.

Once https://github.com/ARM-software/arm-trusted-firmware/pull/1680 has been integrated into ATF (fixes Ubuntu reboot), I'll build new ATF binaries with the updated memory map, and add these to my repo, so that we have a version of the UEFI firmware that can be submitted to edk2-platforms.

I may still run some tests with Secure Boot, as well as see if there's any way we can get what we need from ATF without having to dirty the repo (otherwise you get stuff like NOTICE: BL1: v2.0(release):v2.0-250-g9793e035-dirty on the console), but unless I missed something, I think, at long last, we're finally in a position to submit something to edk2-platforms...

andreiw commented 5 years ago

I'm glad you have it sorted out. It is pretty easy to have the edk2 view of where the reserved memory is go out of sync with what ATF is using, and that has caused mischief in the past.

pbatard commented 5 years ago

Yeah, the problem is that ATF is really geared towards U-Boot, and their memory mapping preferences seem to be quite different from ours. Plus, they didn't use the same mapping as Ard's proposal when they introduced the platform and changed it further between 1.5 and 1.6, as well as post 2.0, so every time I was trying to apply minimal changes, to keep close to vanilla ATF, I was getting nowhere...

Now, the other good thing about being able to use latest ATF is that I don't seem to require patching edk2 for Ax88772b / runtime driver any more, which is good news.

I also just received the Pi3 model B+ that I ordered a couple weeks ago (so far I had only been testing with a Pi3 model B). The bad news is that vanilla ARM64 Ubuntu seems to have trouble with the wired ethernet driver, so network install does not work, and you also get no eth0 on the system. But that has nothing to do with the UEFI firmware (other people are also reporting this, who aren't using UEFI) and seems to be a consequence of the lan78xx Ubuntu driver needing an update. Apart from that, everything looks good and especially, I've been able to validate that my slightly modified B+ device tree works as expected.

I think I may still take the rest of the week to properly test and clean everything up, and then send patches to the EDK2 at the beginning of next week.

andreiw commented 5 years ago

Oh man, what a wild ride! It turns out the issue had mostly to do with using a GCC 6+ AARCH64 toolchain, which is the default I was using since it's the one that Debian 9.5 will install (but I also confirmed the same with Linaro's 7.x as well).

I won't go into the details here, because I'm not sure I understand half the reasons behind the various weird behaviours I observed.

However, what I can state is that, when using Linaro's GCC 5.5 toolchain, with the current repo from pbatard/RaspberryPiPkg, along with an unpatched latest edk2, then the firmware does behave as expected (i.e. in the same manner as the Oct 1st binary from Andrei), and, even as I tried my hardest, I have not seen a single freeze on Linux reboot.

So I guess that means I should now be able to prepare a set of patches to submit to the EDK2, which I'll probably do at the beginning of next week.

Btw, I did want to come back to this... GCC6 will need new tools definitions? What were you observing? I've never been happy that there is no real stability to GCC options, i.e. what may work with GCC4 doesn't with GCC5 etc.

pbatard commented 5 years ago

GCC6 will need new tools definitions?

Possibly, though using the GCC5 ones with GCC6 seems to work fine most of the time. That's for the EDK2 people to tell.

What were you observing?

Intermittent Synchronous Exceptions when ATF was handing over to UEFI, resulting in complete platform freeze. Might work fine 5-6 times in a row, and then you'd get the exception, which made it very troublesome to find whether a build was really working or not.

I may try again with GCC6 now that we have an up to date ATF, to see if it changes things. But I'll most likely do that after I have started submitting the patches, as there's not much to be gained from not using the EDK2 recommended GCC version.

andreiw commented 5 years ago

Interesting. I have observed something similar (but don't remember if those were gcc49 or gcc5 builds). I think in my case I attributed the failures to hw glitching after power cycling it too fast (e.g. unplugging and plugging the micro USB real fast).

A

(Sent by email in reply to Pete Batard's comment of 14 Nov 2018, 19:37.)

pbatard commented 5 years ago

Well, I'm using a USB connector with an on/off switch (and a 3A PSU). I too thought that it may have to do with transient power supply issues at first, but I never ever saw those issues when using your Oct 1st firmware (as well as my recent builds once I started to use Linaro's GCC 5.5), whereas I always got the exceptions within 5-10 (re)boots. So it doesn't really fit the model of a hardware problem.

Anyway, just to report that I have re-tested a full Windows 10 1803 (17134) installation using the finalized version of what I propose to submit to edk2-platforms, and everything behaved the same as your Oct 1st binary (minus the minor stuff I haven't added yet, such as the Windows logo).

On the other hand, I'm not seeing any success with Windows 10 1809 (17763), with any firmware so far, which I assume is due to Microsoft having altered stuff that we need. Especially I seem to be getting BSOD with SYSTEM_THREAD_EXCEPTION_NOT_HANDLED or DRIVER_UNLOADED_WITHOUT_CANCELLING_PENDING_OPERATIONS with WppRecorder.sys somewhere along the boot process. I figure I'll leave it to the community to try to sort that one out.

For reference, below is the final memory mapping I have settled on for ATF, and that seems to work just fine for both Windows and Linux:

    0x00000000 +-----------------+
               |       ROM       | BL1
    0x00010000 +-----------------+
               |       DTB       | (Loaded by the VideoCore)
    0x00020000 +-----------------+
               |       FIP       |
    0x00030000 +-----------------+
               |                 |
               |  UEFI PAYLOAD   |
               |                 |
    0x00200000 +-----------------+
               |   Secure SRAM   | BL2, BL31
    0x00300000 +-----------------+
               |   Secure DRAM   | BL32 (Secure payload)
    0x00400000 +-----------------+
               |                 |
               |                 |
               | Non-secure DRAM | BL33
               |                 |
               |                 |
    0x01000000 +-----------------+
               |                 |
               |       ...       |
               |                 |
    0x3F000000 +-----------------+
               |       I/O       |

I have therefore submitted one last pull request (ARM-software/arm-trusted-firmware#1685) to ATF, so that we should be able to use vanilla builds of bl1.bin and fip.bin.
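
As an illustration only, the memory map above can be written down as a small C table with compile-time checks on the region sizes. The macro names, the `region_t` type and `rpi3_find_region()` are hypothetical names made up for this sketch; they are not the identifiers used by ATF. The I/O window is left out of the table because the diagram does not give its upper bound.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Boundaries copied from the memory map above.
 * All names here are hypothetical; they are not ATF's own macros. */
#define RPI3_ROM_BASE       0x00000000u  /* BL1 */
#define RPI3_DTB_BASE       0x00010000u  /* loaded by the VideoCore */
#define RPI3_FIP_BASE       0x00020000u
#define RPI3_UEFI_BASE      0x00030000u  /* UEFI payload */
#define RPI3_SEC_SRAM_BASE  0x00200000u  /* BL2, BL31 */
#define RPI3_SEC_DRAM_BASE  0x00300000u  /* BL32 (secure payload) */
#define RPI3_NS_DRAM_BASE   0x00400000u  /* BL33 */
#define RPI3_NS_DRAM_END    0x01000000u
#define RPI3_IO_BASE        0x3F000000u  /* upper bound not given in the map */

/* Sanity checks on the sizes implied by the diagram. */
static_assert(RPI3_SEC_SRAM_BASE - RPI3_UEFI_BASE == 0x1D0000u,
              "UEFI payload window should be 1856 KiB");
static_assert(RPI3_NS_DRAM_END - RPI3_NS_DRAM_BASE == 12u * 1024u * 1024u,
              "non-secure DRAM window should be 12 MiB");

typedef struct {
    const char *name;
    uint32_t    base;  /* inclusive */
    uint32_t    end;   /* exclusive */
} region_t;

static const region_t rpi3_map[] = {
    { "ROM (BL1)",              RPI3_ROM_BASE,      RPI3_DTB_BASE      },
    { "DTB",                    RPI3_DTB_BASE,      RPI3_FIP_BASE      },
    { "FIP",                    RPI3_FIP_BASE,      RPI3_UEFI_BASE     },
    { "UEFI payload",           RPI3_UEFI_BASE,     RPI3_SEC_SRAM_BASE },
    { "Secure SRAM (BL2/BL31)", RPI3_SEC_SRAM_BASE, RPI3_SEC_DRAM_BASE },
    { "Secure DRAM (BL32)",     RPI3_SEC_DRAM_BASE, RPI3_NS_DRAM_BASE  },
    { "Non-secure DRAM (BL33)", RPI3_NS_DRAM_BASE,  RPI3_NS_DRAM_END   },
};

/* Return the firmware region containing addr, or NULL if addr lies
 * above the regions listed in the table. */
const region_t *rpi3_find_region(uint32_t addr)
{
    for (size_t i = 0; i < sizeof rpi3_map / sizeof rpi3_map[0]; i++) {
        if (addr >= rpi3_map[i].base && addr < rpi3_map[i].end) {
            return &rpi3_map[i];
        }
    }
    return NULL;
}
```

Because each region's end is the next region's base, the table has no gaps or overlaps by construction, which is the property the diagram is meant to convey.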

driver1998 commented 5 years ago

It seems to be a driver compatibility issue on 17763; we might need to recompile the drivers with the 17763 WDK. There has been some success when replacing WppRecorder.sys with older versions.

thchi12 commented 5 years ago

Tried that and it seems useless. The BSoD still happens without replacing that sys file. BTW, that BSoD happens just before OOBE starts... the 2nd phase of setup should complete without problems.

pbatard commented 5 years ago

ATF is now fully sorted out, but one last hurdle I'm now realising we have is that the current proposal seems to use a modified version of the Raspberry Pi foundation logo.

First of all, I'm pretty sure this means at least the Logo driver will need to go to edk2-non-osi, since it contains assets whose use is governed by a license that is not compatible with the EDK2's (at the very least, as per their guidelines, the license needs to mention “Raspberry Pi is a trademark of the Raspberry Pi Foundation”).

Also, I'm quite certain that, even if moved to edk2-non-osi, the first thing the EDK2 people will want to get confirmation of is that we have the rights to use that logo.

Now, reading the official visual guidelines as well as the trademark rules, it looks like there might be 3 things that run contrary to what the Raspberry Pi Foundation requires:

  1. The ® (Registered Trademark) symbol does not appear on the logo (Logo restriction rule 1)
  2. We removed the leaf and fruit borders (Logo restriction rule 2).
  3. Our use of the logo could give the false impression that the product is endorsed by the Raspberry Pi Foundation.

Now, 1 is easy to fix, so I'll work on that. And we can assert that point 2 is just a consequence of having to use a black background (in other words, we did not technically remove elements — they just happen to end up being merged with the background we need). But as to point 3, I think we need some confirmation from the Raspberry Pi Foundation that they are okay with our use of the logo as part of an EDK2/UEFI project, even as they have had nothing to do with it.

I will therefore be e-mailing trademarks@raspberrypi.org to get their input regarding our planned use of the Raspberry Pi logo (as well as what they would like to see appear in the license text for the logo in the edk2-non-osi directory) before I submit the project for EDK2 inclusion.

PS: Oh, and it also looks to me like the .png and .pnm files are not needed in any way, so I will be removing them.

pbatard commented 5 years ago

Still waiting for trademarks@raspberrypi.org's input wrt the logo. I asked them for an update, and this was their reply:

Thanks for your email. Your email has been sent to our Director of Marketing who shall be responding to you very soon. They are currently in discussions with a number of the Trading team about your email and request. As soon as they have a response we shall be in touch.

Obviously, I can't submit anything to the EDK2 until they tell us whether they are okay or not with our proposed use of the logo.

andreiw commented 5 years ago

Sure, I understand, thanks for reaching out to the org to help resolve this.

A

pbatard commented 5 years ago

Good news: The Raspberry Pi Foundation have just replied and they are giving us a green light for the use of the logo. They will even produce a version for us that looks good on a black background:

Yes, you can use our logo, and we'll provide you with a version that's suitable to display against a black background (we're happy for you to use a version without the (R) symbol). I've asked our designers to let me have a suitable file, and I'll pass it on when they get back to me - they're usually extremely quick.

andreiw commented 5 years ago

Wow, fantastic news! Congrats!

A

pbatard commented 5 years ago

Got the logo now 😄:

logo

Apparently they decided to go with an all black and white one. Can't say I'm not missing the colours from the previous logo a little bit, but I gotta say that the new logo looks pretty sleek and it does grow on you. Plus, the logo being greyscale means we can reduce the size of the .bmp and probably the size it occupies in the firmware, which might come in handy.

Also they sent us the EPS file (see here), which means we can resize it to any dimensions we want and it'll still look good. For the time being, I decided to go with a 480px height.

With this, as well as additional cleanup prompted by the EDK2 validation tools (see this patch) I should hopefully be good to begin submitting a patchset on Monday.

andreiw commented 5 years ago

Neat! Great news!

A

pbatard commented 5 years ago

...and the latest EDK2 update just broke Raspberry Pi support, so I can't submit the patches. 😭

They literally pushed a change in the last couple of days that does away with gEmbeddedTokenSpaceGuid.PcdPrePiCpuMemorySize (see this and this as well as related commits that were applied on Nov 29th).

Obviously, this means that we must remove the references to PcdPrePiCpuMemorySize we have in the .dsc and Drivers/RaspberryPiPlatformLib/RaspberryPiPlatformLib.inf.

Except, when you do that and let ArmGetPhysicalAddressBits() be used instead, you get the following exception when booting the firmware:

HypDxe at 0x33CE0000-0x33D2FFFF
mPages at 0x33D02000-0x33D15FFF
HCR_EL2         0x8000002
CPTR_EL2        0x33FF
CNTHCTL_EL2     0x3
CNTVOFF_EL2     0x1C0AA95321000000
VPIDR_EL2       0x410FD034
VTTBR_EL2       0x6F83BE0113A0
VTCR_EL2        0x8000030A
SCTLR_EL2       0x30C5183D
ACTLR_EL2       0x0
TTBR0_EL2       0x3B3F8000
TCR_EL2         0x80823518
VBAR_EL2        0x3A701800
MAIR_EL2        0xFFBB4400
Remaining pages: 16
16384 of EL2 stack at 33D02000
ASSERT [HypDxe] /usr/src/edk2-platforms/Platform/Broadcom/Bcm283x/Drivers/HypDxe/HypDxe.c(181): X(State->Tcr, 0, 5) == (64 - 32)
Unexpected exception from EL2!
ESR 0x5600DBDB (EC 0x15 ISS 0xDBDB)
FAR = 0x7FDFBD7EFFFFFAAB

And indeed when instrumenting the code, I can see that X(State->Tcr, 0, 5) is 24 and not the expected 32.
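
For readers who want to check this against the register dump, here is a minimal sketch that decodes the `TCR_EL2` and `ESR` values printed in the log. `extract_bits()` is a hypothetical helper that is only assumed to behave like the `X()` macro used in HypDxe; the field positions (T0SZ in bits [5:0], TG0 in [15:14], PS in [18:16] for `TCR_EL2`, and EC in [31:26], IL in bit 25, ISS in [24:0] for `ESR`) are the architectural ones.

```c
#include <assert.h>
#include <stdint.h>

/* Extract bits [hi:lo] (inclusive) from a 64-bit value.
 * Hypothetical helper, assumed to be equivalent to HypDxe's X() macro. */
static inline uint64_t extract_bits(uint64_t value, unsigned lo, unsigned hi)
{
    assert(lo <= hi && hi < 64);
    unsigned width = hi - lo + 1;
    uint64_t mask = (width == 64) ? ~0ULL : ((1ULL << width) - 1);
    return (value >> lo) & mask;
}

typedef struct {
    unsigned t0sz;     /* TCR_EL2[5:0]: size offset of the region */
    unsigned tg0;      /* TCR_EL2[15:14]: 0 means 4 KiB granule */
    unsigned ps;       /* TCR_EL2[18:16]: physical address size encoding */
    unsigned va_bits;  /* 64 - T0SZ: width of the translated address range */
} tcr_el2_fields_t;

tcr_el2_fields_t decode_tcr_el2(uint64_t tcr)
{
    tcr_el2_fields_t f;
    f.t0sz    = (unsigned)extract_bits(tcr, 0, 5);
    f.tg0     = (unsigned)extract_bits(tcr, 14, 15);
    f.ps      = (unsigned)extract_bits(tcr, 16, 18);
    f.va_bits = 64 - f.t0sz;
    return f;
}

typedef struct {
    unsigned ec;   /* ESR[31:26]: exception class */
    unsigned il;   /* ESR[25]: instruction length bit */
    uint32_t iss;  /* ESR[24:0]: instruction specific syndrome */
} esr_fields_t;

esr_fields_t decode_esr(uint64_t esr)
{
    esr_fields_t f;
    f.ec  = (unsigned)extract_bits(esr, 26, 31);
    f.il  = (unsigned)extract_bits(esr, 25, 25);
    f.iss = (uint32_t)extract_bits(esr, 0, 24);
    return f;
}
```

Applied to the log, `decode_tcr_el2(0x80823518)` yields a T0SZ of 24 (a 40-bit range) with TG0 equal to 0, which is consistent with the failing ASSERT, and `decode_esr(0x5600DBDB)` yields EC 0x15 and ISS 0xDBDB, matching the values the firmware printed.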

I also tried commenting out the ASSERT() altogether, but the firmware simply freezes after going through a couple more pages.

I'm afraid I don't know enough about what HypDxe does to have any idea how to fix this right now, so I may need some help with that...

pbatard commented 5 years ago

I'm going to have to drop HypDxe from the EDK2 submission for now, as well as the related menu references and stuff. I just spent 2 days trying to figure out this MMU stuff (for the record, whoever designed this super convoluted ARMv8 MMU certainly deserves to be moved to a damp basement office... without windows). While I managed to get a DEBUG version of RPI_EFI.fd to boot with the code below, the RELEASE version still freezes, which is all the more frustrating as it means we have to figure out what's causing the issue completely in the dark:

STATIC EFI_STATUS
HypBuildPT(
  IN  CAPTURED_EL2_STATE *State
  )
{
  UINT64 *PL0;
  EFI_PHYSICAL_ADDRESS A = 0;
  EFI_PHYSICAL_ADDRESS E = BCM2836_SOC_REGISTERS +
    BCM2836_SOC_REGISTER_LENGTH;
  /*
   * Custom Page Table builder.
   *
   * T0SZ assumed 24-bit.
   * Granule 4K.
   */
  ASSERT (X(State->Tcr, 0, 5) == 24);
  ASSERT (X(State->Tcr, 14, 15) == 0);

  PL0 = (VOID *) HypMemAlloc(1);
  if (PL0 == NULL) {
    HLOG((HLOG_ERROR, "Couldn't alloc L0 table\n"));
    return EFI_OUT_OF_RESOURCES;
  }

  ZeroMem(PL0, EFI_PAGE_SIZE);

  while (A < E) {
    /*
     * PL1 covers 512GB (512 entries ^ (4-1) * 4K Granule)
     * PL2 covers 1GB   (512 entries ^ (4-2) * 4K Granule)
     */
    UINT64 *PL1;
    UINT64 *PL2;

    UINTN ix0 = VA_2_PL0_IX(A);
    UINTN ix1 = VA_2_PL1_IX(A);
    UINTN ix2 = VA_2_PL2_IX(A);

    PL1 = PTE_2_TAB(PL0[ix0]);
    if (PTE_2_TYPE(PL0[ix0]) != PTE_TYPE_TAB) {
      PL1 = (VOID *)HypMemAlloc(1);
      if (PL1 == NULL) {
        HLOG((HLOG_ERROR, "Couldn't alloc L1 table for %u\n", ix0));
        return EFI_OUT_OF_RESOURCES;
      }

      ZeroMem(PL1, EFI_PAGE_SIZE);
      PL0[ix0] = ((UINTN)PL1) | PTE_TYPE_TAB;
    }

    PL2 = PTE_2_TAB(PL1[ix1]);
    if (PTE_2_TYPE(PL1[ix1]) != PTE_TYPE_TAB) {
      PL2 = (VOID *) HypMemAlloc(1);
      if (PL2 == NULL) {
        HLOG((HLOG_ERROR, "Couldn't alloc L2 table for %u\n",
              ix1));
        return EFI_OUT_OF_RESOURCES;
      }

      ZeroMem(PL2, EFI_PAGE_SIZE);
      PL1[ix1] = ((UINTN) PL2) | PTE_TYPE_TAB;
    }

    PL2[ix2] = PTE_TYPE_BLOCK | PTE_RW | PTE_SH_INNER |
      PTE_AF | A;
    if (A >= BCM2836_SOC_REGISTERS) {
      PL2[ix2] |= PTE_ATTR_DEV;
    } else {
      PL2[ix2] |= PTE_ATTR_MEM;
    }

    A += SIZE_2MB;
  }

  HLOG((HLOG_INFO, "Setting page table root to 0x%lx\n",
        (UINT64) PL0));

  DSB_ISH();
  WriteSysReg(ttbr0_el2, PL0);
  ISB();

  asm volatile("tlbi alle2");
  DSB_ISH();
  ISB();

  HLOG((HLOG_INFO, "It lives!\n"));

  return EFI_SUCCESS;
}

My current understanding is that, since the new EDK2 settings force us to use a T0SZ of 24 (a 40-bit address space) instead of 32 (a 32-bit address space), translation now starts at Level 0 rather than Level 1, so we need to populate a Level 0 table to handle bit 39 of the Virtual Addresses. And indeed, once I added a PL0 table in HypBuildPT(), I could get that function to complete (at least in the DEBUG version of the firmware).
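
To make the relationship between T0SZ and the number of table levels concrete, here is a small sketch for the 4 KiB granule case, where each level resolves 9 address bits on top of a 12-bit page offset. The function names are hypothetical and only illustrate the arithmetic; they are not taken from HypDxe or EDK2.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT      12u  /* 4 KiB granule */
#define BITS_PER_LEVEL   9u  /* 512 entries per table */

/* Width, in bits, of the address range translated for a given T0SZ. */
unsigned va_bits_from_t0sz(unsigned t0sz)
{
    assert(t0sz >= 16 && t0sz <= 39);  /* range assumed valid for 4 KiB */
    return 64 - t0sz;
}

/* Number of table levels needed to resolve the bits above the page offset. */
static unsigned levels_needed(unsigned t0sz)
{
    unsigned bits = va_bits_from_t0sz(t0sz) - PAGE_SHIFT;
    return (bits + BITS_PER_LEVEL - 1) / BITS_PER_LEVEL;
}

/* Level (0..3) at which the table walk starts. */
unsigned start_level(unsigned t0sz)
{
    return 4 - levels_needed(t0sz);
}

/* Number of entries actually used in the root table. */
unsigned root_entries(unsigned t0sz)
{
    unsigned bits = va_bits_from_t0sz(t0sz) - PAGE_SHIFT;
    return 1u << (bits - (levels_needed(t0sz) - 1) * BITS_PER_LEVEL);
}

/* Index into the table at the given level for address va.
 * Level 0 uses bits [47:39], level 1 [38:30], level 2 [29:21],
 * level 3 [20:12]. */
unsigned table_index(uint64_t va, unsigned level)
{
    assert(level <= 3);
    unsigned shift = PAGE_SHIFT + (3 - level) * BITS_PER_LEVEL;
    return (unsigned)((va >> shift) & 0x1FFu);
}
```

With a T0SZ of 32 the walk starts at level 1 with a 4-entry root table, whereas a T0SZ of 24 moves the start to level 0 with a 2-entry root table selected by bit 39. For every address up to the 0x3F000000 I/O base shown in the memory map earlier, the level 0 and level 1 indices are both 0, so a single PL1 and a single PL2 table would be enough for that range.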

However, since I'm still quite fuzzy about what we're trying to achieve here, I'm not exactly confident that my modifications are enough, especially as RELEASE does freeze when HypDxe is present. I confirmed that it was okay when HypDxe was excluded, so we can tell that whatever issue we're facing is indeed with the HypDxe initialization code. I also haven't modified HypBuildS2PT(): I gave a try at starting at PL0 instead of PL1 in there too, but that made DEBUG bail out, so I didn't pursue it.

All this to say that, since I don't want to delay EDK2 submission for that much longer, and my understanding is that HypDxe is only a "nice to have" feature for Windows, I'm simply going to drop HypDxe and remove all related code for the time being.

Now, of course, if someone has a better idea on how we should properly fix HypDxe for a 40-bit Virtual Address space (which, if you don't want to have to use my repo, can probably be tested by increasing gEmbeddedTokenSpaceGuid.PcdPrePiCpuMemorySize in the .dsc), I'll take it! 😄

andreiw commented 5 years ago

Hi Pete,

Yikes. Okay.

So first of all, HypDxe is a craptastic "hypervisor", whose sole purpose in life is to patch Windows 10 builds that fail an ASSERT checking that they are not running on a Pi. It's the only way to boot some builds like 17134. Anyway, at some point, there was a nice gent from MS lurking in the big ol' Windows thread, and he mentioned that the Pi check got removed. That means that, starting with some build number, those builds can successfully boot in EL2 and without HypDxe. Because I haven't been plugged into the Windows events (as you noticed, the drivers suck, and that's a black hole I can't afford to be sucked into now ;-)), I have noooo idea what range corresponds to the good builds.

So if you go fetch the latest ADK / WinPE and that build boots fine without HypDxe, then we're all set, and HypDxe can be removed from whatever you're upstreaming. I mean it sure could still be useful, but clearly fixing it can be done out-of-band...

When HypDxe starts, it pushes UEFI into EL1, which is why it needs to create both page tables for itself (since it is resident after UEFI exits) and stage 2 tables (these are identity today, but were added to protect HypDxe from EL1 and to eventually support MMIO traps so I could emulate a GIC).

Anyway, I can take a look. What edk2 commit should I fast forward to, that introduced the T0SZ change?

P.S. Every half a year I end up writing, from scratch, page table manipulation logic. Every half a year it looks completely foreign to me, like some sort of alien hieroglyphics. Sigh. The Arm way has always been to push for simpler hardware at the expense of software complexity... I guess the good news is that at this point, Arm page table support is a SUPERSET of x86. It used to be completely disjoint... Well, it still could be, depending on how many software bits per PT you need :-(.

A

pbatard commented 5 years ago

whose sole purpose in life is to patch Windows 10 builds that fail an ASSERT checking that they are not running on a Pi.

Wow. I didn't realize that Microsoft had been so eager to defeat the very business model that made them super successful in the 80's (by making sure DOS and the early iterations of Windows could run on any x86 hardware and not just the IBM one). Glad to hear they came to their senses and removed that check.

It's the only way to boot some builds like 17134.

Okay. I haven't tested 17134 with the no HypDxe firmware since I overwrote my existing 17134 installation to play with 17763.0 (i.e. the first 1809 release, not the second one, a.k.a 17763.107, which I'm planning to test too once I am done removing HypDxe). The only problem with 17763 is that you need to replace WppRecorder.sys with an earlier version, such as 17134's, as it looks like 17763 BSOD's on boot if the system disk is sitting behind USB regardless of the architecture (i.e. I've seen the same thing happen with 17763.107 Windows To Go drives on x86_64 so I'm pretty sure this is an overall issue with whatever global modifications Microsoft carried out on the driver between 17134 and 17763, and not something that is specific to ARM64 or Pi).

I'll report on my findings with 17134 once I have finalized the planned EDK2 submission with HypDxe removed. I also think not having HypDxe will probably help with the EDK2 integration, as I'm pretty sure they would have had questions about it and possibly requested modifications. As you said, we can always get that component re-integrated at a later time, so the simpler we can make the initial Pi submission, the better.

When HypDxe starts, it pushes UEFI into EL1, which is why it needs to create both page tables for itself (since it is resident after UEFI exits) and stage 2 tables (these are identity today, but were added to protect HypDxe from EL1 and to eventually support MMIO traps so I could emulate a GIC).

Thanks for clarifying this. From looking at the code, I more or less suspected this was the aim, but it's nice to see it explained.

Anyway, I can take a look. What edk2 commit should I fast forward to, that introduced the T0SZ change?

That would be great. But don't worry too much about this if you have other things to do as, if we can make 17763 work without HypDxe, fixing that driver becomes much less of a priority.

Now the T0SZ change is a result of the series of commits that drops gEmbeddedTokenSpaceGuid.PcdPrePiCpuMemorySize. So if you FF to bcf2a9db you should be able to test that. But of course you'll need to remove the PCD references in the .fdf and .inf. Alternatively, I strongly suspect that you can replicate the issue in your current repo, without fast-forwarding anything, by simply setting gEmbeddedTokenSpaceGuid.PcdPrePiCpuMemorySize|40

What happens with the latest EDK2 then is that, instead of using PcdPrePiCpuMemorySize as defined by the platform, which of course we used to have set to 32, a call to ArmGetPhysicalAddressBits() (introduced in 95d04ebc) is being used instead to set the maximum address bit-range according to whether LPAE is enabled or not. And at least for the Pi 3, because LPAE is enabled by default, we get a bit-range of 40 instead of 32.

From there, in ArmPkg/Library/ArmMmuLib/AArch64/ArmMmuLibCore.c, a MaxAddress is computed from which T0SZ is derived and set (to 64 - 40 = 24 in our case), and this is the T0SZ we inherit in HypDxe.
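
The arithmetic in this chain can be sketched as follows. Note the assumptions: the function names are hypothetical and are not the EDK2 ones, and the sketch assumes that on AArch64 the address width is obtained by decoding the `PARange` field (bits [3:0]) of `ID_AA64MMFR0_EL1`, which is the architectural way to report it but is not something stated in this thread.

```c
#include <assert.h>
#include <stdint.h>

/* Decode an ID_AA64MMFR0_EL1.PARange encoding into an address width in
 * bits. Returns 0 for encodings that are not listed here. */
unsigned pa_bits_from_parange(unsigned parange)
{
    static const unsigned char widths[] = { 32, 36, 40, 42, 44, 48, 52 };
    return parange < sizeof widths ? widths[parange] : 0;
}

/* Highest address reachable with the given number of address bits. */
uint64_t max_address(unsigned bits)
{
    assert(bits >= 1 && bits <= 64);
    return bits == 64 ? UINT64_MAX : (1ULL << bits) - 1;
}

/* T0SZ is defined as 64 minus the width of the translated range. */
unsigned t0sz_from_bits(unsigned bits)
{
    assert(bits >= 1 && bits <= 64);
    return 64 - bits;
}
```

With a 32-bit setting this gives a maximum address of 0xFFFFFFFF and a T0SZ of 32, which is what HypDxe asserts on, whereas a 40-bit width gives 0xFFFFFFFFFF and a T0SZ of 24, the value seen in the exception log.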

every half a year I end up writing, from scratch, page table manipulation logic.

Oh man! I only spent a few hours on this and I already felt like I wanted to bang my head against a wall... I don't think I could bear being coerced into doing this kind of exercise twice a year... 😱

andreiw commented 5 years ago

Nah it was totally one of those “here be dragons” sanity checks. If they really wanted to break it, there are better ways than shipping low-level support for it, and an assertion check... remember, the Pi is completely outside the boundaries of what a normal Windows Arm system looks like, so it needs a lot of low-level support code explicit to the Pi, that really didn’t have to be part of any ARM64 build (after all, there was no 64-bit firmware or a 64-bit IoT product).

Anyway. Don’t bother with 17134 or equally museum-worthy builds. 17134 will not work without HypDxe, and it completely doesn’t matter - what’s important is that the latest build still boots.

Well. I never booted from USB because when I last tried, dwusb couldn’t deal with mass storage devices. Always installed and booted from SD.

Ok, thanks, will see what I can do. I kind of regret making the HypDxe MMU code so lazy now, and I’ve been thinking of reusing it elsewhere anyway...

A

pbatard commented 5 years ago

Well. I never booted from USB because when I last tried, dwusb couldn’t deal with mass storage devices. Always installed and booted from SD.

Same here.

I think I may have misremembered the Pi architecture and thought that the SD controller sat behind a USB bus like the network controller, and that, since I saw the same BSOD when booting Windows To Go (on a different arch), this had to do with USB. But if the SD controller does not sit behind a USB controller, then the issue with WppRecorder.sys is something even more fundamental that Microsoft will have to fix.

In other words, you do need to replace WppRecorder.sys with an older version when booting from SD.

driver1998 commented 5 years ago

I guess disabling WPP tracing in all the drivers we build may be a viable workaround. But that is far from ideal.

pbatard commented 5 years ago

Nah, the issue is not that our drivers are making WPP crash. The problem is that, unless you are running Windows from a regular SATA or SSD drive, the 17763 version of WPP will crash regardless. Again, I have seen this exact thing happen outside of the Pi/ARM64 realm, with a Windows To Go drive that I was trying to boot on an x64 Intel NUC, and that wasn't using any custom driver. And there again, replacing WPP with an earlier version fixed the BSOD. This is an issue that is being reported with 1809 all over the place (e.g. here - from what I gather those guys' "fix", which they don't document, is also to replace WPP with an earlier version), but it flew under the radar because there were even bigger issues with 1809...

Hopefully, Microsoft will have fixed WPP in 1903. In the meantime, it looks like if you want to run 1809 from something other than a SATA, M.2 or PCIe drive, you must replace WPP with the 1803 version, which works fine.

mariobalanica commented 5 years ago

This firmware does not run as is on all BCM283x SoCs. It only supports the BCM2837 chip (which has ARMv8 cores).

Also, I think it would be a good idea to replace all BCM2836 (used on the first Pi 2 model, currently unsupported) references with BCM2837 (Pi 3 B and later models).

pbatard commented 5 years ago

This firmware does not run as is on all BCM283x SoCs.

Never said it would. I deliberately used BCM283x in case we expand into the Pi2 (so that we can reuse stuff from the BCM283x/ directory rather than duplicate it into a new one), but for now, the firmware is only compatible with BCM2837.

I think it would be a good idea to replace all BCM2836 references with BCM2837

I'm not planning to do that at this stage, in case we reuse this code for the Pi2. I did replace all the PI2 references in the code with PI3 though.

If the EDK2 people complain, I may consider it, but until then, it seems like a waste of time.

By the way, just to provide an update, I think I now have the finalized version of what I am planning to submit in https://github.com/pbatard/RaspberryPiPkg (with HypDxe removed and some additional cleanup). If anyone feels like testing, I also uploaded 2 RPI_EFI.fd binaries, built from this and the very latest EDK2 in a v18.12.06 Release.

I tested the release binary above against a new Windows 10 1809 install (17763.107, from the retail en_windows_10_consumer_edition_version_1809_updated_sept_2018_arm64_dvd_699e6a11.iso), after replacing C:\Windows\System32\Drivers\WppRecorder.sys, and everything seems to work fine, so I think I am finally ready to submit EDK2 patches.

andreiw commented 5 years ago

Very cool, thanks for your hard work!

Any indication/interest from the Pi foundation on adopting UEFI and ATF in a more official manner?

A

pbatard commented 5 years ago

Any indication/interest from the Pi foundation on adopting UEFI and ATF in a more official manner?

I only dealt with the people who handle trademark/design matters (and who, I'm afraid to report from the exchanges we've had, don't seem to be very technically minded, even if they have been of great help to us), and even though I tried to explain the benefits that having a UEFI firmware could bring to them, I'm not sure they quite understood what this was all about.

I think that, once we have the platform integrated in the EDK2, we should try to reach out to the technical side of the Pi Foundation, to see what they have to say about it. Or, preferably, we should first work with the Debian and Ubuntu people so that their vanilla distros install properly out of the box (with all the features), and then reach out to the Pi Foundation to tell them that Pi users might get a better Linux installation experience (or at least a more "PC-like" one) through the use of the (hopefully official by then) EDK2 firmware. Anyway, that's still some time away at this stage.

On a side note, since not having networking on Windows sucks, I'm just going to point out that I opened a support case with Microchip about one week ago, to ask them if/when they were planning to include an ARM64 version of the LAN95xx & LAN97xx drivers in their OneCore package (which, as far as I understand is all we'd need to get networking on the B+).

They have finally gotten back to me today with the following:

We do have plans to release in the future ARM64 drivers. the schedule will be a business decision based on the size of the opportunities we can see.

After that they asked about the kind of application and expected number of units we had in mind, since I didn't mention what it was really for when I created the case. So I tried to bring the point home that, since this relates to running Windows 10 on potentially millions of units that have already shipped, there exists a large demand for it, and that we'd like to see these drivers be released sooner rather than later... I'll keep you posted on subsequent feedback.

pbatard commented 5 years ago

Patches to the EDK2 have been submitted!

They should appear in https://lists.01.org/pipermail/edk2-devel/2018-December/thread.html under a thread titled [PATCH v1 edk2-platfoms #/2] Platform/Broadcom: Add Raspberry Pi 3 support when the archive refreshes with latest content (and now I notice that I added a typo to 'platform' in the subject - D'oh!).

On the other hand, Microchip are driving a hard bargain with regards to the release of ARM64 network drivers. They seem to be wanting to see immediate profit to decide whether they want to release an ARM64 version in 3 months... or 12. 😭

I've done my best to try to convince them that, due to the interest in the Pi3 as a Windows 10 platform, there is profit to be had by releasing the drivers sooner, but I'm not sure I'm making much headway. Maybe a few more of us need to get in touch with Microchip support, to make them understand that there is a real demand there...

mariobalanica commented 5 years ago

I've also contacted them, and they've replied after about 2 months:

Please find the Windows Driver on our Product Page at : http://www.microchip.com/SWLibraryWeb/product.aspx?product=OBJ-LAN95xx-WINDOWS After selecting Agree and Confirm, the Download can be selected hope this helps

I've told them once again that we need an ARM64 driver, not the driver built for Windows 10 IoT Core, and they've said:

hi , it does not look like there is such a driver available, but let me check again.

That's the end of the conversation, I guess.

The Ethernet driver would never work without a stable USB driver, which we don't have at the moment.

I've contacted MCCI (dwchsotg USB driver developers):

Sorry to say, we're not able to build this driver for ARM64; our build system needs to be updated, and as we are not funded by anybody to do this work, it's low in the queue. Wish we could get MSFT interested in supporting this.

That means we won't get any USB driver anytime soon.

As for the WLAN driver, I've sent a message to Cypress (which owns the Broadcom Wireless IoT subdivision), but no luck:

Unfortunately, something like this would not be supported here on the Cypress IoT Community. Have you approached RPi to provide the same? They are a customer of ours so they could potentially submit the request into us through their local support channel.

I highly doubt the RPi Foundation is going to do that, but it is worth a try.

pbatard commented 5 years ago

@mariobalanica, from what they told me yesterday Microchip does have plans to release an ARM64 version of their network driver. They just need to see an incentive to decide whether they are going to release it sooner (< 3 months) or later (12 months or more).

driver1998 commented 5 years ago

I guess the USB driver is the higher priority right now. Of course, glad to hear that Microchip has a plan.