ms-iot / imx-iotcore

Windows 10 IoT Core BSP for NXP i.MX Processors
MIT License

Completed CoreExitBootServices then hanging #20

Closed NovTechEng closed 5 years ago

NovTechEng commented 5 years ago

Hi, I am trying to get Novtech's iMX7 Meerkat96 board up and running on Win10IoT Core. I have done my best to follow the iMX porting guide, and as far as I can tell, the board boots all the way up through the CoreExitBootServices() function in the UEFI, but then crashes.

I suspect the UEFI is close to handing control over to Windows, but the handoff is failing for some reason.

I was trying to track down exactly where the failure occurs but was unable to figure out what is supposed to run after CoreExitBootServices(). If anyone can give me an idea of where to look to solve this issue that would be great. If I need to supply more information please let me know.

Thanks.

Attachments: VerboseLog.txt, Log.txt

christopherco commented 5 years ago

Hi, There was a very recent tianocore edk2 issue that causes a similar symptom (resolved in this commit: https://github.com/tianocore/edk2/commit/eb76b76218d5bac867414e2ff6dd09c6e7c700dd).

Which commit IDs are you using for tianocore/edk2 and ms-iot/imx-edk2-platforms? And can you try the latest edk2 master just to be sure you are not hitting the same issue.

NovTechEng commented 5 years ago

Alright, I went ahead and pulled the latest code for both edk2 and edk2 platforms. I still seem to be having the same problem. Here are the commits I am currently using:

commit for edk2: 83463154afc699c8116a42df9184b034056c7b33
commit for edk2 platforms: 8d55f2e88e4a5dda1edbc152bdd7b937907bb9fb

NovTechEng commented 5 years ago

Correct me if I am on the wrong track, but I was searching around in the UEFI shell and found this.

Shell> bcfg boot dump -v
Option: 00. Variable: Boot0000
  Desc    - UiApp
  DevPath - MemoryMapped(0xB,0x9FAC3000,0x9FD49F27)/FvFile(462CAA21-7614-4503-83
6E-8AB6F4662331)
  Optional- N
Option: 01. Variable: Boot0001
  Desc    - UEFI Misc Device
  DevPath - VenHw(AAFB8DAA-7340-43AC-8D49-0CCE14812489,01000000)/SD(0x0)
  Optional- Y
  00000000: 4E AC 08 81 11 9F 59 4D-85 0E E2 1A 52 2C 59 B2  *N.....YM....R,Y.*
Option: 02. Variable: Boot0002
  Desc    - UEFI Shell
  DevPath - MemoryMapped(0xB,0x9FAC3000,0x9FD49F27)/FvFile(7C04A583-9E3E-4F1C-AD
65-E05268D0B4D1)
  Optional- N

Boot0001 doesn't look right to me.

christopherco commented 5 years ago

Hmm, your Boot0001 seems to be the same as on our iMX7 Compulab reference.

Shell> bcfg boot dump -v
Option: 00. Variable: Boot0000
  Desc    - UiApp
  DevPath - MemoryMapped(0xB,0xBFA75000,0xBFD22867)/FvFile(462CAA21-7614-4503-83
6E-8AB6F4662331)
  Optional- N
Option: 01. Variable: Boot0001
  Desc    - UEFI Misc Device
  DevPath - VenHw(AAFB8DAA-7340-43AC-8D49-0CCE14812489,01000000)/SD(0x0)
  Optional- Y
  00000000: 4E AC 08 81 11 9F 59 4D-85 0E E2 1A 52 2C 59 B2  *N.....YM....R,Y.*
Option: 02. Variable: Boot0002
  Desc    - UEFI Misc Device 2
  DevPath - VenHw(AAFB8DAA-7340-43AC-8D49-0CCE14812489,03000000)/eMMC(0x0)
  Optional- Y
  00000000: 4E AC 08 81 11 9F 59 4D-85 0E E2 1A 52 2C 59 B2  *N.....YM....R,Y.*
Option: 03. Variable: Boot0003
  Desc    - UEFI Shell
  DevPath - MemoryMapped(0xB,0xBFA75000,0xBFD22867)/FvFile(7C04A583-9E3E-4F1C-AD
65-E05268D0B4D1)
  Optional- N

It looks like you do not have an eMMC on board. Did you make sure to configure your UEFI build to disable security during build? https://github.com/ms-iot/imx-iotcore/blob/8417fde5bf3b79cda927bfcb7d84cf5b15d337dc/build/firmware/ClSomImx7_iMX7D_1GB/Makefile#L7

Our security TA requires that RPMB is present, which requires eMMC. If missing, boot will stall.

NovTechEng commented 5 years ago

Yes, I have that line in my Makefile. I was stuck on that issue for a little while, but since adding that line I seem to have moved past it. Could the lack of eMMC still be causing an issue?

jordanrh1 commented 5 years ago

It looks like it's loading bootmgr. Can you try attaching the boot debugger and see how far it gets in bootmgr?

Insert the SD card into your PC and enable boot debugging (where X is the drive letter of the EFIESP partition):

bcdedit /store X:\efi\microsoft\boot\bcd /bootdebug {bootmgr} on
bcdedit /store X:\efi\microsoft\boot\bcd /bootdebug {default} on

Then, start WinDBG:

windbg -k com:port=COM4,baud=115200

In WinDBG, hit Ctrl+Alt+K to cycle initial break.

Then, power on the board. If bootmgr does in fact start, you should see the debugger break in. You must set giMXPlatformTokenSpaceGuid.PcdKdUartInstance correctly in your DSC file so that Windows uses the right UART instance.

https://docs.microsoft.com/en-us/windows-hardware/drivers/devtest/bcdedit--bootdebug
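The two bcdedit commands above differ only in the boot entry they target, so they can be generated for whatever drive letter the EFIESP partition is mounted on. A minimal sketch, assuming a hypothetical helper named `bootdebug_commands` (it is not part of any BSP tooling and only builds the command strings; it does not run them):

```python
# Hypothetical helper: builds the bcdedit command lines quoted in the comment
# above for a given EFIESP drive letter. It only returns strings.
def bootdebug_commands(drive_letter, entries=("{bootmgr}", "{default}")):
    # Path to the BCD store on the EFIESP partition of the SD card.
    store = rf"{drive_letter}:\efi\microsoft\boot\bcd"
    return [f"bcdedit /store {store} /bootdebug {entry} on" for entry in entries]


if __name__ == "__main__":
    for command in bootdebug_commands("X"):
        print(command)
```

Passing a different `entries` tuple covers other boot applications that are identified by a GUID in the BCD store.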

NovTechEng commented 5 years ago

Alright I tried using the debugger and I am still getting no output. The debugger displays "Debuggee not connected." As far as I can tell I have everything set up correctly in the DSC file but I will attach it to this comment. Is the "Debuggee not connected" due to a mistake in my setup somehow or an indication that bootmgr is not starting?

Meerkat96_iMX7D_512M.dsc.txt

jordanrh1 commented 5 years ago

It looks like your console UART is UART6. 0x30A80000 is the base address of UART6.

  giMXPlatformTokenSpaceGuid.PcdSerialRegisterBase|0x30A80000   

It also looks like we never implemented support for PcdKdUartInstance on IMX7. It was hardcoded to use UART1, so PcdKdUartInstance has no effect. I've sent a pull request to fix this. #ms-iot/imx-edk2-platforms/pull/12

Which UART are you trying to use for the kernel debugger? The way you have it set up now, console output will go over UART6, and the kernel debugger connection will use UART1. If you're using UART6 for the console, UART1 might not be initialized correctly. We rely on u-boot to initialize the UART (set baud rate and muxing). If there's no code in u-boot to initialize UART1, it could explain why you're not seeing anything.

Since we know that UART6 is being set up correctly, I would change PcdKdUartInstance to 6. Be sure to update imx-edk2-platforms first to get #ms-iot/imx-edk2-platforms/pull/12.
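The mismatch described here (console on one UART, kernel debugger on another) can be checked mechanically by comparing the DSC values. A rough sketch, assuming the i.MX7 UART base addresses listed in the table below; only the UART6 value is confirmed in this thread, so check the others against the reference manual. The function names are hypothetical:

```python
# i.MX7 UART base addresses. Only UART6 (0x30A80000) is confirmed in this
# thread; the others are assumptions to verify against the reference manual.
IMX7_UART_BASES = {
    1: 0x30860000,
    2: 0x30890000,
    3: 0x30880000,
    4: 0x30A60000,
    5: 0x30A70000,
    6: 0x30A80000,
    7: 0x30A90000,
}


def uart_instance_for_base(base):
    """Return the UART instance number for a base address, or None if unknown."""
    for instance, address in IMX7_UART_BASES.items():
        if address == base:
            return instance
    return None


def check_uart_config(serial_register_base, kd_uart_instance):
    """Compare PcdSerialRegisterBase with PcdKdUartInstance and describe the result."""
    console = uart_instance_for_base(serial_register_base)
    if console is None:
        return f"0x{serial_register_base:08X} is not a known UART base address"
    if console == kd_uart_instance:
        return f"console and kernel debugger both use UART{console}"
    return (f"console uses UART{console}, kernel debugger uses "
            f"UART{kd_uart_instance}; make sure u-boot initializes both")


if __name__ == "__main__":
    print(check_uart_config(0x30A80000, 1))
    print(check_uart_config(0x30A80000, 6))
```

A lookup that returns None is also a cheap way to catch a mistyped base address, such as one with a dropped digit.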

Sorry for the inconvenience.

NovTechEng commented 5 years ago

Alright, that helped. I am now running everything through UART1, and I get the following output from the debugger: windbgoutput.txt

jordanrh1 commented 5 years ago

Awesome! It broke into the debugger as expected. Now type 'g' in the windbg console and see what happens. There are basically 3 possible outcomes:

  1. It runs successfully. The next thing to debug is winload.
  2. It crashes. The debugger will break in and allow you to inspect the failure.
  3. Silence. It's failing silently somewhere in bootmgr. You have to narrow down where it's failing by setting breakpoints and stepping through.

NovTechEng commented 5 years ago

It looks like a silent failure right now. I will continue looking into it. The board designer believes that the problem could be caused by the Meerkat's low amount of memory. He was wondering if there was a way he could have a call with you. @jordanrh1 Thanks

NovTechEng commented 5 years ago

Stepping through the code, it seems to crash in the next step after BOOTARM!BmMain+0x7ee

jordanrh1 commented 5 years ago

It looks like BmMain+0x7ee corresponds to a call to BmpLaunchBootEntry. The first boot entry that bootmgr launches is mobilestartup.efi. Mobilestartup.efi runs, then returns to bootmgr. Bootmgr then launches winload. Mobilestartup.efi writes data back to disk which is a potential source for errors.

Let’s try enabling boot debugging for mobilestartup.efi and see if it gets that far.

bcdedit /store k:\efi\Microsoft\boot\bcd /bootdebug {01de5a27-8705-40db-bad6-96fa5187d4a6} on

jordanrh1 commented 5 years ago

Also, it would be helpful if you could send pull requests for each of the repositories you've made changes in (u-boot, imx-edk2-platforms, optee_os, imx-iotcore).

NovTechEng commented 5 years ago

It is now crashing after mobilestartup!MobileStartupMain+0x236

NovTechEng commented 5 years ago

OK, I think it is running successfully through mobilestartup and winload. It appears to hang in nt.

NovTechEng commented 5 years ago

The last section of my debugger output (can't seem to get past this point):

***** Path validation summary **
Response Time (ms) Location
Deferred srv
Symbol search path is: srv
Executable search path is:
Windows 10 Kernel Version 17763 MP (1 procs) Free ARM (NT) Thumb-2
Built by: 17763.107.armfre.rs5_release_svc_prod2.181026-1406
Machine Name:
Kernel base = 0x81691000 PsLoadedModuleList = 0x818b2758
System Uptime: 0 days 0:00:00.000
nt!DebugService2+0x4:
816be9b0 defe __debugbreak
kd> p
nt!DebugService2+0x6:
816be9b2 4770 bx lr
kd> g
IOINIT: Built-in driver \Driver\sacdrv failed to initialize with status - 0xC0000037
KDTARGET: Refreshing KD connection
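The status value in the IOINIT line is an NTSTATUS code, which packs a severity, a facility, and a code into 32 bits. A small sketch of how to split it apart, assuming the standard NTSTATUS bit layout; `decode_ntstatus` is a hypothetical helper, not a WinDBG or Windows API:

```python
# Hypothetical helper that splits an NTSTATUS value into its bit fields:
# bits 31-30 severity, bit 29 customer flag, bits 27-16 facility, bits 15-0 code.
SEVERITY_NAMES = ("success", "informational", "warning", "error")


def decode_ntstatus(status):
    return {
        "severity": SEVERITY_NAMES[(status >> 30) & 0x3],
        "customer": bool((status >> 29) & 0x1),
        "facility": (status >> 16) & 0xFFF,
        "code": status & 0xFFFF,
    }


if __name__ == "__main__":
    fields = decode_ntstatus(0xC0000037)
    print(fields["severity"], fields["facility"], hex(fields["code"]))
```

For 0xC0000037 this gives an error severity in facility 0 with code 0x37, which appears to correspond to STATUS_PORT_DISCONNECTED in ntstatus.h. Whether the sacdrv failure matters for this hang is not established in the thread.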

jordanrh1 commented 5 years ago

That's great, it is in fact reaching the NT kernel.

Did you make any changes or did it just work randomly? If it's random this suggests it could be an issue with the memory map.

NovTechEng commented 5 years ago

I think I am now making it this far because I recently rebuilt the FFU from the most recent Compulab code (it was previously pretty out of date).

NovTechEng commented 5 years ago

Alright, I removed some entries from the ACPI table. In WinDBG I am now in nt!KiIdleLoop. Does this mean I have completed the boot process?

jordanrh1 commented 5 years ago

Possibly. What is the output of

!process 0 1

NovTechEng commented 5 years ago

Here it is: https://pastebin.com/SaCZzdvh

I did cut the output short, but I figure that should be fine.

jordanrh1 commented 5 years ago

Yep, looks like it booted. IotCoreDefaultApp.exe is running.

The next step is to add devices one-by-one into the ACPI tables.

If the system becomes unresponsive (unable to break in with debugger), that usually means the system has tried to access a register in a clock gated block. In U-Boot you should ungate the clocks you need.

NovTechEng commented 5 years ago

Awesome! I will go ahead and mark this issue as closed. Thanks for all the help.