centipeda / balena-zynq7000

BalenaOS build for Zynq devices.

Building BOOT.bin for the Mars PM3 #1

Open centipeda opened 4 years ago

centipeda commented 4 years ago

I've tentatively decided to try to document issues with this project on this repository, so they're easily accessible to anyone who wants to read about them.

Issue Summary

The gist: I can't seem to produce a BOOT.bin -- the file that the Mars PM3 expects to see in its boot partition to boot successfully -- that the Mars will boot with.

Details

The BOOT.bin file is the Mars's preferred (as far as I can tell) boot image. It's generated with Xilinx's bootgen tool, which can be invoked from the Vivado design tool or as a command line tool. The user guide for the 2019.2 version is available here.

Bootgen takes the files used to boot and packages them into a single binary file, according to a configuration file usually named with the *.bif extension. Most of this work in this build is handled by the xilinx-bootbin package provided by the meta-xilinx-tools layer, so I'll concentrate on the files that are packed into BOOT.bin.
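For reference, here is a minimal sketch of what a bootgen input file and invocation can look like for a Zynq-7000 part. The file names (fsbl.elf, system.bit, u-boot.elf, boot.bif) are placeholders for illustration and are not the exact inputs used in this build; the -image, -arch, -o and -w options are the ones described in the bootgen user guide.

```shell
#!/bin/sh
# Sketch: write a minimal BIF file and run bootgen on it.
# All file names below are placeholders, not the actual files from this build.
cat > boot.bif <<'EOF'
the_ROM_image:
{
    [bootloader] fsbl.elf
    system.bit
    u-boot.elf
}
EOF

# Only call bootgen if it is installed and every input file exists.
if command -v bootgen >/dev/null 2>&1 &&
   [ -f fsbl.elf ] && [ -f system.bit ] && [ -f u-boot.elf ]; then
    # -w on overwrites an existing BOOT.bin
    bootgen -image boot.bif -arch zynq -o BOOT.bin -w on
else
    echo "bootgen or input files not found; wrote boot.bif only"
fi
```

The [bootloader] attribute marks the first-stage bootloader; the remaining entries are loaded in the order they are listed.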

The last working BOOT.bin I have was generated with PetaLinux. There were three files in the .bif configuration file:

Kernel panic: not syncing - attempted to kill init!

Of note is this line:

This means that you are lacking the Balena u-boot integration patches. If you do have these patches applied then it means that the u-boot doesn’t run the Balena commands.

What I took away from this is that Balena makes certain patches to U-Boot that it won't work without, so it's not possible to take a prebuilt u-boot.elf file from another source. I think this is supported by this line in the u-boot.bbappend file in the balena-raspberrypi repository (the official Balena repository for the Raspberry Pi):

inherit u-boot-resin

since this means there are some U-Boot changes made by a recipe in Balena... somewhere. On second thought, I should definitely check whether these changes are Raspberry Pi-specific, not being applied to the right packages, or even breaking the build for the Zynq.

In any case, since the build also generates a u-boot.elf file, I used this to create a new BOOT.bin, but the Mars won't boot -- there's no output on the serial line, even though the light on the Zynq seems to be flashing in an aspirational way. That's where I'm currently stuck, since I can't tell why the boot is failing -- not even U-Boot seems to run, but that may just be the serial line not being configured correctly.

The next move on this issue may be to revisit the original kernel panic I was getting, and to see if I can find the root of that issue, or if there are some details with the FSBL that change.

centipeda commented 4 years ago

Balena does indeed apply special patches to U-Boot here in meta-balena-common, so it looks like these need to be applied for it to work properly. These changes live in a u-boot_%.bbappend, though, so they should be applied when u-boot is built by Yocto anyway.

Checking on the build again, it looks like building resin-image isn't also actively building u-boot.elf, which makes me think that Yocto isn't seeing U-Boot as a dependency of Balena.

Trying to directly do

bitbake u-boot

results in

ERROR: Nothing PROVIDES 'u-boot'
u-boot was skipped: PREFERRED_PROVIDER_virtual/bootloader set to u-boot-xlnx, not u-boot     

I think I assumed u-boot-xlnx would just resolve down to u-boot as the build went on, but maybe that's not the case and PREFERRED_PROVIDER_virtual/bootloader needs to be set to u-boot in order for Balena to apply its patches properly.
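A quick way to check which recipe is actually selected, and which .bbappend files attach to which U-Boot recipes, might look like the sketch below. The helper name provider_from_env is made up for this example; bitbake -e and bitbake-layers show-appends are standard commands, but they only work from an initialized build environment.

```shell
#!/bin/sh
# Extract the bootloader provider from `bitbake -e` style output on stdin.
provider_from_env() {
    sed -n 's|^PREFERRED_PROVIDER_virtual/bootloader="\(.*\)"$|\1|p'
}

if command -v bitbake >/dev/null 2>&1; then
    # Which recipe is selected as virtual/bootloader?
    bitbake -e | provider_from_env
    # Which .bbappend files are attached to the U-Boot recipes?
    bitbake-layers show-appends | grep -A 3 'u-boot'
else
    echo "bitbake not on PATH; source the build environment first"
fi
```

If the meta-balena u-boot_%.bbappend is listed only under a recipe named u-boot and not under u-boot-xlnx, the Balena patches are not reaching the bootloader that is actually built.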

centipeda commented 4 years ago

More oddities:

It looks like the Yocto packages u-boot and u-boot-xlnx are entirely different. The former is provided by Poky, while the latter is maintained by Xilinx and is built from Xilinx's own copy of the U-Boot source. Either can be set as PREFERRED_PROVIDER_virtual/bootloader, but they build from different source trees. I assume there is a good reason Xilinx maintains its own copy of the U-Boot source, possibly something to do with integrating the Zynq's FPGA at a low level, so it might be a bad idea to just swap one for the other.

The problem looks to be that the Balena patches mentioned earlier are applied to the u-boot package (which makes sense), but u-boot-xlnx is the package that actually provides the u-boot.elf file in this build. Thus, the Balena patches were never applied, since PREFERRED_PROVIDER_virtual/bootloader has been set to u-boot-xlnx in marspm3-zynq7.conf for most of the time since that file was created.

The question here, then, is why there's no serial output on the Mars even though the build is effectively using the Xilinx U-Boot code without any modification. My guess is that the device tree isn't being properly sent to U-Boot during compilation -- i.e., the device node that tells U-Boot where the serial port is isn't being set, so it doesn't know to communicate on that port. The next step will probably be investigating u-boot-xlnx_%.bbappend in meta-marspm3-bsp to see if I'm correctly setting the device tree.

centipeda commented 4 years ago

Verified the FSBL (again) with a working PetaLinux build for the Mars. Important issue -- the PetaLinux build would not succeed with the latest revision of the Xilinx SDK: the build required revision 2017.4. Next step: change the branches for meta-xilinx and meta-xilinx-tools to v2017.4.

centipeda commented 4 years ago

After trying this out... it looks like the old releases of meta-xilinx* layers were also based off of old releases of the Yocto source. Downgrading to these versions would require downgrading Yocto to an older version... which won't work, because it would break compatibility with other layers, most notably meta-balena itself.

I have determined that I probably wasn't setting the device tree correctly after all. There are two places the device tree needs to be set -- one for the kernel, which is probably set by KERNEL_DEVICETREE, and one for U-Boot, so we can get serial output when it runs through U-Boot, making it infinitely easier to debug Balena itself. As far as I can tell, the latter can be accomplished by patching the U-Boot source itself, as recommended here. The other device tree files look to be included in arch/arm/dts in the Xilinx U-Boot source, so that would probably be the place to include them.

My only hangup is that I'm unclear on how exactly the entire device tree is generated within PetaLinux, since it seems to have its own method involving different sets of files. I have a system-user.dtsi which contains a few definitions needed to boot, but it also references a system-conf.dtsi, and system-user.dtsi is itself referenced by another device tree file in the PetaLinux build.

It might be the case that the entire tree is saved to a single place somewhere, or that some of this behavior is also in U-Boot. The meta-xilinx device-tree recipe seems like it could be helpful, but that might just be for the kernel device tree. More investigation required.

centipeda commented 4 years ago

After boiling down the problem some more, I decided to remove all parts of the build that didn't directly relate to the task -- building u-boot.elf -- and tried to just build a working copy of u-boot.elf from the U-Boot source code, then package it into a BOOT.BIN file with the Xilinx bootgen tool. If that succeeded, I would attempt to incorporate this process into the Yocto build so the BalenaOS patches could be applied.

This took me down a relatively lengthy path, but the upshot is this: I realized that a working copy of U-Boot for the Enclustra Mars modules already existed -- Enclustra maintains a copy in their repository right on GitHub. When I used that copy of the U-Boot source, I was able to build a copy of u-boot.elf, build it into a BOOT.BIN file, and boot the Mars PM3 entirely outside of Petalinux.

I'm now working on incorporating this into a new recipe called u-boot-enclustra in the Yocto build, which uses the Enclustra U-Boot source instead of the upstream U-Boot source or Xilinx's u-boot-xlnx. The current issue is that the BalenaOS patches fail to apply to the Enclustra U-Boot source.

I'll record the long version of this in case someone (or me) almost takes this trip again:

Building U-Boot in the way I described, from the U-Boot source itself (more specifically, Xilinx's copy on GitHub) would mean porting U-Boot to new hardware. I found a guide to the basics of this process at this link. Among other things, it would require defining:

This is a lot, especially for someone without a full understanding of U-Boot, but the idea was that the working Petalinux build should necessarily have the information required to successfully build U-Boot. I was able to find some of this information by looking at the Xilinx guide to building U-Boot for Zynq -- in a Petalinux build, the device tree blob was stored in

<plnx-proj-root>/components/plnx_workspace/device-tree/device-tree-generation/plnx_arm-system.dtb

which could be decompiled into a device tree source file with dtc. That still left the board_defconfig file and all the header files, but I just attempted to use the default Zynq header files and definitions instead. This did not work, likely for a variety of reasons I do not understand.
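The dtc step can be sketched as follows. The decompile_dtb helper and the tiny demo tree are made up for illustration so the round trip can be tried without a PetaLinux project; the -I, -O and -o options are standard dtc options.

```shell
#!/bin/sh
# Decompile a device tree blob into device tree source with dtc.
decompile_dtb() {
    # $1: input .dtb, $2: output .dts
    dtc -I dtb -O dts -o "$2" "$1"
}

status=skipped
if command -v dtc >/dev/null 2>&1; then
    status=failed
    work=$(mktemp -d)
    # Tiny stand-in tree, compiled to a blob and then decompiled again.
    printf '/dts-v1/;\n/ {\n\tmodel = "demo-board";\n};\n' > "$work/demo.dts"
    dtc -I dts -O dtb -o "$work/demo.dtb" "$work/demo.dts" &&
        decompile_dtb "$work/demo.dtb" "$work/roundtrip.dts" &&
        grep -q 'demo-board' "$work/roundtrip.dts" &&
        status=ok
fi
echo "dtc round trip: $status"
```

For the real blob, the same helper would be pointed at plnx_arm-system.dtb under the PetaLinux project root given above, with an output name of your choosing.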

After spending a while attempting to copy the right hardware header files from the Petalinux build, I tried to see if anyone had already created a working U-Boot configuration for the Mars modules.

First I ran into this guide which showed how to build for the Mars ZX3/PM3 (which I figured was close enough for the attempt) that included its own copy of the U-Boot source with a set of configuration files defined for the PM3/ZX3. I couldn't get the build to succeed, though, due to (what I think were) gcc versioning errors. The last update to the code was also in 2013, and I wasn't sure if the U-Boot source had changed enough in that time to make the BalenaOS patches difficult to apply.

At this point, I'd passed over Enclustra's source code for the Yocto build, since they only provided the Enclustra Build Environment for building against their hardware, which seemed mostly incompatible with Yocto, and Enclustra support didn't have a lot in the way of assistance the last time I emailed them about help with a Yocto build. They do, however, have a U-Boot source repo that has a working configuration for the Mars ZX2 (I don't know why it took me this long to consider this).

After cloning and entering this repo, I was able to do

export ARCH=arm
export CROSS_COMPILE=arm-linux-gnueabi-

to tell U-Boot to look for sources for ARM-architecture boards and to compile with the ARM cross-compiler, respectively. Then, I searched in the configs/ directory for configuration files that represented boards that this copy of U-Boot knew how to build for. I found a file named zynq_mars_zx2_defconfig, and decided to try to build with that configuration. I then did

make zynq_mars_zx2_defconfig

to create the .config file needed to prepare U-Boot to build for the Mars. (I believe make zynq_mars_zx2_config works exactly the same way, and this is the way it's used in most of the Yocto recipes I've seen, where the U-Boot recipe sets the UBOOT_MACHINE variable to use $board_config instead of $board_defconfig.)

Finally, I ran

make

to build the u-boot.elf file.

From there, I used the Xilinx bootgen tool to create a BOOT.BIN, as described in the initial issue description, and loaded it onto an SD card on the board, which I was monitoring for serial output. Success!

U-Boot 2019.01-g19f66deb68 (May 27 2020 - 05:17:51 -0400)

DRAM:  ECC disabled 512 MiB
MMC:   mmc@e0100000: 0
In:    serial@e0000000
Out:   serial@e0000000
Err:   serial@e0000000
SF: Detected s25fl512s_256k with page size 256 Bytes, erase size 256 KiB, total 64 MiB
Net:   ZYNQ GEM: e000b000, phyaddr 3, interface rgmii-id
eth0: ethernet@e000b000
Hit any key to stop autoboot:  0
** Unable to read file uboot.scr **
zynq-uboot>

The rest of the boot fails because it's configured slightly differently from the Petalinux BOOT.BIN which I swapped out, but now that I have serial output I can debug it properly.

The next step is to incorporate this process into the Yocto build, so that the patches that BalenaOS makes to U-Boot are applied properly, and so the initial problem this issue describes can be solved.

centipeda commented 4 years ago

To make our Yocto build use Enclustra's U-Boot source code instead of Xilinx's, I decided to create an entirely new u-boot-* recipe.

I began by creating a new u-boot-enclustra.bb recipe file in the meta-marspm3 layer. Most of the information in it is derived from the Xilinx u-boot-xlnx recipe, with the exception of the SRC_URI variable, which is changed to the enclustra-bsp/xilinx-uboot repo. I can then directly build the new U-Boot recipe with bitbake by issuing

bitbake u-boot-enclustra

The build fails, however, with this error:

ERROR: u-boot-enclustra-1.0-r0 do_patch: Command Error: 'quilt --quiltrc /home/jcepeda/work/balena-zynq/build/tmp/work/marspm3_zynq7-poky-linux-gnueabi/u-boot-enclustra/1.0-r0/recipe-sysroot-native/etc/quiltrc push' exited with 0  Output:
Applying patch resin-specific-env-integration-kconfig.patch
patching file include/env_default.h
Hunk #1 succeeded at 9 with fuzz 2 (offset -1 lines).
Hunk #2 FAILED at 24.
1 out of 2 hunks FAILED -- rejects in file include/env_default.h
Patch resin-specific-env-integration-kconfig.patch does not apply (enforce with -f)               

The failing patch is definitely the U-Boot patch that the balenaForum thread above refers to, which is here in the meta-balena layer. This means we're definitely closer to building a boot image that works with Balena, since the base u-boot.elf compiles correctly. The next step is to investigate this error.

centipeda commented 4 years ago

Got a fix, but not a permanent one. The BalenaOS patch is about a line too far off from the U-Boot source for patch to be able to find it. Adding --fuzz 3 to QUILT_DIFF_OPTS in the build directory for the u-boot-enclustra recipe (if the failure happens, the log should be around the right place) lets the patch apply, but with a warning about fuzz. The recipe succeeds, though, and the boot image does let the board boot up.
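To illustrate what the fuzz factor does, here is a small self-contained sketch using plain patch(1) on throwaway files, outside of quilt and Yocto. The file names and contents are invented for the demo. A hunk with three lines of context is applied to a file whose context line directly above the change has drifted: the default maximum fuzz of 2 rejects it, while a fuzz of 3 lets patch ignore all three context lines and match on the changed line alone.

```shell
#!/bin/sh
# Demo of patch fuzz on throwaway files (not the real U-Boot patch).
work=$(mktemp -d)
cd "$work" || exit 1

# File the patch was written against, and the patched result.
printf '%s\n' a b c d e old f g h i j > upstream.txt
sed 's/^old$/new/' upstream.txt > patched.txt
diff -u upstream.txt patched.txt > change.patch

# Drifted target: the context line right above the change differs.
sed 's/^e$/e-drifted/' upstream.txt > target.txt

# Default fuzz (at most 2 context lines ignored per side).
cp target.txt strict.txt
if patch -s -f strict.txt < change.patch >/dev/null 2>&1; then
    strict=applied
else
    strict=rejected
fi

# Fuzz 3 (all three context lines may be ignored).
cp target.txt fuzzy.txt
if patch -s -f -F 3 fuzzy.txt < change.patch >/dev/null 2>&1; then
    fuzzy=applied
else
    fuzzy=rejected
fi

echo "default fuzz: $strict, fuzz 3: $fuzzy"
```

This also shows why a high fuzz is risky as a long-term fix: with all context ignored, patch only checks the lines being removed, so a hunk can land in the wrong place without complaint. Rebasing the patch against the v2019.01 source would be the more durable option.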

I could try to figure out how to make this configuration change through Yocto, but I'm not sure if increasing the fuzz level is a permanent solution. It might be something to open an issue in meta-balena for, but the most recent version of U-Boot should probably be checked first, in case the patch works there but not with the v2019.01 version that is being used for the Mars module.

centipeda commented 4 years ago

Next error: after loading the uImage kernel image generated by linux-xlnx into U-Boot, I get a failure with this error:

## Booting kernel from Legacy Image at 02080000 ...
   Image Name:   Linux-4.19.0-xilinx-v2019.1
   Image Type:   ARM Linux Kernel Image (uncompressed)
   Data Size:    4503552 Bytes = 4.3 MiB
   Load Address: 00008000
   Entry Point:  00008000
   Verifying Checksum ... OK
   Loading Kernel Image ... OK
FDT and ATAGS support not compiled in - hanging
### ERROR ### Please RESET the board ###

Not sure what this is caused by, since the kernel configuration option to enable FDT and ATAGS support is enabled in the current build. A clean build is in progress to see if this is just caused by an sstate error.
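One thing that may be worth ruling out: this message appears to come from U-Boot's ARM bootm code rather than from the kernel. It is printed when bootm has no device tree blob to pass on and ATAGS support is not built into U-Boot. The log above shows no device tree being loaded, so it may be that bootm was invoked without an FDT address (the third argument, as in bootm <kernel addr> - <fdt addr>). A rough sketch for checking the relevant options in a .config file follows; the config_enabled helper and the sample file are made up for illustration, and the sample should be replaced with the real U-Boot or kernel .config.

```shell
#!/bin/sh
# Report whether a Kconfig-style option is enabled (=y) in a .config file.
config_enabled() {
    # $1: path to .config, $2: option name such as CONFIG_OF_LIBFDT
    grep -q "^$2=y\$" "$1"
}

# Stand-in .config for demonstration only.
sample=$(mktemp)
printf '%s\n' 'CONFIG_OF_LIBFDT=y' 'CONFIG_USE_OF=y' '# CONFIG_ATAGS is not set' > "$sample"

# CONFIG_OF_LIBFDT is a U-Boot option; CONFIG_USE_OF and CONFIG_ATAGS are ARM kernel options.
for opt in CONFIG_OF_LIBFDT CONFIG_USE_OF CONFIG_ATAGS; do
    if config_enabled "$sample" "$opt"; then
        echo "$opt: enabled"
    else
        echo "$opt: not enabled"
    fi
done
```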