OE4T / meta-tegra

BSP layer for NVIDIA Jetson platforms, based on L4T
MIT License

[jetson-tx2] Flashing fails, is it due to new FAB new boards have? #478

Closed manuel-wagesreither closed 3 years ago

manuel-wagesreither commented 3 years ago

Hello everyone,

We've received a new batch of Jetson TX2 SOMs which we are unable to flash with our Yocto-created distribution based on zeus. We are on commit 596c4adc1e of meta-tegra, which for the purposes of this issue should be functionally equivalent to the current zeus-l4t-r32.3.1 branch.

These are the last few lines of ./doflash.sh output:

[   7.7424 ] Sending bootloader and pre-requisite binaries
[   7.7437 ] tegrarcm_v2 --download blob blob.bin
[   7.7445 ] Applet version 01.00.0000
[   8.0141 ] Sending blob
[   8.0144 ] [................................................] 100%
[   8.4907 ] 
[   8.4924 ] tegrarcm_v2 --boot recovery
[   8.4940 ] Applet version 01.00.0000
[   8.7636 ] 
[   9.7663 ] tegrarcm_v2 --isapplet
[  10.4482 ] 
[  10.4499 ] tegradevflash_v2 --iscpubl
[  10.4514 ] Cannot Open USB
[  10.7860 ] 
[  11.7889 ] tegrarcm_v2 --isapplet

The script never returns; it just hangs at this point.

The failing SOMs have FAB=D02 (whatever that means), while our existing SOMs have FAB=D00.

$ cat jetson-tx2_bootblob_ver.txt 
NV3
# R32 , REVISION: 3.1
BOARDID=3310 BOARDSKU=1000 FAB=D02
20201105103735
BYTES:76 CRC32:32490418
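For anyone comparing several modules, the identifiers in this file can be pulled out programmatically. The following is only a minimal sketch: `parse_board_ids` is a hypothetical helper, and it assumes the file keeps the `KEY=VALUE` layout shown above.

```python
import re

def parse_board_ids(text):
    """Return every upper-case KEY=VALUE pair found in the text as a dict."""
    return dict(re.findall(r"\b([A-Z]+)=(\S+)", text))

# Contents of jetson-tx2_bootblob_ver.txt as posted above.
sample = """NV3
# R32 , REVISION: 3.1
BOARDID=3310 BOARDSKU=1000 FAB=D02
20201105103735
BYTES:76 CRC32:32490418
"""

ids = parse_board_ids(sample)
print(ids["BOARDID"], ids["BOARDSKU"], ids["FAB"])
```

Lines without an `=` sign (the version header, timestamp and checksum) are simply ignored by the pattern.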

In an attempt to add compatibility for these new boards, we made the following change, which didn't help.

diff --git a/conf/machine/jetson-tx2.conf b/conf/machine/jetson-tx2.conf
index 251b623..ddc7b64 100644
--- a/conf/machine/jetson-tx2.conf
+++ b/conf/machine/jetson-tx2.conf
@@ -29,4 +29,4 @@ TEGRA_BOARDSKU ?= ""
 TEGRA_BOARDREV ?= ""
 TEGRA_CHIPREV ?= "0"
 # Extracted from l4t_generate_soc_bup.sh for BOARDID=3310 and board=jetson-tx2
-TEGRA_BUPGEN_SPECS ?= "fab=B00 fab=B02 fab=C04 fab=D00 fab=D01"
+TEGRA_BUPGEN_SPECS ?= "fab=B00 fab=B02 fab=C04 fab=D00 fab=D01 fab=D02"
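To double-check which FAB revisions a given `TEGRA_BUPGEN_SPECS` value lists, a small helper can be used. This is a sketch only: `fabs_in_bupgen_specs` is a hypothetical name, and it assumes the simple space-separated `fab=...` entries shown in the diff. It inspects the string and says nothing about whether the bootloader itself supports a given FAB.

```python
def fabs_in_bupgen_specs(specs):
    """Collect the fab=... values from a TEGRA_BUPGEN_SPECS-style string."""
    fabs = []
    for entry in specs.split():
        key, _, value = entry.partition("=")
        if key == "fab" and value:
            fabs.append(value)
    return fabs

# Values before and after the change in the diff above.
old_specs = "fab=B00 fab=B02 fab=C04 fab=D00 fab=D01"
new_specs = "fab=B00 fab=B02 fab=C04 fab=D00 fab=D01 fab=D02"

print("D02" in fabs_in_bupgen_specs(old_specs))
print("D02" in fabs_in_bupgen_specs(new_specs))
```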

Can someone shed some light on this?

One of our engineers suspects that the bootloader which doflash.sh pushes onto the board might not run there, either because it is incompatible or because its signature is wrong. Is he correct? What other steps would be needed to add compatibility with these new boards?

Are we even on the right track here?

Thank you a lot, Manuel

ichergui commented 3 years ago

Hi @manuel-wagesreither

Could you please share the steps you are using to build and sign the images? Could you also share the content of /etc/nv_boot_control.conf? You can use the following command:

$ sudo cat /etc/nv_boot_control.conf

madisongh commented 3 years ago

Also - can you flash and boot the target device with stock L4T R32.3.1? If you can't use regular L4T due to having a custom carrier, can you try putting a D02 module in a Jetson TX2 development kit and try stock L4T there?

In addition to what @ichergui mentioned, it would also help to see the full log of your doflash.sh session plus the output captured on the serial console of the TX2 during the flashing process.

dwalkes commented 3 years ago

Hi @manuel-wagesreither, I'm working through a similar but slightly different issue on the NVIDIA forum right now; see this response referencing the PCN at https://developer.nvidia.com/jetson-tx2-pcn-206440-dramemmc-public. To support D02 SOMs you need the bootloader files from JetPack 4.4 / L4T R32.4.2 or later:

In order to support the new Micron DRAM and Hynix eMMC, the software image flashed to the Jetson TX2 must include:
• Appropriate BCT and DVFS changes required by the Micron memory device
• Updated bootloader implementing updated OCR register polling timeout per JEDEC specification
The following releases of Linux for Tegra (L4T) include the necessary changes:
• JetPack 4.4 / BSP 32.4.2 (or later)
• JetPack 3.3.3 / BSP 28.4 (or later)

manuel-wagesreither commented 3 years ago

Thank you all for being of help here!

@ichergui We're using Mender as our OTA update solution, so our build workflow consists of sourcing the Mender wrapper (source setup-environment tegra) followed by the standard bitbake <image-name> command.

We are signing our Mender update artifacts, but not the images themselves.

@madisongh We'll do as you suggested if updating to the newer Jetpack release doesn't solve our problems.

@dwalkes This looks like a perfect match! We'll give this a try.

manuel-wagesreither commented 3 years ago

@ichergui I missed your question regarding /etc/nv_boot_control.conf.

In the generated rootfs this file is missing for some reason.

manuel@debian:~/vps/repos/<repo>/build/tmp/work/jetson_tx2-poky-linux/<image>/1.0-r0/rootfs$ ls -l etc/ | grep nv_boot_control.conf
lrwxrwxrwx 1 manuel manuel    40 Okt  1 01:08 nv_boot_control.conf -> /var/lib/nvbootctrl/nv_boot_control.conf
manuel@debian:~/vps/repos/yoc/<repo>/build/tmp/work/jetson_tx2-poky-linux/<image>/1.0-r0/rootfs$ ls -l var/lib/nvbootctrl/
total 0

On a flashed board, the content of this file is:

jetson-tx2:/etc$ cat nv_boot_control.conf 
TNSPEC 3310-B02-1000-C.0-1-0-jetson-tx2-mmcblk0p1
TEGRA_CHIPID 0x18
TEGRA_OTA_BOOT_DEVICE /dev/mmcblk0boot0
TEGRA_OTA_GPT_DEVICE /dev/mmcblk0boot1
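The TNSPEC line packs several board identifiers into one hyphen-separated string. The sketch below splits it into named fields. The field order (board ID, FAB, board SKU, board revision, fuse level, chip revision, machine, boot device) is an assumption based on how L4T-style specs are commonly composed, and `parse_tnspec` is a hypothetical helper, not part of meta-tegra.

```python
from collections import namedtuple

# Assumed field order; not confirmed anywhere in this thread.
TnSpec = namedtuple(
    "TnSpec",
    "boardid fab boardsku boardrev fuselevel chiprev machine bootdev",
)

def parse_tnspec(line):
    """Split a 'TNSPEC ...' line into named fields."""
    value = line.split(None, 1)[1].strip()
    # The first six fields contain no hyphens, but the machine name can
    # (e.g. "jetson-tx2"), so split from the left and then peel the
    # boot device off the right-hand end of the remainder.
    head = value.split("-", 6)
    machine, _, bootdev = head[6].rpartition("-")
    return TnSpec(*head[:6], machine, bootdev)

spec = parse_tnspec("TNSPEC 3310-B02-1000-C.0-1-0-jetson-tx2-mmcblk0p1")
print(spec.fab, spec.machine, spec.bootdev)
```

If the assumed order is right, the FAB field recorded in this file reads B02. Whether that value reflects the module's actual FAB or a build-time default is not something this file alone can settle.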

I assume this board is of FAB=D00, but it's mounted inside an embedded device, so I can't look it up that easily.

ichergui commented 3 years ago

@manuel-wagesreither

No problem, the file is not missing. It is generated automatically at boot by a systemd service called setup-nv-boot-control.service, based on the template file /etc/nv_boot_control.template and the tegra-boardspec tool.

I don't have a Jetson TX2 with the same FAB as yours here, so I cannot test it.

Did you manage to flash your device using stock L4T, as @madisongh suggested? If not, please try it and let us know if you run into any issues. I suggest trying with the DevKit.

manuel-wagesreither commented 3 years ago

@ichergui

Got it, thanks for the explanation.

Due to COVID-19 I'm working from home and won't have access to a development kit until next week. The same goes for a Jetson TX2 with FAB D02. I forwarded this to a colleague of mine who did the initial setup of our system. He may be able to help.

@madisongh

Also - can you flash and boot the target device with stock L4T R32.3.1? If you can't use regular L4T due to having a custom carrier, can you try putting a D02 module in a Jetson TX2 development kit and try stock L4T there?

I joined this team when Yocto was already up and running, so I'm lacking some basics of using the Jetson TX2 in a standalone way, that is, when it is not integrated in our product. If I understand the NVIDIA website right, using the SDK and following this guide would be a feasible way to do this, right?

In addition to what @ichergui mentioned, it would also help to see the full log of your doflash.sh session plus the output captured on the serial console of the TX2 during the flashing process.

Will do that.

madisongh commented 3 years ago

If I understand the NVIDIA website right, using the SDK and following this guide would be a feasible way to do this, right?

Yes, that should do it.

manuel-wagesreither commented 3 years ago

Are you interested in the D02 logs for further development of meta-tegra? If yes, I will do the steps as requested. If not, I would skip it, as we're reasonably convinced that @dwalkes is right and our outdated Jetpack version is the root cause.

Here's a link to PCN 206440 which contains detailed information on this.

When trying to flash the D02 SOM in our custom carrier board with doflash.sh, using a Yocto image based on l4t-r32.3.1, the following appears on the serial port:

[0115.304] E> Waypoint-0.5 ACK pending: 0x8
[0115.308] C> MTS error (2) : dram alias check failure
[0115.313] C> cpu waypoint 0.5 failed
[0115.317] C> ERROR: Highest Layer Module = 0x32, Lowest Layer Module = 0x32,
Aux Info = 0x1, Reason = 0x6
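When the serial capture is long, it can help to filter it down to the lines carrying a given tag. The following is a rough sketch that assumes the `[timestamp] X> message` layout seen above; `tagged_messages` is a hypothetical helper, and treating `C>` as the most severe level is an assumption rather than something documented in this thread.

```python
import re

# Matches lines such as "[0115.308] C> MTS error (2) : ..."
LINE_RE = re.compile(r"^\[(\d+\.\d+)\]\s+([A-Z])>\s+(.*)$")

def tagged_messages(log_text, tag="C"):
    """Return (timestamp, message) for every line carrying the given tag."""
    hits = []
    for line in log_text.splitlines():
        match = LINE_RE.match(line.strip())
        if match and match.group(2) == tag:
            hits.append((float(match.group(1)), match.group(3)))
    return hits

serial_log = """[0115.304] E> Waypoint-0.5 ACK pending: 0x8
[0115.308] C> MTS error (2) : dram alias check failure
[0115.313] C> cpu waypoint 0.5 failed
[0115.317] C> ERROR: Highest Layer Module = 0x32, Lowest Layer Module = 0x32,
Aux Info = 0x1, Reason = 0x6
"""

for timestamp, message in tagged_messages(serial_log):
    print(timestamp, message)
```

Continuation lines without a timestamp, such as the wrapped `Aux Info` line, do not match the pattern and are skipped.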

The PCN says this hardware version changed the DRAM, so as far as I'm concerned, this is proof enough. I'll close this ticket unless there is anything further from your side.

Thank you everyone and @dwalkes in particular!

madisongh commented 3 years ago

we're reasonably convinced that our outdated Jetpack version is the root cause.

Sounds that way, based on what's in the PCN. You'll need at least L4T R32.4.2 for the firmware and config files necessary to support the new DRAM.
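The minimum-release rule from the PCN can be written as a small check. This is a sketch under stated assumptions: the minimums (32.4.2 for the R32 line, 28.4 for the R28 line) come from the PCN text quoted earlier in this thread, while `parse_release` and `supports_d02` are hypothetical helper names.

```python
def parse_release(text):
    """Turn 'R32.4.2' or '28.4' into a tuple of ints, e.g. (32, 4, 2)."""
    return tuple(int(part) for part in text.lstrip("Rr").split("."))

# Minimum L4T release per major line, as quoted from the PCN in this thread.
MINIMUM_BY_MAJOR = {32: (32, 4, 2), 28: (28, 4)}

def supports_d02(release):
    """True/False if the release line is known, None if it is not listed."""
    version = parse_release(release)
    minimum = MINIMUM_BY_MAJOR.get(version[0])
    if minimum is None:
        return None
    return version >= minimum

for release in ("R32.3.1", "R32.4.2", "R32.4.3", "R28.4"):
    print(release, supports_d02(release))
```

Tuple comparison handles the different component counts, so `(28, 4, 0)` still compares as at least `(28, 4)`.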

And it looks like I need to update the TEGRA_BUPGEN_SPECS setting in the jetson-tx2 machine config file to add an entry for the D02 FAB.

manuel-wagesreither commented 3 years ago

Do you intend to add support for L4T R32.4.2 to zeus?

madisongh commented 3 years ago

We have R32.4.3 support on the dunfell-l4t-r32.4.3 branch (R32.4.2 was a fairly short-lived developer preview release). I don't have any plans for backports to any older branches at this point.

manuel-wagesreither commented 3 years ago

Alright, so an update to dunfell it is.

Appreciate your work on meta-tegra! Thank you!