Closed oscarthorn closed 1 year ago
Hey @oscarthorn, are you sure that the flashing process went well? Please share the host and target (serial UART) logs.
I'm seeing the following logs:
[0000.501] E> Cannot find partition bpmp-fw
[0000.505] E> Partition bpmp-fw not found
[0000.508] I> load/auth: execution failed
BPMP is a key component.
Also, is secure boot enabled on your device?
Hi!
Thanks for the response!
Yes, to clarify, this specific device had been running without issue for several weeks. So unfortunately I can't share the flashing logs; we didn't save them, since there did not seem to be an issue with the device.
Yes, we have both secure boot and encryption as well as a/b partitions enabled.
Please make sure that you are using the right keys (SBK and PKC).
I don't have secure boot enabled on my Jetson Xavier NX, but I will try the branch you mentioned to double-check that everything is working as expected.
@oscarthorn Is this a new hardware module? If so, please check the FAB and BoardSKU.
Wouldn't wrong keys manifest immediately on first boot? How do I verify the keys on a system like this that does not boot? Can I read them out somehow? I think the keys are correct: according to our logs it was flashed with the correct keys (and flashing it again works fine).
Thanks, though I'm not sure you will get any error. We have several dozen more units that all work fine, and even the faulty units worked fine after flashing and only later ended up in this state, after roughly 1-8 weeks of use.
I don't think so. I would have to check exactly which one it is, but it would be one of these. The board is with another developer right now, so I'll ask him to check.
TEGRA_BUPGEN_SPECS ?= " \
fab=100;boardsku=0000;boardrev= \
fab=200;boardsku=0000;boardrev= \
fab=300;boardsku=0000;boardrev= \
fab=301;boardsku=0000;boardrev= \
fab=100;boardsku=0001;boardrev= \
fab=200;boardsku=0001;boardrev= \
fab=300;boardsku=0001;boardrev= \
fab=301;boardsku=0001;boardrev= \
fab=200;boardsku=0003;boardrev= \
fab=300;boardsku=0003;boardrev= \
fab=301;boardsku=0003;boardrev= \
"
Yes, please do and let me know.
Thanks, will do! I'll get back to you in a couple of days; he was not at home (we have a long weekend in Sweden).
@ichergui This is the boardspec for one of the faulty modules: 3668-301-0003-B.0-1-2
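For anyone wanting to cross-check a boardspec like this against the TEGRA_BUPGEN_SPECS list quoted earlier, here is a minimal Python sketch. It is not part of meta-tegra. It assumes the first four dash-separated fields of the boardspec are board ID, FAB, board SKU and board revision, and that an empty value in a spec entry (such as `boardrev=`) matches anything. The function names are made up for illustration.

```python
# Hypothetical helper, not a meta-tegra tool.
# Assumption: boardspec fields are boardid-fab-boardsku-boardrev-...
# Assumption: an empty value in a spec entry (e.g. "boardrev=") is a wildcard.

BUPGEN_SPECS = """
fab=100;boardsku=0000;boardrev=
fab=200;boardsku=0000;boardrev=
fab=300;boardsku=0000;boardrev=
fab=301;boardsku=0000;boardrev=
fab=100;boardsku=0001;boardrev=
fab=200;boardsku=0001;boardrev=
fab=300;boardsku=0001;boardrev=
fab=301;boardsku=0001;boardrev=
fab=200;boardsku=0003;boardrev=
fab=300;boardsku=0003;boardrev=
fab=301;boardsku=0003;boardrev=
"""


def parse_bupgen_specs(text):
    """Turn whitespace-separated 'key=value;key=value' entries into dicts."""
    specs = []
    for entry in text.split():
        fields = dict(item.split("=", 1) for item in entry.split(";") if item)
        specs.append(fields)
    return specs


def parse_boardspec(boardspec):
    """Extract the leading four fields of a dash-separated boardspec."""
    boardid, fab, boardsku, boardrev = boardspec.split("-")[:4]
    return {
        "boardid": boardid,
        "fab": fab,
        "boardsku": boardsku,
        "boardrev": boardrev,
    }


def is_covered(boardspec, specs):
    """Return True if any spec entry matches the boardspec."""
    board = parse_boardspec(boardspec)
    for spec in specs:
        if all(not value or board.get(key) == value
               for key, value in spec.items()):
            return True
    return False


if __name__ == "__main__":
    specs = parse_bupgen_specs(BUPGEN_SPECS)
    print(is_covered("3668-301-0003-B.0-1-2", specs))
```

Under those assumptions, the faulty module's FAB 301 / SKU 0003 combination corresponds to the last entry in the list, so a spec mismatch does not look like the cause.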
@oscarthorn Did you figure this out?
@madisongh Yes, it turns out it was this issue: https://github.com/OE4T/tegra-boot-tools/issues/20. At least we think that's the cause; it's a bit hard to be 100% sure. We are only using an M.2 SSD and have disabled the eMMC, so it was falling back to QSPI for boot-related storage. We have updated tegra-boot-tools and are hoping that solves the issue.
Hi!
I'm not sure if this is a bug with meta-tegra or something hardware related but you seem quite knowledgeable, so I figured you might have an idea. Or know if it is a bug with meta-tegra.
We have some Xavier NX devices running dunfell and meta-tegra @ fd63b94. Some of them, 4 so far, have suddenly stopped booting. As far as we can tell they had been running fine, and then suddenly on reboot we get the included error log and they won't start. We tried flashing one of them again and it has been working fine for weeks now, so there does not seem to be any permanent hardware damage or fault to explain it.
We have no idea what's causing it, so it is hard to reproduce, and we are wondering how it can happen at all. It seems the error refers to the QSPI flash, but our assumption was that it would be mostly read-only. Since it is four devices (out of 50), it does not seem to be a one-off fluke.