ARMmbed / mbed-os

Arm Mbed OS is a platform operating system designed for the internet of things
https://mbed.com

FlashIAP driver for NXP LPC55S69 causes hard fault #13113

Closed trowbridgec closed 3 years ago

trowbridgec commented 4 years ago

Description of defect

While working to use the NXP LPC55S69 in conjunction with the Pelion client on a custom board, I started by using the relevant NXP dev kit (LPCXpresso55S69). After trying to use the built-in config for the LPC55S69_NS target in the mbed-cloud-client-example application, I repeatedly found that my dev kit would hang right after booting. See example output here:

Mbed Bootloader
booti

Using the JLink Commander tool, I was able to see that it was hitting a memory hard fault:

PC = 1000026A, CycleCnt = 0DCC84F9
R0 = 3000C700, R1 = 00094200, R2 = 00000200, R3 = 6B65666C
R4 = 1400C700, R5 = 00000000, R6 = 30000800, R7 = F17ECA89
R8 = 00000000, R9 = 00000000, R10= 20024DE8, R11= 00000000
R12= 1300413B
SP(R13)= 30000800, MSP= 30000800, PSP= 30000ED0, R14(LR) = FFFFFFED
XPSR = 01050003: APSR = nzcvq, EPSR = 01000000, IPSR = 003 (HardFaultMemManage)
CFBP = 00000001, CONTROL = 00, FAULTMASK = 00, BASEPRI = 00, PRIMASK = 01

Security extension regs:
MSP_S = 30000800, MSPLIM_S = 00000000
PSP_S = 30000ED0, PSPLIM_S = 30000800
MSP_NS = 20043F88, MSPLIM_NS = 20043C00
PSP_NS = 20024DA8, PSPLIM_NS = 20023BD8
CONTROL_S = 00, FAULTMASK_S = 00, BASEPRI_S = 00, PRIMASK_S = 01
CONTROL_NS = 02, FAULTMASK_NS = 00, BASEPRI_NS = 00, PRIMASK_NS = 00

FPS0 = 00000008, FPS1 = 3E7828C0, FPS2 = 00000000, FPS3 = 00000000
FPS4 = 00000000, FPS5 = 00000000, FPS6 = 00000000, FPS7 = 00000000
FPS8 = 00000000, FPS9 = 00000000, FPS10= 00000000, FPS11= 00000000
FPS12= 00000000, FPS13= 00000000, FPS14= 00000000, FPS15= 00000008
FPS16= 00000000, FPS17= 00000000, FPS18= 00000000, FPS19= 00000000
FPS20= 00000000, FPS21= 00000000, FPS22= 00000000, FPS23= 00000000
FPS24= 00000000, FPS25= 00000000, FPS26= 00000000, FPS27= 00000000
FPS28= 00000000, FPS29= 00000000, FPS30= 00000000, FPS31= 00000000
FPSCR= 03000000
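
The status fields in a dump like the one above can be decoded by hand. The sketch below is a minimal illustration, assuming the Armv8-M register layout (IPSR in the low 9 bits of xPSR; EXC_RETURN bits 6, 3 and 2 for S, Mode and SPSEL) and assuming R14 still holds the EXC_RETURN value when the core was halted. The function names are hypothetical and are not part of Mbed OS or the NXP SDK.

```c
#include <stdint.h>
#include <stdbool.h>

/* Exception number held in IPSR, the low 9 bits of xPSR (3 = HardFault). */
static inline uint32_t xpsr_exception_number(uint32_t xpsr)
{
    return xpsr & 0x1FFu;
}

/* EXC_RETURN bit 6 (S): the exception frame was pushed to a secure stack. */
static inline bool exc_return_secure_stack(uint32_t lr)
{
    return ((lr >> 6) & 1u) != 0u;
}

/* EXC_RETURN bit 3 (Mode): the interrupted code ran in Thread mode. */
static inline bool exc_return_thread_mode(uint32_t lr)
{
    return ((lr >> 3) & 1u) != 0u;
}

/* EXC_RETURN bit 2 (SPSEL): the frame was pushed using PSP rather than MSP. */
static inline bool exc_return_used_psp(uint32_t lr)
{
    return ((lr >> 2) & 1u) != 0u;
}
```

With the values above, `xpsr_exception_number(0x01050003)` gives 3 (HardFault), and `0xFFFFFFED` decodes to Thread mode on the secure process stack. If that reading is right, the reported PC is inside the fault handler, and the address of the faulting instruction would be found in the exception frame stacked at PSP_S.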

At this point, I realized that I was using a revision A2 dev kit. Knowing that the LPC55S69 is a target that Mbed/Pelion has supported for a while, I found another dev kit with an earlier revision (A1). Using the same application hex file as on the first board, the application was able to continue further (NOTE: I did not have a network interface attached, so the application ran into other expected errors):

Mbed Bootloader
bootiþERROR: No NetworkInterface found!
Start Device Management Client
Using hardcoded Root of Trust, not suitable for production use.
Starting developer flow
Failed to load developer credentials

Factory Configurator Client [ERROR]: KCM basic functionality failed.
Resets storage to an empty state.
Using hardcoded Root of Trust, not suitable for production use.
Starting developer flow
Application ready. Build at: Jun 12 2020 15:48:38
Mbed OS version 5.15.1
mcc_platform_interface_connect()
ERROR: No NetworkInterface found!

At this point, I realized that the revision A1 board uses a revision 0A of the chip, and the revision A2 board uses a revision 1B of the chip.

After adding some printfs to the underlying Mbed sources, I was able to see that the hard fault was occurring on the rev A2 board in the initialization of the FlashIAPBlockDevice, and it seems as though it's attempting to access the internal flash in an unsupported way.

Based on this, I fired up the MCUXpresso IDE and downloaded the latest example from NXP for the flash IAP driver. This example worked as expected on the rev A2 board (rev 1B chip). The flash IAP example uses version 2.7.1 of the SDK for the LPC55S69, so my guess is that the SDK in Mbed simply needs to be updated to the latest version. This is further supported by the information in this document published by NXP: https://community.nxp.com/docs/DOC-345272

Target(s) affected by this defect ?

NXP LPC55S69

Toolchain(s) (name and version) displaying this defect ?

ARMC6

What version of Mbed-os are you using (tag or sha) ?

mbed-os-5.15.3

What version(s) of tools are you using. List all that apply (E.g. mbed-cli)

1.10.2

How is this defect reproduced ?

Build and run the mbed-cloud-client-example application configured for the LPC55S69_NS target on a revision A2 LPCXpresso55S69 dev kit (or any board with a revision 1B chip).

@maclobdell @ARMmbed/team-embeddedplanet @mmahadevan108

trowbridgec commented 4 years ago

bump

trowbridgec commented 4 years ago

@maclobdell @mmahadevan108 Any insight on this?

mmahadevan108 commented 4 years ago

I will look to update the SDK. However we also would need to update DAPLink to support the rev A2 boards. This might take longer.

trowbridgec commented 4 years ago

@mmahadevan108 thank you, that would be great! For my project, I have been using a J-Link, but yes, I would eventually like to see the DAPLink support updated as well. Please let me know if there's any way I can help!

trowbridgec commented 4 years ago

@mmahadevan108 Any updates on this issue? Are there any temporary workarounds that I could use to pull in support for the FlashIAP driver to the version of the SDK which is currently in Mbed?

0Grit commented 4 years ago

@flit

trowbridgec commented 4 years ago

Quick update from my end on this issue:

I also posted a question regarding this issue in the NXP forums (see here) and received a response from one of the NXP engineers:

As you may have already considered, you could compare the SDK FSL_IAP and MBED Flash IAP drivers for rev 0A against the latest SDK driver version for rev 1B. This should help to find the differences in the flash IAP interface between the two revisions.

For example, the Flash addresses that the MBED driver is using:

#define LPC55S69_REV0_FLASH_READ_ADDR (0x130043a3U)
#define LPC55S69_REV1_FLASH_READ_ADDR (0x13007539U)

Besides this, I have not found any other hints to provide you.
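
To make the per-revision address idea concrete, here is a minimal sketch of selecting the ROM entry point by silicon revision. Only the two `#define` values come from the quoted reply; the `chip_rev_t` type, the `flash_read_rom_addr` helper, and the mapping of REV0 to chip revision 0A and REV1 to 1B are assumptions made for illustration, and this is not how the SDK itself is written.

```c
#include <stdint.h>

#define LPC55S69_REV0_FLASH_READ_ADDR (0x130043a3U)
#define LPC55S69_REV1_FLASH_READ_ADDR (0x13007539U)

/* Hypothetical revision tag. How the running silicon revision is detected
 * is not shown here and is left to the caller. */
typedef enum { CHIP_REV_0A, CHIP_REV_1B } chip_rev_t;

/* Return the ROM address of the flash-read routine for the given revision.
 * Assumes REV0 corresponds to chip rev 0A and REV1 to chip rev 1B. */
static inline uint32_t flash_read_rom_addr(chip_rev_t rev)
{
    return (rev == CHIP_REV_0A) ? LPC55S69_REV0_FLASH_READ_ADDR
                                : LPC55S69_REV1_FLASH_READ_ADDR;
}
```

A driver built with only the first address would jump to the wrong place in ROM on the other revision, which is one way a hard fault like the one reported could arise.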

I tried again to pull in the differences between the flash IAP driver (fsl_iap.c, fsl_iap.h, and fsl_iap_ffr.h) in Mbed and the latest one available from NXP, but kept hitting a hard fault with the program counter at 0x1000026A.

Output from J-Link Commander:

J-Link>h
PC = 1000026A, CycleCnt = 0562A839
R0 = 3000C700, R1 = 00094200, R2 = 00000200, R3 = 6B65666C
R4 = 1400C700, R5 = 00000000, R6 = 30000800, R7 = B17ECA89
R8 = 00000000, R9 = 00000000, R10= 20024DE8, R11= 00000000
R12= 1300413B
SP(R13)= 30000800, MSP= 30000800, PSP= 30000ED0, R14(LR) = FFFFFFED
XPSR = 01050003: APSR = nzcvq, EPSR = 01000000, IPSR = 003 (HardFaultMemManage)
CFBP = 00000001, CONTROL = 00, FAULTMASK = 00, BASEPRI = 00, PRIMASK = 01

Security extension regs:
MSP_S = 30000800, MSPLIM_S = 00000000
PSP_S = 30000ED0, PSPLIM_S = 30000800
MSP_NS = 20043F88, MSPLIM_NS = 20043C00
PSP_NS = 20024DA8, PSPLIM_NS = 20023BD8
CONTROL_S = 00, FAULTMASK_S = 00, BASEPRI_S = 00, PRIMASK_S = 01
CONTROL_NS = 02, FAULTMASK_NS = 00, BASEPRI_NS = 00, PRIMASK_NS = 00

FPS0 = 00000008, FPS1 = 3E7828C0, FPS2 = 00000000, FPS3 = 00000000
FPS4 = 00000000, FPS5 = 00000000, FPS6 = 00000000, FPS7 = 00000000
FPS8 = 00000000, FPS9 = 00000000, FPS10= 00000000, FPS11= 00000000
FPS12= 00000000, FPS13= 00000000, FPS14= 00000000, FPS15= 00000008
FPS16= 00000000, FPS17= 00000000, FPS18= 00000000, FPS19= 00000000
FPS20= 00000000, FPS21= 00000000, FPS22= 00000000, FPS23= 00000000
FPS24= 00000000, FPS25= 00000000, FPS26= 00000000, FPS27= 00000000
FPS28= 00000000, FPS29= 00000000, FPS30= 00000000, FPS31= 00000000
FPSCR= 03000000

I believe this address (and the code it's pointing to) is part of cmse_lib.o, which gets linked in during a build of the LPC55S69_NS target as part of the prebuilt secure sources. I believe that this file needs to be rebuilt with the updated SDK.

ciarmcom commented 4 years ago

Thank you for raising this detailed GitHub issue. I am now notifying our internal issue triagers. Internal Jira reference: https://jira.arm.com/browse/IOTOSM-2293

0Grit commented 3 years ago

@mmahadevan108

PennRobotics commented 3 years ago

I can guess what is happening.

PC = 1000026A, CycleCnt = 0562A839
R0 = 3000C700, R1 = 00094200, R2 = 00000200, R3 = 6B65666C
R4 = 1400C700, R5 = 00000000, R6 = 30000800, R7 = B17ECA89
R8 = 00000000, R9 = 00000000, R10= 20024DE8, R11= 00000000
R12= 1300413B
SP(R13)= 30000800, MSP= 30000800, PSP= 30000ED0, R14(LR) = FFFFFFED
XPSR = 01050003: APSR = nzcvq, EPSR = 01000000, IPSR = 003 (HardFaultMemManage)
CFBP = 00000001, CONTROL = 00, FAULTMASK = 00, BASEPRI = 00, PRIMASK = 01

Flash_Erase is the name of the function at 0x1300413B (R12) being called with arguments r0 = config, r1 = start address, r2 = length, r3 = erase key (in ASCII "lfek"). The start address isn't one of the reserved addresses although it is pretty close. It's possible this could be the NSC/veneer region? I would check the security attribution unit to see the access rights at 0x94200.
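
The erase key reading can be checked by unpacking R3 byte by byte. This is a small sketch, assuming the key is stored as four ASCII characters in little-endian (memory) order; `four_char_code` is a hypothetical helper name.

```c
#include <stdint.h>

/* Unpack a 32-bit word into four characters, lowest byte first,
 * and terminate the result so it can be used as a C string. */
static void four_char_code(uint32_t word, char out[5])
{
    for (int i = 0; i < 4; i++) {
        out[i] = (char)((word >> (8 * i)) & 0xFFu);
    }
    out[4] = '\0';
}
```

For R3 = 0x6B65666C the bytes, lowest first, are 0x6C, 0x66, 0x65, 0x6B, which spell "lfek".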

You found basically the same thing as I had. There is a talk online where the flash functions for this target seemed to be at a different ROM address. On my board, Flash_Write is at 0x1300419D. On theirs? 0x13007310. It's also possible one is a wrapper for the other, although it looks like fsl_iap has both hardcoded addresses and memory-stored addresses for flash functions, with the correct one selected based on version.

As far as workaround... no idea. I have the earlier revision of the board. Since you are writing from secure memory (by IDAU) using a secure ROM function into a possibly nonsecure region (by IDAU, no idea what the SAU says), I could imagine this would cause a memory error.
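
The "by IDAU" classification can be sketched as a one-line check. This assumes the LPC55S69 IDAU treats an address as secure when address bit 28 is set, which should be confirmed against the NXP reference manual; it says nothing about what the SAU is configured to do. `idau_is_secure_alias` is a hypothetical name.

```c
#include <stdint.h>
#include <stdbool.h>

/* Assumption: the IDAU marks an address as secure when bit 28 is set.
 * This ignores the SAU, which can further restrict the result. */
static inline bool idau_is_secure_alias(uint32_t addr)
{
    return (addr & (1u << 28)) != 0u;
}
```

Under that assumption, PC (0x1000026A), R0 (0x3000C700) and R12 (0x1300413B) from the dump fall in secure aliases, while R1 (0x00094200), if it is interpreted as a bus address rather than a flash offset, falls in the non-secure alias.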

One other detail, the CPU frequency I believe goes to 150 MHz on the new revision, but flash ops are still limited to under 100 MHz. I imagine that is another potential error source.

A colleague suggests data alignment could also be an issue with this specific board, but I'm not sure what he means. He was mumbling about 512-byte and 16-byte writes and 4-byte reads. Colleagues can be confusing, but he's generally knowledgeable.
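
One way to rule alignment in or out is to check the erase arguments from the dump directly. This is a minimal sketch, assuming a 512-byte page granularity taken from the "512-byte" remark above; `FLASH_PAGE_SIZE` and both helper names are hypothetical and not taken from the SDK.

```c
#include <stdint.h>
#include <stdbool.h>

#define FLASH_PAGE_SIZE 512u /* assumed from the "512-byte" remark */

/* True when value is a multiple of alignment.
 * Valid for power-of-two alignments only. */
static inline bool is_aligned(uint32_t value, uint32_t alignment)
{
    return (value & (alignment - 1u)) == 0u;
}

/* True when a non-empty erase range starts and ends on page boundaries. */
static inline bool erase_range_is_page_aligned(uint32_t start, uint32_t length)
{
    return length != 0u
        && is_aligned(start, FLASH_PAGE_SIZE)
        && is_aligned(length, FLASH_PAGE_SIZE);
}
```

With R1 = 0x00094200 and R2 = 0x00000200 from the dump, both values are multiples of 512, so under this assumption the faulting erase call itself looks page-aligned.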

There is one peripheral register (probably in the flash controller e.g. STARTA... but I cannot remember exactly which one) that I was able to set via the debug port but not set from a running binary. Thus, I had to use the ROM functions rather than the flash subsystem directly. So using the flash controller from SRAM as a workaround might not work as it normally would on older Cortex-M devices.

And then there's TrustedFirmware-M, which should build on both revisions of this target. If so, you could check if Flash_Erase is called and what is different about their implementation.

If there's enough interest in this issue, I have a new revision 55S69 arriving next week and can step through each board's mbed-os build until the memfault occurs. Until then, I'm just guessing and cursing NXP's documentation and implementation.

ciarmcom commented 3 years ago

We closed this issue because it has been inactive for quite some time and we believe it to be low priority. If you think that the priority should be higher, then please reopen with your justification for increasing the priority.