bouffalolab / bouffalo_sdk

BouffaloSDK is the IoT and MCU software development kit provided by the Bouffalo Lab team. It supports all series of Bouffalo chips and combines bl_mcu_sdk and bl_iot_sdk.
Apache License 2.0

Ox64 16Mb crashes at early init stage #92

Open gamelaster opened 1 year ago

gamelaster commented 1 year ago

I have the following test firmware: gpio_input_output_bl808_m0.bin.zip

I flashed it to the Ox64. I then read back the contents of the SPI flash, and they are OK: flash.bin.zip (NOTE: the firmware is at offset 0x2000)

But after execution, the firmware fails and an Illegal instruction exception is thrown. So, with CK-Link, I did this:

dump binary memory ./xip-dump1.bin 0x58000000 0x58008000

The result is here: xip-dump1.bin.zip

As we can see, 10 bytes differ, which can trigger the illegal instruction: image (left is XIP dump, right is final binary).

After doing another dump:

dump binary memory ./xip-dump2.bin 0x58000000 0x58008000

Result: xip-dump2.bin.zip. This time the data at offset 0x2A20 is correct again. (ICACHE?)
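A comparison like the one above can be scripted. The following is a minimal sketch, assuming both files cover the same address range from the same starting offset (which may not hold for the real files because of headers or the 0x2000 flash offset). The function name diff_runs and the synthetic data are illustrative only and do not reproduce the real dumps.

```python
def diff_runs(a: bytes, b: bytes):
    """Return (offset, length) for each run of consecutive differing bytes."""
    runs = []
    start = None
    n = min(len(a), len(b))  # compare only the overlapping part
    for i in range(n):
        if a[i] != b[i]:
            if start is None:
                start = i
        elif start is not None:
            runs.append((start, i - start))
            start = None
    if start is not None:
        runs.append((start, n - start))
    return runs


# Synthetic stand-ins for the XIP dump and the final binary.
reference = bytes(range(256)) * 4
dump = bytearray(reference)
dump[0x120:0x12A] = b"\xff" * 10  # corrupt 10 bytes on purpose

runs = diff_runs(bytes(dump), reference)
for offset, length in runs:
    print(f"0x{offset:04x}: {length} bytes differ")
```

With real files, the two byte strings would come from open(path, "rb").read(); running the same comparison on a second dump would show whether the mismatch is stable or transient.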

I tried switching to DIO mode, but it did not help. Could this be an XIP issue? Or some interference?

Sadly, since I can't properly reset the chip, I can't debug this more.

dwillmore commented 1 year ago

To save others time, the incorrect data in xip-dump1 is the data from offset 0x4a40.

0x2a20 = 0b0010101000100000
0x4a40 = 0b0100101001000000
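One way to trace where stray bytes come from is to search the known-good image for them. This is a rough sketch with made-up data; find_all is a hypothetical helper, and the byte pattern and offsets are placeholders rather than values taken from the real dumps.

```python
def find_all(haystack: bytes, needle: bytes):
    """Return every offset at which needle occurs in haystack (overlaps included)."""
    offsets = []
    pos = haystack.find(needle)
    while pos != -1:
        offsets.append(pos)
        pos = haystack.find(needle, pos + 1)
    return offsets


# Synthetic known-good image with a recognizable pattern at offset 0x40.
image = bytearray(0x100)
image[0x40:0x48] = b"STRAYDAT"

# Pretend these bytes showed up at the wrong place in the XIP dump.
stray = b"STRAYDAT"

hits = find_all(bytes(image), stray)
print([hex(h) for h in hits])
```

A single hit suggests the data was fetched from that offset; several hits mean the pattern is too short or too common to be conclusive.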

DavidVentura commented 1 year ago

@gamelaster did you find a solution for this?

gamelaster commented 1 year ago

@DavidVentura sadly, nothing new.

wpwrak commented 1 year ago

@dwillmore, did you mean that it's from 0x2a40 (not 0x4a40)? At least that's what I see in the hexdump:

0x2a20 = 0b0010 1010 0010 0000
0x2a40 = 0b0010 1010 0100 0000

So this looks like a single bit shift occurred in the address. This raises the question of whether the address is transmitted serially, and something happened with the clock or the clock-data timing, or whether the address is transmitted in parallel, which would mean the corruption is caused by something else.

If it's a problem in the communication between the BL808 and the Flash, it should be possible to see this pattern on the QSPI bus. It is probably difficult to reduce this haystack (I assume it's hidden among lots of accesses) to a point where the needle can be found, though.
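The bit-level difference between the expected address and each candidate source offset mentioned in this thread can be checked with a few lines of Python. This is only an illustrative sketch; changed_bits is a hypothetical helper, and it says nothing about how the BL808 actually transmits addresses.

```python
def changed_bits(a: int, b: int):
    """Return the bit positions (0 = least significant) where a and b differ."""
    x = a ^ b
    return [i for i in range(x.bit_length()) if (x >> i) & 1]


expected = 0x2A20
for observed in (0x2A40, 0x4A40):
    xor = expected ^ observed
    bits = changed_bits(expected, observed)
    print(f"0x{expected:04x} vs 0x{observed:04x}: xor=0x{xor:04x}, bits {bits}")
```

For 0x2a40, only bits 5 and 6 differ, which matches a single set bit moving up by one position. For 0x4a40, bits 13 and 14 differ as well.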

wpwrak commented 1 year ago

Another thought: does reading the Flash after the problem has occurred with XIP still produce the original result? Once upon a time, we had a system with improper power-down sequencing, and the CPU was twitching while the Flash was still fully operational. Every once in a while, such a twitch resulted in a write command, which was promptly executed by the Flash ...

gamelaster commented 1 year ago

@wpwrak

Another thought: does reading the Flash after the problem has occurred with XIP still produce the original result?

No, reading the firmware again after the crash returns the correct binary, without the bit shift.

Sadly, I haven't had time to hook up a logic analyzer to the flash to check whether it is also visible on the bus.