stm32-rs / stm32h7xx-hal

Peripheral access API for STM32H7 series microcontrollers
BSD Zero Clause License

Cannot get past reset with the blinky example. #263

Closed matoushybl closed 2 years ago

matoushybl commented 2 years ago

Hi, I am trying to start developing for the STM32H743, using the weact H7 board (https://github.com/WeActTC/MiniSTM32H7xx) but I am having trouble even with the blinky example. From my understanding, the program panics during the initialization phase and goes to a fault.

I am compiling the example like this:

cargo build --example blinky --features="stm32h743 rt log-semihost example-ldo"

Then I run gdb:

arm-none-eabi-gdb -q target/thumbv7em-none-eabihf/debug/examples/blinky
Reading symbols from target/thumbv7em-none-eabihf/debug/examples/blinky...
(gdb) target extended-remote :3333
Remote debugging using :3333
0x080002aa in Reset ()
(gdb) monitor arm semihosting enable
semihosting is enabled

(gdb) break main
Breakpoint 1 at 0x8001574: file examples/blinky.rs, line 14.
Note: automatically using hardware breakpoints for read-only addresses.
(gdb) next
Single stepping until exit from function Reset,
which has no line number information.
halted: PC: 0x080002ac
halted: PC: 0x080002ae
halted: PC: 0x080002b0
halted: PC: 0x080002aa
halted: PC: 0x080002ac
halted: PC: 0x080002ae
halted: PC: 0x080002b0
halted: PC: 0x080002aa
halted: PC: 0x080002ac
halted: PC: 0x080002ae
halted: PC: 0x080002b0
halted: PC: 0x080002aa
halted: PC: 0x080002ac

The halted: PC ... messages keep going on forever.

Could you please help me with this issue? Thanks!

mlamoore commented 2 years ago

Just a quick guess, do you need the feature stm32h743v instead of stm32h743?

ST made some major changes with revision V, so only the first 10 months or so of production would use the original feature stm32h743; all newer parts need stm32h743v instead.

The rev V parts have a V printed near the end of the part number. Check the datasheet for how to tell the difference for your package.

matoushybl commented 2 years ago

Thanks! I have just tried the stm32h743v feature and the results are pretty much the same. The weird thing is that it still shows the looping panic behavior instead of using panic-semihosting.

I will investigate further, but I would be glad for any ideas on how to debug this.

mattico commented 2 years ago

Try stepping by instruction (si) from the reset vector until it gets into the infinite loop, and try disassembling the code that gets run.

Which version of stm32h7xx-hal are you using?

matoushybl commented 2 years ago

Hi! I am using the current master. By reading the assembly and stepping with si, I figured out that the program loops near the beginning of the Reset handler, specifically at addresses 02aa, 02ac, 02ae and 02b0 in the following assembly output:

08000298 <Reset>:
 8000298:       f04f 34ff       mov.w   r4, #4294967295 ; 0xffffffff
 800029c:       46a6            mov     lr, r4
 800029e:       f00d fe5b       bl      800df58 <DefaultPreInit>
 80002a2:       46a6            mov     lr, r4
 80002a4:       480d            ldr     r0, [pc, #52]   ; (80002dc <Reset+0x44>)
 80002a6:       490e            ldr     r1, [pc, #56]   ; (80002e0 <Reset+0x48>)
 80002a8:       2200            movs    r2, #0
 80002aa:       4281            cmp     r1, r0
 80002ac:       d001            beq.n   80002b2 <Reset+0x1a>
 80002ae:       c004            stmia   r0!, {r2}
 80002b0:       e7fb            b.n     80002aa <Reset+0x12>
 80002b2:       480c            ldr     r0, [pc, #48]   ; (80002e4 <Reset+0x4c>)
 80002b4:       490c            ldr     r1, [pc, #48]   ; (80002e8 <Reset+0x50>)
 80002b6:       4a0d            ldr     r2, [pc, #52]   ; (80002ec <Reset+0x54>)
 80002b8:       4281            cmp     r1, r0
 80002ba:       d002            beq.n   80002c2 <Reset+0x2a>
 80002bc:       ca08            ldmia   r2!, {r3}
 80002be:       c008            stmia   r0!, {r3}
 80002c0:       e7fa            b.n     80002b8 <Reset+0x20>
 80002c2:       480b            ldr     r0, [pc, #44]   ; (80002f0 <Reset+0x58>)
 80002c4:       f44f 0170       mov.w   r1, #15728640   ; 0xf00000
 80002c8:       6802            ldr     r2, [r0, #0]
 80002ca:       430a            orrs    r2, r1
 80002cc:       6002            str     r2, [r0, #0]
 80002ce:       f3bf 8f4f       dsb     sy
 80002d2:       f3bf 8f6f       isb     sy
 80002d6:       f001 f94b       bl      8001570 <main>

This means that it never even reaches the main function. I am currently looking for a tool to help me analyze the assembly, as I am too lazy to work out by hand what is going on. Thanks for the help!

mattico commented 2 years ago

That reset handler is from cortex-m-rt, which has the assembly pretty well commented: https://github.com/rust-embedded/cortex-m-rt/blob/e48af90b9bb4ce38762b1ee134c9b4991fd605b0/asm.S#L40-L79

Looks like you're stuck in the loop that zeroes the .bss section. Some suggestions:

  1. Check your linker script. Make sure the addresses and sizes are correct for your memories. (memory.x)
  2. Check your binaries. See how large your sections are and where the .bss start and end symbols are. (objdump, readelf, etc.)
  3. I vaguely remember one time that having breakpoints set made gdb execute instructions so slowly that reset would take a very long time to complete. (Not sure how long because I gave up waiting) Try running without any breakpoints set.
  4. Try building with --release to shrink the binary size (you can still enable debuginfo, as it's not copied to the target).
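
For suggestion 2, the start and end of .bss can also be pulled out of symbol-table text programmatically. This is only a sketch: it assumes input in the three-column form printed by nm (hex address, type letter, symbol name), find_symbol is a hypothetical helper, and the sample lines are made up for illustration rather than taken from this binary.

```rust
/// Look up a symbol's address in `nm`-style text, where each line is
/// "<hex address> <type letter> <name>". Lines with fewer than three
/// fields (for example undefined symbols) are skipped.
fn find_symbol(nm_output: &str, name: &str) -> Option<u32> {
    nm_output.lines().find_map(|line| {
        let mut fields = line.split_whitespace();
        let addr = fields.next()?;
        let _kind = fields.next()?;
        let sym = fields.next()?;
        if sym == name {
            u32::from_str_radix(addr, 16).ok()
        } else {
            None
        }
    })
}

fn main() {
    // Hypothetical sample input, not output captured from the thread.
    let sample = "20000000 D __sdata\n20000008 B __sbss\n20000014 B __ebss\n";

    let sbss = find_symbol(sample, "__sbss").expect("__sbss not found");
    let ebss = find_symbol(sample, "__ebss").expect("__ebss not found");

    println!(
        ".bss spans {:#010x}..{:#010x} ({} bytes)",
        sbss,
        ebss,
        ebss.wrapping_sub(sbss)
    );
}
```

An __ebss that is far away from __sbss, or in a different memory region, would be the thing to look for.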

matoushybl commented 2 years ago

Thanks for the suggestions, I will try them tomorrow. I am using the linker script provided by the crate, but I am now realizing that there is no build script that would actually tell lld to use the memory.x.

Given that there is no build script, is there a preferred way of running the examples?

cargo size reports the following memory mappings:

blinky  :
section                size        addr
.vector_table           664   0x8000000
.text                 52112   0x8000298
.rodata                3768   0x800ce30
.data                     8  0x20000000
.gnu.sgstubs              0   0x800dd00
.bss                     12  0x20000008
.sram4                    0  0x38000000
.sram3                    0  0x30040000
.axisram                  0  0x24000000
.uninit                   0  0x20000014

adamgreig commented 2 years ago

Since your sections are all in the right place, it seems the memory.x file is indeed being used -- usually a .cargo/config.toml file sets a RUSTFLAGS argument to tell the linker to use link.x, which is provided by cortex-m-rt, and link.x then includes memory.x. The linker searches the project directory for memory.x, so even without a build script it will usually find it, but sometimes a build script is used to copy memory.x into OUT_DIR, for example to select/customise it or because you're using a workspace where each crate needs a different memory.x.
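
A build script of the kind described above might look roughly like the following build.rs. Treat it as a sketch under assumptions: it expects memory.x to sit next to Cargo.toml, and stage_memory_x is a hypothetical helper name, not something provided by cortex-m-rt or the HAL.

```rust
use std::env;
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Copy `memory.x` from `src_dir` into `out_dir` and return the destination path.
fn stage_memory_x(src_dir: &Path, out_dir: &Path) -> io::Result<PathBuf> {
    let dest = out_dir.join("memory.x");
    fs::copy(src_dir.join("memory.x"), &dest)?;
    Ok(dest)
}

fn main() -> io::Result<()> {
    // Cargo sets both variables when it runs a build script.
    let manifest_dir =
        PathBuf::from(env::var_os("CARGO_MANIFEST_DIR").expect("CARGO_MANIFEST_DIR not set"));
    let out_dir = PathBuf::from(env::var_os("OUT_DIR").expect("OUT_DIR not set"));

    stage_memory_x(&manifest_dir, &out_dir)?;

    // Add OUT_DIR to the linker search path so the staged memory.x is found,
    // and re-run this script whenever the source memory.x changes.
    println!("cargo:rustc-link-search={}", out_dir.display());
    println!("cargo:rerun-if-changed=memory.x");
    Ok(())
}
```

Selecting between several memory.x variants would only change which source file is passed to the copy step.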

matoushybl commented 2 years ago

Thanks for the explanation, now I understand.

After some debugging I figured out what the problem is, but I do not know the correct fix.

When utilizing the memory.x script provided, the Reset assembly gets stuck in bss initialization because the __ebss is incorrectly set to address 0x24000000, which corresponds to the AXISRAM. That can be seen in the output of cargo size and in the corresponding r1 register.

blinky  :
section                size        addr
.vector_table           664   0x8000000
.text                 52060   0x8000298
.rodata                4088   0x800ce00
.data                     8  0x20000000
.gnu.sgstubs              0   0x800de00
.bss                     12  0x20000008
.axisram                  0  0x24000000
.sram3                    0  0x30040000
.sram4                    0  0x38000000
.uninit                   0  0x20000014

(gdb) info reg
r0             0x20000010          536870928
r1             0x24000000          603979776

When I used a simple memory.x script from embassy, the __ebss was correctly set to the start of the .uninit section, so the zeroing loop is short and the device boots without any problems.

MEMORY
{
    FLASH : ORIGIN = 0x8000000, LENGTH = 2097152
    RAM : ORIGIN = 0x24000000, LENGTH = 524288
}

blinky  :
section                size        addr
.vector_table           664   0x8000000
.text                 52060   0x8000298
.rodata                4088   0x800ce00
.data                     8  0x24000000
.gnu.sgstubs              0   0x800de00
.bss                     12  0x24000008
.uninit                   0  0x24000014

(gdb) info reg
r0             0x24000008          603979784
r1             0x24000014          603979796
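
To put numbers on the difference between the two layouts, here is a small model of the loop at 80002aa-80002b0 (cmp r1, r0; beq; stmia r0!, {r2}; b). It is a sketch rather than the cortex-m-rt code itself: stores_until_equal is a hypothetical helper, the inputs are the .bss start address from cargo size and the r0/r1 values shown by gdb, and it only counts iterations.

```rust
/// Number of 32-bit stores the loop `cmp r1, r0; beq; stmia r0!, {r2}; b`
/// performs before `r0 == r1`, given the initial register values.
/// Returns `None` when the distance is not a multiple of 4, because `r0`
/// advances in steps of 4 and the equality test would then never succeed.
fn stores_until_equal(r0: u32, r1: u32) -> Option<u32> {
    let distance = r1.wrapping_sub(r0);
    if distance % 4 == 0 {
        Some(distance / 4)
    } else {
        None
    }
}

fn main() {
    // HAL memory.x: .bss starts at 0x20000008 (cargo size), r1 = 0x24000000 (gdb).
    let hal = stores_until_equal(0x2000_0008, 0x2400_0000);
    // embassy memory.x: r0 = 0x24000008, r1 = 0x24000014 (gdb).
    let embassy = stores_until_equal(0x2400_0008, 0x2400_0014);

    println!("HAL memory.x:     {:?} word stores", hal);
    println!("embassy memory.x: {:?} word stores", embassy);
}
```

With the values above this comes to 16,777,214 stores for the HAL script against 3 for the embassy one. The model says nothing about what the hardware does once the stores run past the end of the RAM that .bss actually lives in.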

I am not skilled enough to know what the root of this problem is, but I believe that this is a bug, maybe in cortex-m-rt.

Thanks to everyone who helped me. If you've got any suggestions on how to solve this, or what to try, let me know and I can debug it further.

mattico commented 2 years ago

Yeah the linker script in cortex-m-rt has been really hard to get right, especially with __ebss etc. I've already forgotten all the detail around how this works, but I'd bet it's caused by the INSERT AFTER .bss. Can you try using the HAL's memory.x with that removed? I'm not sure why that's there to begin with.

mattico commented 2 years ago

I'm now also having issues when I don't remove INSERT AFTER .bss.

With cortex-m-rt v0.6.15 and INSERT AFTER .bss the link layout is:

  [Nr] Name              Type            Addr     Off    Size   ES Flg Lk Inf Al
  [ 0]                   NULL            00000000 000000 000000 00      0   0  0
  [ 1] .vector_table     PROGBITS        08000000 0001d4 000298 00   A  0   0  4
  [ 2] .text             PROGBITS        08000298 000470 026040 00  AX  0   0  8
  [ 3] .rodata           PROGBITS        080262d8 0264b0 002470 00  AM  0   0  4
  [ 4] .data             PROGBITS        20000000 028920 000358 00  WA  0   0  8
__sbss = 20000358
  [ 5] .bss              NOBITS          20000358 028c78 013770 00  WA  0   0  4
  [ 6] .axisram          NOBITS          24000000 028c78 080000 00  WA  0   0  8
  [ 7] .sram1            NOBITS          30000000 028c78 004d58 00  WA  0   0  4
  [ 8] .sram2            NOBITS          30020000 028c78 000064 00  WA  0   0  4
  [ 9] .sram3            NOBITS          30040000 028c78 000400 00  WA  0   0  4
  [10] .sram4            NOBITS          38000000 028c78 000000 00  WA  0   0  4
  [11] .bsram            NOBITS          38800000 028c78 000000 00  WA  0   0  4
__ebss = 20013ac8
  [12] .uninit           NOBITS          20013ac8 028c78 000698 00  WA  0   0  8

So because __ebss is inside the .bss section the memory sections aren't placed between __sbss and __ebss.

With cortex-m-rt v0.7.0 and INSERT AFTER .bss the link layout is:

  [Nr] Name              Type            Addr     Off    Size   ES Flg Lk Inf Al
  [ 0]                   NULL            00000000 000000 000000 00      0   0  0
  [ 1] .vector_table     PROGBITS        08000000 0001f4 000298 00   A  0   0  4
  [ 2] .text             PROGBITS        08000298 000490 02268c 00  AX  0   0  8
  [ 3] .rodata           PROGBITS        08022924 022b1c 001f68 00  AM  0   0  4
  [ 4] .data             PROGBITS        20000000 024a88 000358 00  WA  0   0  8
  [ 5] .gnu.sgstubs      PROGBITS        08024c00 024de0 000000 00  WA  0   0 32
__sbss = 20000358
  [ 6] .bss              NOBITS          20000358 024de0 013770 00  WA  0   0  4
  [ 7] .axisram          NOBITS          24000000 024de0 080000 00  WA  0   0  8
  [ 8] .sram1            NOBITS          30000000 024de0 004d58 00  WA  0   0  4
  [ 9] .sram2            NOBITS          30020000 024de0 000064 00  WA  0   0  4
  [10] .sram3            NOBITS          30040000 024de0 000400 00  WA  0   0  4
  [11] .sram4            NOBITS          38000000 024de0 000000 00  WA  0   0  4
  [12] .bsram            NOBITS          38800000 024de0 000000 00  WA  0   0  4
__ebss = 38800000
  [13] .uninit           NOBITS          20013ac8 024de0 000698 00  WA  0   0  8

Which obviously won't work.

Removing the INSERT AFTER .bss fixes __ebss. The section orders are a bit different but I don't think that affects anything. The program headers for the sram sections still say the type is LOAD and MemSiz>0 FileSiz=0, so flashing tools should still erase those regions of memory. Indeed llvm-size and probe-run include those sections in their size information, so I assume they zero them.

  [Nr] Name              Type            Addr     Off    Size   ES Flg Lk Inf Al
  [ 0]                   NULL            00000000 000000 000000 00      0   0  0
  [ 1] .axisram          NOBITS          24000000 0001d8 080000 00  WA  0   0  8
  [ 2] .sram1            NOBITS          30000000 0001d8 004d58 00  WA  0   0  4
  [ 3] .sram2            NOBITS          30020000 0001d8 000064 00  WA  0   0  4
  [ 4] .sram3            NOBITS          30040000 0001d8 000400 00  WA  0   0  4
  [ 5] .sram4            NOBITS          38000000 0001d8 000000 00  WA  0   0  4
  [ 6] .bsram            NOBITS          38800000 0001d8 000000 00  WA  0   0  4
  [ 7] .vector_table     PROGBITS        08000000 0001d8 000298 00   A  0   0  4
  [ 8] .text             PROGBITS        08000298 000470 02268c 00  AX  0   0  8
  [ 9] .rodata           PROGBITS        08022924 022afc 001f68 00  AM  0   0  4
  [10] .data             PROGBITS        20000000 024a68 000358 00  WA  0   0  8
  [11] .gnu.sgstubs      PROGBITS        08024c00 024dc0 000000 00  WA  0   0 32
__sbss = 20000358
  [12] .bss              NOBITS          20000358 024dc0 013770 00  WA  0   0  4
__ebss = 20013ac8
  [13] .uninit           NOBITS          20013ac8 024dc0 000698 00  WA  0   0  8

I think we should remove the INSERT AFTER .bss.

adamgreig commented 2 years ago

For more background see https://github.com/rust-embedded/cortex-m-rt/pull/287 and https://github.com/rust-embedded/cortex-m-rt/pull/323, but in short you're right: remove INSERT AFTER .bss for memory sections which are not contiguous with the main RAM section, as otherwise the BSS zeroing loop will try to zero the memory between them and fail.
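
The condition can be written down as a simple check that the range being zeroed stays inside one memory region. The sketch below is illustrative only: Region and contains_range are hypothetical names, the __sbss/__ebss values are the ones from the readelf listings earlier in the thread, and the 128 KiB length for the region at 0x20000000 is an assumption about the part's DTCM that should be confirmed against memory.x.

```rust
/// A memory region as declared in a linker script's MEMORY block.
#[derive(Debug, Clone, Copy)]
struct Region {
    origin: u32,
    length: u32,
}

impl Region {
    /// True when the half-open range `start..end` lies entirely inside the region.
    fn contains_range(&self, start: u32, end: u32) -> bool {
        // Use u64 so origin + length cannot overflow.
        let region_end = self.origin as u64 + self.length as u64;
        start <= end && start >= self.origin && (end as u64) <= region_end
    }
}

fn main() {
    // Assumed region: 128 KiB at 0x20000000 (check memory.x for the real value).
    let dtcm = Region { origin: 0x2000_0000, length: 128 * 1024 };

    // cortex-m-rt v0.6.15 with INSERT AFTER .bss: __ebss = 0x20013ac8.
    let v0_6_15 = dtcm.contains_range(0x2000_0358, 0x2001_3ac8);
    // cortex-m-rt v0.7.0 with INSERT AFTER .bss: __ebss = 0x38800000.
    let v0_7_0 = dtcm.contains_range(0x2000_0358, 0x3880_0000);

    println!("v0.6.15 zeroing range inside region: {}", v0_6_15);
    println!("v0.7.0  zeroing range inside region: {}", v0_7_0);
}
```

A check like this could run on the host against the linked ELF's symbols, but where to hook it in is left open here.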