apache / nuttx

Apache NuttX is a mature, real-time embedded operating system (RTOS)
https://nuttx.apache.org/
Apache License 2.0

[BUG] ELF loader on ESP32-S3 broken after #14100 #14487

Open tmedicci opened 1 week ago

tmedicci commented 1 week ago

Description / Steps to reproduce the issue

After #14100 was merged, the ELF loader on ESP32-S3 is broken:

Build steps:

make -j distclean && ./tools/configure.sh esp32s3-devkit:elf && make flash ESPTOOL_PORT=/dev/ttyUSB0 -s -j$(nproc) && minicom -D /dev/ttyUSB0

Then run the elf example:

nsh> elf
Initial memory usage: 40796
elf_main: Registering romdisk at /dev/ram0
elf_main: Mounting ROMFS filesystem at target=/mnt/elf/romfs with source=/dev/ram0
testheader: 
****************************************************************************
* Executing errno
****************************************************************************

xtensa_user_panic: User Exception: EXCCAUSE=0003 task: elf
dump_assert_info: Current Version: NuttX  10.4.0 b503b323ce Oct 23 2024 15:40:55 xtensa
dump_assert_info: Assertion failed user panic: at file: common/xtensa_assert.c:180 task: elf process: elf 0x42047d94
up_dump_register:    PC: 40056fa1    PS: 00060730
up_dump_register:    A0: 8203fd9e    A1: 3fc96f50    A2: 40387d50    A3: 3c02135e
up_dump_register:    A4: 000000ab    A5: 40387df8    A6: 00001d00    A7: 08e0ffe0
up_dump_register:    A8: 00000000    A9: 00019c00   A10: 00000000   A11: 3fc96f10
up_dump_register:   A12: 00060520   A13: 00060520   A14: 00000040   A15: 00000000
up_dump_register:   SAR: 00000020 CAUSE: 00000003 VADDR: 40387df8
up_dump_register:  LBEG: 40056f5c  LEND: 40056f72  LCNT: 00000000
dump_stackinfo: User Stack:
dump_stackinfo:   base: 0x3fc96a90
dump_stackinfo:   size: 00002000
dump_stackinfo:     sp: 0x3fc96f50
stack_dump: 0x3fc96f30: 00000000 00000000 00000000 00000000 8203e7ea 3fc96f60 00000000 40387d50
stack_dump: 0x3fc96f50: 8203e80e 3fc96f90 fffffff7 40387d50 00000000 3fc972ac 000000ab 00000024
stack_dump: 0x3fc96f70: 000000ab 3fc972a0 3fc969a0 3fc97848 8203e82c 3fc96fb0 00000000 40387d50
stack_dump: 0x3fc96f90: 000000ab fffffffc 3fc972ac 3fc972ac 82045b35 3fc96fe0 00000003 40387d50
stack_dump: 0x3fc96fb0: 3fc969a0 3fc96fe0 3fc97060 3fc97870 000000ab 00000008 3fc97848 00000000
stack_dump: 0x3fc96fd0: 820459f0 3fc97000 3fc97060 40387d50 000000ab 00000008 00000000 3fc96648
stack_dump: 0x3fc96ff0: 8204256a 3fc97020 3fc97060 40387d50 000000ab 40387d50 000000ab 00000009
stack_dump: 0x3fc97010: 820481ba 3fc97060 3fc977c0 00000000 3fc97e00 40387d50 00000000 00000000
stack_dump: 0x3fc97030: 00000000 00000000 00000200 00000015 00000001 3fc97024 3fc97898 00000001
stack_dump: 0x3fc97050: 82048062 3fc97120 3fc977c0 3fc8b08c 40387d50 3fc97e00 000000ac 00000094
stack_dump: 0x3fc97070: 00000004 00000001 00002850 00000000 00000000 0000816d 464c457f 00010101
stack_dump: 0x3fc97090: 00000000 00000000 005e0001 00000001 00000030 00000000 00002378 00000300
stack_dump: 0x3fc970b0: 00000034 00280000 001e001f 00000000 3fc97870 00000000 00000000 00000000
stack_dump: 0x3fc970d0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
stack_dump: 0x3fc970f0: 00000003 00000000 2a2a0a0a 2a2a2a2a 3c029770 00000021 00000000 00000000
stack_dump: 0x3fc97110: 8204811a 3fc97150 3fc8b08c 3fc977c0 00000064 3fc97150 3fc8b08c 3fc971f0
stack_dump: 0x3fc97130: 3c029770 00000021 3fc8c7c4 3fc972ac 82047ece 3fc97190 3fc8b08c 3fc971f0
stack_dump: 0x3fc97150: 00000014 420357a8 42035764 42059b50 3fc971f0 00000000 00000000 00000021
stack_dump: 0x3fc97170: fffffff4 3c029770 00000000 00000000 820347d1 3fc971c0 3fc8b08c 3fc8c800
stack_dump: 0x3fc97190: 00000000 00000000 0000a5f0 00000110 00000000 3c029770 00000021 3c007643
stack_dump: 0x3fc971b0: 82033000 3fc97220 42047d94 00000001 7665642f 6d61722f 00000030 00057bc8
stack_dump: 0x3fc971d0: 00009f5c 00057bc8 0000a0c0 00000000 00000000 00000110 00000200 3c007690
stack_dump: 0x3fc971f0: 3c0296a4 00000000 00000000 00000000 3c00763a 3c007520 00060622 00000000
stack_dump: 0x3fc97210: 00000000 3fc97240 00000000 42047d94 3fc96a80 3fc89b38 00000000 3fc89b38
stack_dump: 0x3fc97230: 00000000 3fc97260 00000000 00000000 00000000 00000000 00000000 00000000
stack_dump: 0x3fc97250: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
dump_tasks:    PID GROUP PRI POLICY   TYPE    NPX STATE   EVENT      SIGMASK          STACKBASE  STACKSIZE      USED   FILLED    COMMAND
dump_tasks:   ----   --- --- -------- ------- --- ------- ---------- ---------------- 0x3fc8bd90      2048      1040    50.7%    irq
dump_task:       0     0   0 FIFO     Kthread -   Ready              0000000000000000 0x3fc8b1b0      3040       672    22.1%    Idle_Task
dump_task:       1     0 224 RR       Kthread -   Waiting Semaphore  0000000000000000 0x3fc8d6e8      1960       608    31.0%    hpwork 0x3fc8c5cc 0x3f0
dump_task:       2     0 100 RR       Kthread -   Waiting Semaphore  0000000000000000 0x3fc8dfb8      1960       608    31.0%    lpwork 0x3fc8c594 0x3f8
dump_task:       3     3 100 RR       Task    -   Waiting Semaphore  0000000000000000 0x3fc8ebc8     30408      1872     6.1%    nsh_main
dump_task:       4     4 100 RR       Task    -   Running            0000000000000000 0x3fc96a90      2000      1296    64.8%    elf
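
To make the register dump above easier to read, the addresses can be sorted into memory regions. The sketch below is only a reading aid: the address ranges are recalled from the ESP32-S3 memory map and are assumptions that should be checked against the Technical Reference Manual, and `REGIONS` and `region()` are hypothetical names introduced for illustration.

```python
# Address ranges are an ASSUMPTION recalled from the ESP32-S3 memory map;
# verify them against the Technical Reference Manual before relying on them.
REGIONS = [
    (0x3C000000, 0x3DFFFFFF, "external memory, data bus (cached)"),
    (0x3FC88000, 0x3FCEFFFF, "internal SRAM1, data bus"),
    (0x40000000, 0x4005FFFF, "internal ROM, instruction bus"),
    (0x40370000, 0x40377FFF, "internal SRAM0, instruction bus"),
    (0x40378000, 0x403DFFFF, "internal SRAM1, instruction bus"),
    (0x42000000, 0x43FFFFFF, "external memory, instruction bus (cached)"),
]


def region(addr):
    """Return the name of the region containing addr, or 'unknown'."""
    for lo, hi, name in REGIONS:
        if lo <= addr <= hi:
            return name
    return "unknown"


# Values copied from the register and stack dump above.
dump = {
    "PC": 0x40056FA1,
    "VADDR": 0x40387DF8,
    "A3": 0x3C02135E,
    "SP": 0x3FC96F50,
}

for reg, value in dump.items():
    print(f"{reg:>5} = 0x{value:08x} -> {region(value)}")
```

Under these assumed ranges, the faulting PC lies in internal ROM rather than in the NuttX flash text, and the faulting address (VADDR) lies in internal SRAM as seen through the instruction bus.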

On which OS does this issue occur?

[OS: Linux]

What is the version of your OS?

Manjaro, Ubuntu

NuttX Version

master

Issue Architecture

[Arch: xtensa]

Issue Area

[Area: Kernel]

acassis commented 1 week ago

sim:elf does not crash, but I noticed that about 1KB of RAM is not returned to the system. It could be related to the memory used by the RAMDISK.

nsh> free
      total       used       free    maxused    maxfree  nused  nfree name
   67108856    1258968   65849888    1259648   65849888     32      1 Umem

nsh> elf
Initial memory usage: 1326760
Registering romdisk at /dev/ram6
...

nsh> free
      total       used       free    maxused    maxfree  nused  nfree name
   67108856        712   67108144       4912   67108144      2      1 textheap
   67108856    1260616   65848240    1534816   65710784     51      4 Umem
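
The size of the unreturned memory can be checked from the two `free` snapshots above. The following is a minimal sketch that compares the `used` column of the `Umem` row before and after running elf; `umem_used` is a hypothetical helper name, and the parsing assumes the column layout shown in this thread.

```python
def umem_used(free_output):
    """Return the 'used' value of the Umem row from NSH 'free' output."""
    for line in free_output.splitlines():
        fields = line.split()
        # Data rows look like: total used free maxused maxfree nused nfree name
        if fields and fields[-1] == "Umem":
            return int(fields[1])
    raise ValueError("no Umem row found")


before = """\
      total       used       free    maxused    maxfree  nused  nfree name
   67108856    1258968   65849888    1259648   65849888     32      1 Umem
"""

after = """\
      total       used       free    maxused    maxfree  nused  nfree name
   67108856        712   67108144       4912   67108144      2      1 textheap
   67108856    1260616   65848240    1534816   65710784     51      4 Umem
"""

delta = umem_used(after) - umem_used(before)
print(f"Umem used grew by {delta} bytes")
```

For these two snapshots the difference is 1648 bytes. Whether that memory belongs to the RAMDISK is not established by the numbers alone.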

anjiahao1 commented 1 week ago

Could you upload the ESP32-S3 ELF firmware that you compiled? Thank you, @tmedicci.

tmedicci commented 1 week ago

Of course:

nuttx.zip

xiaoxiang781216 commented 2 days ago

@anjiahao1 what's the result?

anjiahao1 commented 1 day ago

@tmedicci Sorry, could you enable the DEBUG_SYMBOLS option, build and flash it (make -j20 flash), and then run elf? Please send me the crash log and I will analyze the problem. I can get an ESP32-S3 board on Friday at the earliest.

tmedicci commented 1 day ago

Hi @anjiahao1 ,

The log with DEBUG_SYMBOLS is pretty much the same:

ESP-ROM:esp32s3-20210327
Build:Mar 27 2021
rst:0x1 (POWERON),boot:0x8 (SPI_FAST_FLASH_BOOT)
SPIWP:0xee
mode:DIO, clock div:2
load:0x3fc8bc90,len:0x1574
load:0x40374000,len:0x57fc
SHA-256 comparison failed:
Calculated: 8a375cf03f84198ee7042315f9b32d4c266cf64547ca5938b694392cc0722be5
Expected: 0000000050920000000000000000000000000000000000000000000000000000
Attempting to boot anyway...
entry 0x40374f14
*** Booting NuttX ***
I (53) boot: chip revision: v0.1
I (54) boot.esp32s3: Boot SPI Speed : 40MHz
I (54) boot.esp32s3: SPI Mode       : DIO
I (57) boot.esp32s3: SPI Flash Size : 8MB
I (62) boot: Enabling RNG early entropy source...
dram: lma 0x00000020 vma 0x3fc8bc90 len 0x1574   (5492)
iram: lma 0x0000159c vma 0x40374000 len 0x57fc   (22524)
padd: lma 0x00006da8 vma 0x00000000 len 0x9250   (37456)
imap: lma 0x00010000 vma 0x42030000 len 0x2ab50  (174928)
padd: lma 0x0003ab58 vma 0x00000000 len 0x54c0   (21696)
dmap: lma 0x00040020 vma 0x3c000020 len 0x2edac  (191916)
total segments stored 6
AB
NuttShell (NSH) NuttX-10.4.0
nsh> elf
Initial memory usage: 40796
elf_main: Registering romdisk at /dev/ram0
elf_main: Mounting ROMFS filesystem at target=/mnt/elf/romfs with source=/dev/ram0
testheader: 
****************************************************************************
* Executing errno
****************************************************************************

xtensa_user_panic: User Exception: EXCCAUSE=0003 task: elf
dump_assert_info: Current Version: NuttX  10.4.0 b503b323ce Oct 30 2024 09:42:09 xtensa
dump_assert_info: Assertion failed user panic: at file: common/xtensa_assert.c:180 task: elf process: elf 0x42047c20
up_dump_register:    PC: 40056fa1    PS: 00060730
up_dump_register:    A0: 8203fcae    A1: 3fc96e50    A2: 40387c50    A3: 3c0212de
up_dump_register:    A4: 000000ab    A5: 40387cf8    A6: 00001d00    A7: 08e0ffe0
up_dump_register:    A8: 00000000    A9: 00019c00   A10: 00000000   A11: 3fc96e10
up_dump_register:   A12: 00060520   A13: 00060520   A14: 00000040   A15: 00000000
up_dump_register:   SAR: 00000020 CAUSE: 00000003 VADDR: 40387cf8
up_dump_register:  LBEG: 40056f5c  LEND: 40056f72  LCNT: 00000000
dump_stackinfo: User Stack:
dump_stackinfo:   base: 0x3fc96990
dump_stackinfo:   size: 00002000
dump_stackinfo:     sp: 0x3fc96e50
stack_dump: 0x3fc96e30: 00000000 00000000 00000000 00000000 8203e6fa 3fc96e60 00000000 40387c50
stack_dump: 0x3fc96e50: 8203e71e 3fc96e90 fffffff7 40387c50 00000000 3fc971ac 000000ab 00000024
stack_dump: 0x3fc96e70: 000000ab 3fc971a0 3fc968a0 3fc97748 8203e73c 3fc96eb0 00000000 40387c50
stack_dump: 0x3fc96e90: 000000ab fffffffc 3fc971ac 3fc971ac 82045a31 3fc96ee0 00000003 40387c50
stack_dump: 0x3fc96eb0: 3fc968a0 3fc96ee0 3fc96f60 3fc97770 000000ab 00000008 3fc97748 00000000
stack_dump: 0x3fc96ed0: 820458ec 3fc96f00 3fc96f60 40387c50 000000ab 00000008 00000000 3fc96548
stack_dump: 0x3fc96ef0: 82042466 3fc96f20 3fc96f60 40387c50 000000ab 40387c50 000000ab 00000009
stack_dump: 0x3fc96f10: 82048046 3fc96f60 3fc976c0 00000000 3fc97d00 40387c50 00000000 00000000
stack_dump: 0x3fc96f30: 00000000 00000000 00000200 00000015 00000001 3fc96f24 3fc97798 00000001
stack_dump: 0x3fc96f50: 82047eee 3fc97020 3fc976c0 3fc8af8c 40387c50 3fc97d00 000000ac 00000094
stack_dump: 0x3fc96f70: 00000004 00000001 00002850 00000000 00000000 0000816d 464c457f 00010101
stack_dump: 0x3fc96f90: 00000000 00000000 005e0001 00000001 00000030 00000000 00002378 00000300
stack_dump: 0x3fc96fb0: 00000034 00280000 001e001f 00000000 3fc97770 00000000 00000000 00000000
stack_dump: 0x3fc96fd0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
stack_dump: 0x3fc96ff0: 00000003 00000000 2a2a0a0a 2a2a2a2a 3c0296f0 00000021 00000000 00000000
stack_dump: 0x3fc97010: 82047fa6 3fc97050 3fc8af8c 3fc976c0 00000064 3fc97050 3fc8af8c 3fc970f0
stack_dump: 0x3fc97030: 3c0296f0 00000021 3fc8c6c4 3fc971ac 82047d5a 3fc97090 3fc8af8c 3fc970f0
stack_dump: 0x3fc97050: 00000014 420357a0 4203575c 420599dc 3fc970f0 00000000 00000000 00000021
stack_dump: 0x3fc97070: fffffff4 3c0296f0 00000000 00000000 820347c9 3fc970c0 3fc8af8c 3fc8c700
stack_dump: 0x3fc97090: 00000000 00000000 0000a5f0 00000110 00000000 3c0296f0 00000021 3c0075c3
stack_dump: 0x3fc970b0: 82032ff8 3fc97120 42047c20 00000001 7665642f 6d61722f 00000030 00057cc8
stack_dump: 0x3fc970d0: 00009f5c 00057cc8 0000a0c0 00000000 00000000 00000110 00000200 3c007610
stack_dump: 0x3fc970f0: 3c029624 00000000 00000000 00000000 3c0075ba 3c0074a0 00060622 00000000
stack_dump: 0x3fc97110: 00000000 3fc97140 00000000 42047c20 3fc96980 3fc89a38 00000000 3fc89a38
stack_dump: 0x3fc97130: 00000000 3fc97160 00000000 00000000 00000000 00000000 00000000 00000000
stack_dump: 0x3fc97150: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
dump_tasks:    PID GROUP PRI POLICY   TYPE    NPX STATE   EVENT      SIGMASK          STACKBASE  STACKSIZE   COMMAND
dump_tasks:   ----   --- --- -------- ------- --- ------- ---------- ---------------- 0x3fc8bc90      2048   irq
dump_task:       0     0   0 FIFO     Kthread -   Ready              0000000000000000 0x3fc8b0b0      3040   Idle_Task
dump_task:       1     0 224 RR       Kthread -   Waiting Semaphore  0000000000000000 0x3fc8d5e8      1960   hpwork 0x3fc8c4cc 0x3fc8c4f0
dump_task:       2     0 100 RR       Kthread -   Waiting Semaphore  0000000000000000 0x3fc8deb8      1960   lpwork 0x3fc8c494 0x3fc8c4b8
dump_task:       3     3 100 RR       Task    -   Waiting Semaphore  0000000000000000 0x3fc8eac8     30408   nsh_main
dump_task:       4     4 100 RR       Task    -   Running            0000000000000000 0x3fc96990      2000   elf
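
Without a debugger, the stack dump above can still give a rough list of call sites. The sketch below is a heuristic, not a real unwinder. It assumes the Xtensa windowed ABI, where the top two bits of a saved return address encode the call window size, and it assumes the callers used call8 and live in the 0x42000000 to 0x43FFFFFF flash-mapped code range. The function names are hypothetical.

```python
# ASSUMPTION: flash-mapped code range on ESP32-S3; verify against the TRM.
TEXT_LO, TEXT_HI = 0x42000000, 0x43FFFFFF


def decode_return_address(a0, region_bits=0x40000000):
    """Replace the window-size bits (31:30) with the code region bits."""
    return (a0 & 0x3FFFFFFF) | region_bits


def candidate_return_addresses(lines):
    """Collect words from stack_dump lines that look like call8 returns."""
    found = []
    for line in lines:
        if not line.startswith("stack_dump:"):
            continue
        # Skip the "stack_dump:" tag and the row address.
        for word in line.split()[2:]:
            value = int(word, 16)
            if value >> 30 != 2:  # ASSUMPTION: only call8 frames are kept
                continue
            addr = decode_return_address(value)
            if TEXT_LO <= addr <= TEXT_HI:
                found.append(addr)
    return found


# Two rows copied from the dump above.
sample = [
    "stack_dump: 0x3fc96e30: 00000000 00000000 00000000 00000000 8203e6fa 3fc96e60 00000000 40387c50",
    "stack_dump: 0x3fc96e50: 8203e71e 3fc96e90 fffffff7 40387c50 00000000 3fc971ac 000000ab 00000024",
]

for addr in candidate_return_addresses(sample):
    print(f"0x{addr:08x}")
```

The printed addresses can then be resolved against the nuttx ELF with the toolchain's addr2line. Some matches may be stale stack contents rather than live frames.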

Please find the firmware built with it enabled: nuttx.zip

Thanks!

anjiahao1 commented 1 day ago

Hello @tmedicci, I used gdbserver.py to analyze the log and found that the files in the crash backtrace do not match the mainline code. Are you using the latest code? The crash appears to be caused by a buffer being copied incorrectly during memcpy.

Steps: save the following dump

up_dump_register:    PC: 40056fa1    PS: 00060730
up_dump_register:    A0: 8203fcae    A1: 3fc96e50    A2: 40387c50    A3: 3c0212de
up_dump_register:    A4: 000000ab    A5: 40387cf8    A6: 00001d00    A7: 08e0ffe0
up_dump_register:    A8: 00000000    A9: 00019c00   A10: 00000000   A11: 3fc96e10
up_dump_register:   A12: 00060520   A13: 00060520   A14: 00000040   A15: 00000000
up_dump_register:   SAR: 00000020 CAUSE: 00000003 VADDR: 40387cf8
up_dump_register:  LBEG: 40056f5c  LEND: 40056f72  LCNT: 00000000
dump_stackinfo: User Stack:
dump_stackinfo:   base: 0x3fc96990
dump_stackinfo:   size: 00002000
dump_stackinfo:     sp: 0x3fc96e50
stack_dump: 0x3fc96e30: 00000000 00000000 00000000 00000000 8203e6fa 3fc96e60 00000000 40387c50
stack_dump: 0x3fc96e50: 8203e71e 3fc96e90 fffffff7 40387c50 00000000 3fc971ac 000000ab 00000024
stack_dump: 0x3fc96e70: 000000ab 3fc971a0 3fc968a0 3fc97748 8203e73c 3fc96eb0 00000000 40387c50
stack_dump: 0x3fc96e90: 000000ab fffffffc 3fc971ac 3fc971ac 82045a31 3fc96ee0 00000003 40387c50
stack_dump: 0x3fc96eb0: 3fc968a0 3fc96ee0 3fc96f60 3fc97770 000000ab 00000008 3fc97748 00000000
stack_dump: 0x3fc96ed0: 820458ec 3fc96f00 3fc96f60 40387c50 000000ab 00000008 00000000 3fc96548
stack_dump: 0x3fc96ef0: 82042466 3fc96f20 3fc96f60 40387c50 000000ab 40387c50 000000ab 00000009
stack_dump: 0x3fc96f10: 82048046 3fc96f60 3fc976c0 00000000 3fc97d00 40387c50 00000000 00000000
stack_dump: 0x3fc96f30: 00000000 00000000 00000200 00000015 00000001 3fc96f24 3fc97798 00000001
stack_dump: 0x3fc96f50: 82047eee 3fc97020 3fc976c0 3fc8af8c 40387c50 3fc97d00 000000ac 00000094
stack_dump: 0x3fc96f70: 00000004 00000001 00002850 00000000 00000000 0000816d 464c457f 00010101
stack_dump: 0x3fc96f90: 00000000 00000000 005e0001 00000001 00000030 00000000 00002378 00000300
stack_dump: 0x3fc96fb0: 00000034 00280000 001e001f 00000000 3fc97770 00000000 00000000 00000000
stack_dump: 0x3fc96fd0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
stack_dump: 0x3fc96ff0: 00000003 00000000 2a2a0a0a 2a2a2a2a 3c0296f0 00000021 00000000 00000000
stack_dump: 0x3fc97010: 82047fa6 3fc97050 3fc8af8c 3fc976c0 00000064 3fc97050 3fc8af8c 3fc970f0
stack_dump: 0x3fc97030: 3c0296f0 00000021 3fc8c6c4 3fc971ac 82047d5a 3fc97090 3fc8af8c 3fc970f0
stack_dump: 0x3fc97050: 00000014 420357a0 4203575c 420599dc 3fc970f0 00000000 00000000 00000021
stack_dump: 0x3fc97070: fffffff4 3c0296f0 00000000 00000000 820347c9 3fc970c0 3fc8af8c 3fc8c700
stack_dump: 0x3fc97090: 00000000 00000000 0000a5f0 00000110 00000000 3c0296f0 00000021 3c0075c3
stack_dump: 0x3fc970b0: 82032ff8 3fc97120 42047c20 00000001 7665642f 6d61722f 00000030 00057cc8
stack_dump: 0x3fc970d0: 00009f5c 00057cc8 0000a0c0 00000000 00000000 00000110 00000200 3c007610
stack_dump: 0x3fc970f0: 3c029624 00000000 00000000 00000000 3c0075ba 3c0074a0 00060622 00000000
stack_dump: 0x3fc97110: 00000000 3fc97140 00000000 42047c20 3fc96980 3fc89a38 00000000 3fc89a38
stack_dump: 0x3fc97130: 00000000 3fc97160 00000000 00000000 00000000 00000000 00000000 00000000
stack_dump: 0x3fc97150: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000

as log.txt

and then run

/tools/gdbserver.py -a esp32s3 -e nuttx -l log.txt -p 1235
xtensa-esp32s3-elf-gdb nuttx -ex "target remote :1235"

This gives: [image]

The crash is in memcpy: [image]
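
The register dump itself can be cross-checked against the memcpy reading. The sketch below parses the `up_dump_register` lines and compares the fault address with the argument registers. Treating A2, A3 and A4 as destination, source and length is an assumption, since the ROM routine may have modified them before the fault. EXCCAUSE 3 is, as far as I recall, LoadStoreError in the Xtensa exception cause list. `parse_registers` is a hypothetical helper.

```python
import re

# Lines copied from the register dump in this comment.
LOG = """\
up_dump_register:    PC: 40056fa1    PS: 00060730
up_dump_register:    A0: 8203fcae    A1: 3fc96e50    A2: 40387c50    A3: 3c0212de
up_dump_register:    A4: 000000ab    A5: 40387cf8    A6: 00001d00    A7: 08e0ffe0
up_dump_register:   SAR: 00000020 CAUSE: 00000003 VADDR: 40387cf8
"""


def parse_registers(text):
    """Map register names to integer values from up_dump_register lines."""
    regs = {}
    for line in text.splitlines():
        if not line.startswith("up_dump_register:"):
            continue
        body = line.split(":", 1)[1]
        for name, value in re.findall(r"(\w+):\s*([0-9a-fA-F]{8})", body):
            regs[name] = int(value, 16)
    return regs


regs = parse_registers(LOG)

# ASSUMPTION: A2 = destination and A4 = length of the copy in progress.
offset = regs["VADDR"] - regs["A2"]
remaining = regs["A4"] - offset

print(f"fault address equals A5: {regs['VADDR'] == regs['A5']}")
print(f"fault offset from A2:    {offset} bytes")
print(f"bytes left of A4:        {remaining}")
```

With these values the fault address equals A5 and lies 168 bytes past A2, which leaves 3 of the 171 bytes in A4. That would be consistent with the copy failing while writing its final bytes, but it is only one reading of the dump and the thread does not confirm a root cause.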

tmedicci commented 1 day ago

> I used gdbserver.py to analyze and found that the files in the backtrace of the crash are inconsistent with the main thread? Are you using the latest code?

No, I'm using the version just after those commits were added (not the newest master, although our CI shows it keeps crashing).

> The reason for the crash should be that the buffer was copied incorrectly during memcpy

Any idea why the copy failed after adding these commits?

anjiahao1 commented 1 day ago

>> I used gdbserver.py to analyze and found that the files in the backtrace of the crash are inconsistent with the main thread? Are you using the latest code?
>
> No, I'm using the version just after those commits were added (not the newest master, although our CI shows it keeps crashing).
>
>> The reason for the crash should be that the buffer was copied incorrectly during memcpy
>
> Any idea why the copy failed after adding these commits?

OK, I will fix this problem as soon as I get the ESP32-S3 hardware tomorrow.