thesofproject / rimage

DSP firmware image creation and signing tool

Error status 34304, crash in function: man_copy_sram #115

Closed pjdobrowolski closed 1 year ago

pjdobrowolski commented 1 year ago

Rimage is crashing during zephyr.ri signing

Stack trace:
Frame        Function    Args
000FFFFA1E0  00180062C27 (000FFFFA3E8, 00000000002, 00000000002, 000FFFFDE50)
00000000000  00180064CE5 (00000000064, 00000000000, 000000002D0, 00000000000)
000FFFFA8F0  00180134508 (00000000005, 00000000000, 0000080A250, 00000000000)
00000000041  0018012FC3B (00000000000, 00000000000, 00000000000, 000007DF790)
000FFFFACE0  00180130045 (0018026E458, 0018013636E, 000000000B7, 0018026E3DF)
000FFFFACE0  00180216FC8 (0010041A6A8, 0010041A6E8, 000000000B7, 0018026E3DF)
000FFFFACE0  001800434C7 (000FFFFB088, 6FFFFFD320B8, 0080006DC30, 455341420003E04C)
000FFFFADA0  001004055A5 (0080006DC30, 000FFFFB088, 6FFFFFD320B8, A00DA0000000000A)
000FFFFADF0  001004057E9 (0080006DC30, 000FFFFB088, 6FFFFFD320B8, 0000000000A)
000FFFFAE50  001004060C5 (000FFFFB088, 6FFFFFD320B8, 0018019335B, 000FFFFB088)
000FFFFAF00  00100406951 (6FFFFFD32010, 00000008000, 00000000008, 00800049E98)
000FFFFAF50  001004080AE (00800000001, 000FFFFCC4F, 00000000000, 00000000000)
000FFFFAFF0  0010040B9AF (000FFFFCAD0, 000FFFFCB30, 00800049B87, 00000000000)
000FFFFCD30  00180049B91 (00000000000, 00000000000, 00000000000, 00000000000)
000FFFFFFF0  00180047716 (00000000000, 00000000000, 00000000000, 00000000000)
000FFFFFFF0  001800477C4 (00000000000, 00000000000, 00000000000, 00000000000)
End of stack trace

Stack trace:
Frame        Function    Args
000FFFFA1E0  00180062C27 (000FFFFA3E8, 00000000002, 00000000002, 000FFFFDE50)
00000000000  00180064CE5 (00000000064, 00000000000, 000000002CC, 00000000000)
000FFFFA8F0  00180134508 (0000000003A, 00000000000, 000006CEC50, 00000000000)
00000000041  0018012FC3B (00000000000, 00000000000, 00000000000, 000006C7050)
000FFFFACE0  00180130045 (0018026E458, 0018013636E, 000000000B7, 0018026E3DF)
000FFFFACE0  00180216FC8 (0010041A6A8, 0010041A6E8, 000000000B7, 0018026E3DF)
000FFFFACE0  001800434C7 (000FFFFB088, 6FFFFFD320B8, 0080006DC30, 455341420003200C)
000FFFFADA0  001004055A5 (0080006DC30, 000FFFFB088, 6FFFFFD320B8, A00C20000000000A)
000FFFFADF0  001004057E9 (0080006DC30, 000FFFFB088, 6FFFFFD320B8, 0000000000A)
000FFFFAE50  001004060C5 (000FFFFB088, 6FFFFFD320B8, 0018019335B, 000FFFFB088)
000FFFFAF00  00100406951 (6FFFFFD32010, 00000008000, 00000000008, 00800049E98)
000FFFFAF50  001004080AE (00800000001, 000FFFFCC4F, 00000000000, 00000000000)
000FFFFAFF0  0010040B9AF (000FFFFCAD0, 000FFFFCB30, 00800049B87, 00000000000)
000FFFFCD30  00180049B91 (00000000000, 00000000000, 00000000000, 00000000000)
000FFFFFFF0  00180047716 (00000000000, 00000000000, 00000000000, 00000000000)
000FFFFFFF0  001800477C4 (00000000000, 00000000000, 00000000000, 00000000000)
End of stack trace

assertion "(uint64_t)offset + section->size <= image->adsp->image_size" failed: file "C:/Users/paweldox/zephyrproject/sof/rimage/src/manifest.c", line 183, function: man_copy_sram
      0 [main] rimage 1305 cygwin_exception::open_stackdumpfile: Dumping stack trace to rimage.exe.stackdump
FATAL ERROR: command exited with status 34304: C:/Users/paweldox/zephyrproject/build-rimage/rimage.exe -k C:/Users/paweldox/keys/mtl_private_key.pem -o 'C:\Users\paweldox\zephyrproject\build-mtl\zephyr\zephyr.ri' -c 'C:\Users\paweldox\zephyrproject\sof\rimage\config\mtl.toml' -e 'C:\Users\paweldox\zephyrproject\build-mtl\zephyr\boot.mod' 'C:\Users\paweldox\zephyrproject\build-mtl\zephyr\main.mod'

assertion "(uint64_t)offset + section->size <= image->adsp->image_size" failed: file "C:/Users/paweldox/zephyrproject/sof/rimage/src/manifest.c", line 183, function: man_copy_sram
      0 [main] rimage 691 cygwin_exception::open_stackdumpfile: Dumping stack trace to rimage.exe.stackdump
FATAL ERROR: command exited with status 34304: C:/Users/paweldox/zephyrproject/build-rimage/rimage.exe -k C:/Users/paweldox/keys/mtl_private_key.pem -o 'C:\Users\paweldox\zephyrproject\build-mtl\zephyr\zephyr.ri' -c 'C:\Users\paweldox\zephyrproject\sof\rimage\config\mtl.toml' -e 'C:\Users\paweldox\zephyrproject\build-mtl\zephyr\boot.mod' 'C:\Users\paweldox\zephyrproject\build-mtl\zephyr\main.mod'
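A note on the status value itself: 34304 equals 134 × 256, and 134 equals 128 + 6. Signal 6 is SIGABRT on Linux and Cygwin, so the number is consistent with a failed assert() ending in abort(). How the MSYS2 runtime and west actually encode the status is an assumption here, not something the log confirms. The helper names in this C sketch are hypothetical and only show the arithmetic:

```c
#include <stdio.h>

/* Hypothetical helper: high byte of a 16-bit wait-style status word. */
int status_high_byte(int status)
{
    return (status >> 8) & 0xff;
}

/*
 * Hypothetical helper: shells conventionally report "terminated by
 * signal N" as exit code 128 + N. Returns N, or 0 if code <= 128.
 */
int signal_from_exit_code(int code)
{
    return code > 128 ? code - 128 : 0;
}

/* Print the decomposition of a status value. */
void describe_status(int status)
{
    int code = status_high_byte(status);
    int sig = signal_from_exit_code(code);

    printf("status %d -> exit code %d", status, code);
    if (sig != 0) {
        /* 6 is SIGABRT on Linux and Cygwin (assumption for MSYS2). */
        printf(" (128 + signal %d%s)", sig, sig == 6 ? ", SIGABRT" : "");
    }
    printf("\n");
}
```

Calling `describe_status(34304)` walks through the same decomposition for the value reported above.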

rimage revision: 3ee717eebc6a2a512a0216363ae77473f94532c1
tomlc99 revision: e3a03f5ec7d8d33be705c5ce8a632d998ce9b4d1
zephyr revision: 2e66fac6d3ff66502689a03520737bda7527ef6c
All taken from sof upstream west.yml

Repro rate is about 50%. I'm using west sign on Windows with MSYS2 and the following packages installed:

base 2020.12-1
bash 5.1.016-1
bash-completion 2.11-1
binutils 2.37-5
brotli 1.0.9-2
bsdtar 3.6.1-1
bzip2 1.0.8-3
ca-certificates 20210119-3
coreutils 8.32-2
curl 7.82.0-1
dash 0.5.11.5-1
db 5.3.28-3
file 5.41-4
filesystem 2022.01-4
findutils 4.9.0-2
gawk 5.1.0-2
gcc 10.2.0-1
gcc-libs 10.2.0-1
gdbm 1.22-3
getent 2.18.90-3
gettext 0.21-1
glib2 2.72.1-1
gmp 6.2.1-2
gnupg 2.2.32-2
grep 1~3.0-3
gzip 1.12-1
heimdal-libs 7.7.0-3
icu 70.1-1
inetutils 1.9.4-3
info 6.8-4
isl 0.22.1-1
jsoncpp 1.9.5-1
less 590-1
libarchive 3.6.1-1
libargp 20110921-3
libasprintf 0.21-1
libassuan 2.5.5-1
libbz2 1.0.8-3
libcrypt 2.1-3
libcurl 7.82.0-1
libdb 5.3.28-3
libedit 20210910_3.1-1
libexpat 2.4.8-1
libffi 3.3-1
libgcrypt 1.10.1-4
libgdbm 1.22-3
libgettextpo 0.21-1
libgnutls 3.7.4-2
libgpg-error 1.45-1
libgpgme 1.17.0-2
libhogweed 3.7.3-2
libiconv 1.16-2
libidn2 2.3.2-2
libintl 0.21-1
libksba 1.6.0-1
liblz4 1.9.3-1
liblzma 5.2.5-1
libnettle 3.7.3-2
libnghttp2 1.47.0-2
libnpth 1.6-1
libopenssl 1.1.1.n-1
libp11-kit 0.24.1-2
libpcre 8.45-1
libpcre2_8 10.37-1
libpsl 0.21.1-2
libreadline 8.1.002-1
librhash 1.4.2-1
libsqlite 3.38.2-1
libssh2 1.10.0-1
libtasn1 4.18.0-3
libunistring 0.9.10-1
libutil-linux 2.35.2-1
libuv 1.42.0-1
libxml2 2.9.13-1
libxslt 1.1.35-1
libzstd 1.5.2-1
mingw-w64-i686-bzip2 1.0.8-2
mingw-w64-i686-ca-certificates 20210119-1
mingw-w64-i686-expat 2.4.8-1
mingw-w64-i686-gcc-libs 11.2.0-10
mingw-w64-i686-gettext 0.21-3
mingw-w64-i686-gmp 6.2.1-3
mingw-w64-i686-libffi 3.3-4
mingw-w64-i686-libiconv 1.16-2
mingw-w64-i686-libsystre 1.0.1-4
mingw-w64-i686-libtasn1 4.18.0-1
mingw-w64-i686-libtre-git r128.6fb7206-2
mingw-w64-i686-libwinpthread-git 10.0.0.r0.gaa08f56da-1
mingw-w64-i686-mpc 1.2.1-1
mingw-w64-i686-mpdecimal 2.5.1-1
mingw-w64-i686-mpfr 4.1.0.p13-1
mingw-w64-i686-ncurses 6.3-3
mingw-w64-i686-openssl 1.1.1.n-1
mingw-w64-i686-p11-kit 0.24.1-2
mingw-w64-i686-python 3.9.11-2
mingw-w64-i686-python-greenlet 1.1.2-1
mingw-w64-i686-python-msgpack 1.0.3-1
mingw-w64-i686-python-pynvim 0.4.3-1
mingw-w64-i686-readline 8.1.001-1
mingw-w64-i686-sqlite3 3.38.2-1
mingw-w64-i686-tcl 8.6.11-5
mingw-w64-i686-termcap 1.3.1-6
mingw-w64-i686-tk 8.6.11.1-2
mingw-w64-i686-tzdata 2022a-1
mingw-w64-i686-xz 5.2.5-2
mingw-w64-i686-zlib 1.2.12-1
mintty 1~3.6.0-1
mpc 1.2.1-1
mpfr 4.1.0-1
msys2-keyring 1~20211228-1
msys2-launcher 1.4-1
msys2-runtime 3.3.4-2
msys2-runtime-devel 3.3.4-2
msys2-w32api-headers 9.0.0.6214.acc9b9d9e-1
msys2-w32api-runtime 9.0.0.6214.acc9b9d9e-1
nano 6.2-3
ncurses 6.3-1
nettle 3.7.3-2
openssl 1.1.1.n-1
openssl-devel 1.1.1.n-1
p11-kit 0.24.1-2
pacman 6.0.1-13
pacman-contrib 1.4.0-2
pacman-mirrors 20220205-1
perl 5.32.1-2
pinentry 1.2.0-1
pkg-config 0.29.2-4
rebase 4.5.0-1
sed 4.8-2
tcl 8.6.10-1
tftp-hpa 5.2-4
time 1.9-2
tzcode 2021e-1
util-linux 2.35.2-1
vim 8.2.3582-1
wget 1.21.3-1
which 2.21-3
windows-default-manifest 6.4-1
xz 5.2.5-1
zlib 1.2.12-1
zlib-devel 1.2.12-1
zstd 1.5.2-1

marc-hb commented 1 year ago

Please share your exact command line; there are many ways to invoke rimage from various wrapper scripts.

pjdobrowolski commented 1 year ago

west sign --build-dir $home/zephyrproject/build-mtl -t rimage --tool-path $home/zephyrproject/build-rimage/rimage.exe --tool-data $home/zephyrproject/sof/rimage/config -- -k $home/keys/mtl_private_key.pem

lgirdwood commented 1 year ago

Looks like we need to validate more of the TOML input config per platform?

lgirdwood commented 1 year ago

@pjdobrowolski will you be able to fix ?

aborisovich commented 1 year ago

Reproduced on Windows

To Reproduce

Configure Xtensa toolchain variables:

    set XTENSA_INSTALL_PATH=c:\usr\xtensa
    set ZEPHYR_TOOLCHAIN_VARIANT=xcc
    set XTENSA_CORE=ace10_LX7HiFi4_RI_2020_5
    set XTENSA_TOOLS_VERSION=RI-2020.5-win32
    set XTENSAD_LICENSE_FILE=84300@xtensa01p.elic.intel.com
    set XTENSA_TOOLS_DIR=%XTENSA_INSTALL_PATH%\XtDevTools\install\tools
    set XTENSA_BUILDS_DIR=%XTENSA_INSTALL_PATH%\XtDevTools\install\builds
    set XTENSA_TOOLCHAIN_PATH=%XTENSA_TOOLS_DIR%\%XTENSA_TOOLS_VERSION%
    set XTENSA_TOOLS=%XTENSA_TOOLS_DIR%\%XTENSA_TOOLS_VERSION%\XtensaTools
    set XTENSA_SYSTEM=%XTENSA_BUILDS_DIR%\%XTENSA_TOOLS_VERSION%\%XTENSA_CORE%\config

Set generator to Ninja:

    set CMAKE_GENERATOR=Ninja

Execute command:

    python sof\scripts\xtensa-build-zephyr.py -u mtl -o sof\app\overlays\mtl\fpga_overlay.conf -k mtl_private_key.pem

Reproduction Rate

50%? It looks like environment reconfiguration helps (setting all variables from the start and deleting all build cache).

Expected behavior

Rimage.exe should produce a properly signed zephyr.ri file (with more than 0 bytes). On a failure like this one, the zephyr.ri file should not be generated at all (error handling). The produced stackdump should contain readable content? A stack trace, maybe?
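One common way to get the "no output file on failure" behavior is to write to a temporary file and only move it into place on success. The C sketch below illustrates that pattern. It is not rimage's actual code, and `write_output_or_nothing`, `write_fn`, and the two sample callbacks are hypothetical names. It only covers failures reported through return values: a process killed by abort() from a failed assert() would skip this cleanup entirely.

```c
#include <stdio.h>

/* Hypothetical writer callback: returns 0 on success, nonzero on failure. */
typedef int (*write_fn)(FILE *out, void *ctx);

/*
 * Write to "<path>.tmp" first and rename it to <path> only on success,
 * so a failed run does not leave an empty or truncated output behind.
 */
int write_output_or_nothing(const char *path, write_fn write_body, void *ctx)
{
    char tmp[4096];
    FILE *out;
    int n, ret;

    n = snprintf(tmp, sizeof(tmp), "%s.tmp", path);
    if (n < 0 || (size_t)n >= sizeof(tmp))
        return -1; /* formatting error or path too long */

    out = fopen(tmp, "wb");
    if (!out)
        return -1;

    ret = write_body(out, ctx);
    if (fclose(out) != 0)
        ret = -1;

    if (ret != 0) {
        remove(tmp); /* discard the partial file */
        return -1;
    }

    /* rename() does not replace an existing file on Windows, so drop
     * any stale output first; a missing file is not an error here. */
    remove(path);
    if (rename(tmp, path) != 0) {
        remove(tmp);
        return -1;
    }
    return 0;
}

/* Sample callbacks for illustration only. */
int write_ok(FILE *out, void *ctx)
{
    (void)ctx;
    return fputs("payload", out) < 0 ? -1 : 0;
}

int write_fail(FILE *out, void *ctx)
{
    (void)out;
    (void)ctx;
    return -1; /* simulate a signing error */
}
```

The remove-then-rename step is not atomic, so a crash between the two calls could still lose a previous output; that trade-off is accepted here for portability.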

Impact

Annoyance.

Environment

Branch name and commit hash of the 2 repositories: sof (firmware/topology) and linux (kernel driver).

Name of the topology file

Name of the platform(s) on which the bug is observed.

Environment:

Screenshots or console output

image

Rimage stackdump (quite useless frankly): rimage.EXE.log

marc-hb commented 1 year ago

I tried hard to reproduce on Linux but I could not. I ran rimage with valgrind and it never found any memory corruption issue on my system.

> assertion "(uint64_t)offset + section->size <= image->adsp->image_size" failed:

That's not a crash, that's a failed assert() catching bad inputs. It's not impossible but I'm surprised this assert fails only half the time. Do you see this assert() every time it "crashes" or is there any actual rimage crash some other times?
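For illustration, the condition in that assert can also be written as an explicit check that reports an error and returns, which avoids the abort() and the stackdump. This is a minimal sketch with simplified, hypothetical stand-in types (`image_info`, `section_info`) and a hypothetical function name. It is not the code in manifest.c, and it does not explain why the inputs would be out of range only half the time.

```c
#include <errno.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical, simplified stand-ins for the tool's real structures. */
struct section_info {
    uint32_t size;
};

struct image_info {
    uint64_t image_size;
};

/*
 * Returns 0 if [offset, offset + size) fits inside the image and
 * -ENOSPC otherwise. Widening to 64 bits before the addition mirrors
 * the cast in the assertion and avoids 32-bit wrap-around.
 */
int check_section_fits(const struct image_info *image,
                       const struct section_info *section,
                       uint32_t offset)
{
    uint64_t end = (uint64_t)offset + section->size;

    if (end > image->image_size) {
        fprintf(stderr,
                "error: section end 0x%llx exceeds image size 0x%llx\n",
                (unsigned long long)end,
                (unsigned long long)image->image_size);
        return -ENOSPC;
    }
    return 0;
}
```

A caller would check the return value before copying the section and propagate the error up to main(), so the tool can exit with an ordinary nonzero status.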

A long time ago we found some (actual) rimage crash when passing an invalid key file: thesofproject/sof#8680. This time I did not try an invalid key file, I only tried a couple valid, secret key files (but maybe not the same as yours)

Can you reproduce with other keys or only with this one secret key you use now? What happens with the default key?

> rimage revision: https://github.com/thesofproject/rimage/commit/3ee717eebc6a2a512a0216363ae77473f94532c1 tomlc99 revision: e3a03f5ec7d8d33be705c5ce8a632d998ce9b4d1 zephyr revision: 2e66fac6d3ff66502689a03520737bda7527ef6c All taken from sof upstream west.yml

The sof version is the only one needed and it was the only one missing. All other versions are defined by sof/west.yml (and you should not override sof/west.yml or use other versions when reporting a bug).

I tested both today's sof commit 8b79a6dc8e50 and also sof commit b553e529ee48 as reported later by @aborisovich but no repro.

> Configure Xtensa toolchain variables:

When using scripts/xtensa-build-zephyr.py, XTENSA_TOOLS_ROOT should be the only variable needed. At least it is on my Linux system.

marc-hb commented 1 year ago

> python sof\scripts\xtensa-build-zephyr.py -u mtl -o sof\app\overlays\mtl\fpga_overlay.conf -k mtl_private_key.pem

@pjdobrowolski are you using this overlay too? You only shared the west sign command, this obviously does not tell us what you built and how.

pjdobrowolski commented 1 year ago

Yes, I also use the FPGA overlay; however, I don't use the Python scripts.

    #Paths settings for west build system
    XTENSA_INSTALL_PATH="c:/usr/xtensa"
    export ZEPHYR_TOOLCHAIN_VARIANT=xcc
    export XTENSA_CORE=ace10_LX7HiFi4_RI_2020_5
    export XTENSA_TOOLS_VERSION=RI-2020.5-win32
    export XTENSAD_LICENSE_FILE=84300@xtensa03p.elic.intel.com
    export XTENSA_TOOLCHAIN_PATH=$XTENSA_INSTALL_PATH/XtDevTools/install/tools/$XTENSA_TOOLS_VERSION
    export XTENSA_TOOLS=$XTENSA_TOOLS_DIR/$XTENSA_TOOLS_VERSION/XtensaTools
    export NINJA_PATH='C:/Program Files/ninja'
    export CMAKE_GENERATOR=Ninja

    west -v -v build --build-dir build-mtl -p always -b intel_adsp_ace15_mtpm ./sof/app -- -DOVERLAY_CONFIG=overlays/mtl/fpga_overlay.conf  

    #Paths settings for rimage build
    export OPENSSL_PATH="c:/msys64/var/lib/pacman/local/openssl-1.1.1.n-1"
    export CAT_PATH="c:/Program Files/Git/usr/bin"
    export NINJA_PATH="c:/Program/ Files/ninja"
    export MSYS_INSTALL_DIR="C:/msys64/usr/bin/"
    export CMAKE_GENERATOR=Ninja

    echo -e '\033[0;33m'
    echo '-----> cmake rimage build'
    echo -e '\033[0m'
    cmake -B $home/zephyrproject/build-rimage -S $home/zephyrproject/sof/rimage
    echo -e '\033[0;33m'
    echo '-----> cmake rimage --build'
    echo -e '\033[0m'
    cmake --build $home/zephyrproject/build-rimage

    echo -e '\033[0;33m'
    echo '-----> make backup build-mtl'
    echo -e '\033[0m'
    cp -r $home/zephyrproject/build-mtl/ $home/zephyrproject/build-mtl-backup
    echo -e '\033[0;33m'
    echo '-----> cmake west sign'
    echo -e '\033[0m'
    west sign --build-dir $home/zephyrproject/build-mtl -t rimage --tool-path $home/zephyrproject/build-rimage/rimage.exe --tool-data $home/zephyrproject/sof/rimage/config -- -k $home/keys/mtl_private_key.pem
    echo -e '\033[0;33m'
    echo '-----> smex *.ldc logs build'
    echo -e '\033[0m'
    $home/zephyrproject/build-mtl/zephyr/smex_ep/build/smex.exe -l $home/zephyrproject/build-mtl/zephyr/zephyr.ldc $home/zephyrproject/build-mtl/zephyr/zephyr.elf
    echo -e '\033[0;32m'
    date
    echo '-----> DONE <------'
    echo -e '\033[0m'

marc-hb commented 1 year ago

> Yes, I also use the FPGA overlay; however, I don't use the Python scripts.

The python wrapper script is not mandatory, using west directly is perfectly fine. However it is mandatory for reproducing and filing bugs because it makes sure we're all using the same configuration and testing (almost) the same thing. You already have python anyway (otherwise you couldn't compile Zephyr at all) so please try to reproduce with scripts/xtensa-build-zephyr.py too and share the command line that reproduces. Then you can go back to manual, verbose and complicated configuration if you need to or want to.

marc-hb commented 1 year ago

> Can you reproduce with other keys or only with this one secret key you use now? What happens with the default key?

Ping? (and some other questions too)

marc-hb commented 1 year ago

> export OPENSSL_PATH="c:/msys64/var/lib/pacman/local/openssl-1.1.1.n-1"

@juimonen is it possible to sign MTL with legacy openssl 1 ?

@pjdobrowolski, @aborisovich could you try to uninstall openssl 1 and install openssl 3 instead? Does it still reproduce?

Some rimage+openssl background information:

juimonen commented 1 year ago

@marc-hb I don't recall very well... I think MTL was signed with openssl1 before, and I did the "upgrade" because openssl3 was getting onto our CI machines with new Ubuntu releases. rimage picks up the openssl version from the system at build time, so your system needs to have openssl installed correctly. I've been fiddling with the openssl version using my own git checkout of it and just brutally sudo-installing it on my machine... maybe not a very recommended way. I just found out it was not super easy to switch between the versions with apt or dnf...

pjdobrowolski commented 1 year ago

@aborisovich is using the Python script for building, so I didn't think it was necessary to redo his steps with the Python script on my machine, is it? Before it appeared that this is an rimage issue, I switched between keys and the issue was the same. Might the problem be with Zephyr, though? I didn't have such errors a week ago.

aborisovich commented 1 year ago

> @aborisovich is using the Python script for building, so I didn't think it was necessary to redo his steps with the Python script on my machine, is it? Before it appeared that this is an rimage issue, I switched between keys and the issue was the same. Might the problem be with Zephyr, though? I didn't have such errors a week ago.

There were changes to rimage recently. I think we could try to bisect some commits from rimage repo to find the cause.

lgirdwood commented 1 year ago

Does valgrind work on Windows ? If so, can this be tried ?

marc-hb commented 1 year ago

> @aborisovich is using the Python script [scripts/xtensa-build-zephyr.py] for building, so I didn't think it was necessary to redo his steps with the Python script on my machine, is it?

You're technically correct, but we're struggling to understand the specific conditions required to reproduce, so more data points are always useful. Also, the reluctance to align on the same configuration is concerning: what if @aborisovich is not available to help you file your next bug (about rimage or not)? The only side effects of scripts/xtensa-build-zephyr.py are running west commands very similar (but maybe not 100% identical) to your west commands, so what are you afraid of?

pjdobrowolski commented 1 year ago

The bug is not reproducing after @softwarecki's cleanup.

marc-hb commented 1 year ago

Just for the record:

Originally posted by @aborisovich in https://github.com/thesofproject/sof/issues/7414#issuecomment-1511093485

> You mean this crash?

Not this one, we have other sporadic ones not reported yet.