thesofproject / rimage

DSP firmware image creation and signing tool

[BUG] Signature issue on TGL UP Extreme i11 - openssl3 #99

Closed plbossart closed 2 years ago

plbossart commented 2 years ago

Describe the bug

The firmware generated locally on my NUC does not boot on UpExtreme 11. The daily-build firmware and the 2.1 version boot fine, so this is not a hardware/CSME issue but rather an undocumented or missing environment requirement for the signature to work.

Standard Ubuntu 22.04 on my side. Same scripts work fine on Up Extreme (WHL)

To Reproduce

scripts/xtensa-build-all.sh -d tgl (debug version)
scripts/xtensa-build-all.sh tgl

Reproduction Rate

100%

Expected behavior

no errors on boot

Impact

showstopper

Environment

1) Branch name and commit hash of the 2 repositories: sof (firmware/topology) and linux (kernel driver).
2) Name of the topology file

Screenshots or console output

[    7.008951] snd_sof_intel_hda_common:hda_cl_copy_fw: sof-audio-pci-intel-tgl 0000:00:1f.3: FW Poll Status: reg[0x80000]=0x80000012 timedout
[    7.008957] sof-audio-pci-intel-tgl 0000:00:1f.3: hda_cl_copy_fw: timeout with rom_status_reg (0x80000) read
[    7.008969] snd_sof_intel_hda_common:hda_dsp_stream_trigger: sof-audio-pci-intel-tgl 0000:00:1f.3: FW Poll Status: reg[0x160]=0x140000 successful
[    7.008973] sof-audio-pci-intel-tgl 0000:00:1f.3: ------------[ DSP dump start ]------------
[    7.008974] sof-audio-pci-intel-tgl 0000:00:1f.3: Firmware download failed
[    7.008975] sof-audio-pci-intel-tgl 0000:00:1f.3: fw_state: SOF_FW_BOOT_IN_PROGRESS (2)
[    7.008978] snd_sof_intel_hda_common:hda_dsp_get_status: sof-audio-pci-intel-tgl 0000:00:1f.3: unknown ROM status value 80000012
[    7.008991] sof-audio-pci-intel-tgl 0000:00:1f.3: extended rom status:  0x80000012 0x2c 0x0 0x0 0x0 0x0 0x2510113 0x0
[    7.008992] sof-audio-pci-intel-tgl 0000:00:1f.3: ------------[ DSP dump end ]------------
[    7.009016] sof-audio-pci-intel-tgl 0000:00:1f.3: Failed to start DSP
[    7.009017] sof-audio-pci-intel-tgl 0000:00:1f.3: error: failed to boot DSP firmware -110
[    7.009020] snd_sof:sof_set_fw_state: sof-audio-pci-intel-tgl 0000:00:1f.3: fw_state change: 2 -> 3
[    7.061361] snd_sof_intel_hda_common:hda_dsp_core_reset_enter: sof-audio-pci-intel-tgl 0000:00:1f.3: FW Poll Status: reg[0x4]=0x1d003c timedout
[    7.061368] sof-audio-pci-intel-tgl 0000:00:1f.3: error: hda_dsp_core_reset_enter: timeout on HDA_DSP_REG_ADSPCS read
[    7.061370] sof-audio-pci-intel-tgl 0000:00:1f.3: error: dsp core reset failed: core_mask 1
[    7.062148] snd_sof:sof_set_fw_state: sof-audio-pci-intel-tgl 0000:00:1f.3: fw_state change: 3 -> 0
[    7.062173] sof-audio-pci-intel-tgl 0000:00:1f.3: error: sof_probe_work failed err: -110
lgirdwood commented 2 years ago

@plbossart can you attach your FW? It would be nice to diff its headers against v2.1. Btw, I assume you are signing with the community key? Do both v2.1 community and Intel keys work for you?

lgirdwood commented 2 years ago

One more thing: is your rimage submodule up to date (and in $PATH)?

plbossart commented 2 years ago

git log --oneline
b4886bebb (HEAD -> main, origin/main) module_adapter: add zephyr logging support

cd rimage/
git log --oneline
9d45332 (HEAD) Write firmware file micro version to manifest for cAVS platforms

rimage is not in $PATH. It was not needed before and must be set by scripts.

plbossart commented 2 years ago

compilation log log.txt

firmware sof-tgl.ri.gz

marc-hb commented 2 years ago

This reminds me of this bug fix: 95d887251ee40397300

fredoh9 commented 2 years ago

Interesting, I compared with mine

  1. SHA1 => same
  2. build log => not so much different
  3. fw binaries => a few bytes are different after "--erase_vars" but not sure that is crucial (thanks @marc-hb )

./sof_ri_info.py ~/Downloads/sof-tgl.ri --erase_vars ~/Downloads/sof-tgl.ri_no_vars

marc-hb commented 2 years ago

fw binaries => a few bytes are different after "--erase_vars" but not sure that is crucial (thanks @marc-hb )

I implemented --erase_vars so the image after --erase_vars is 100% identical when using the same toolchain (which you obviously did, otherwise the differences would be much bigger). So I do find this difference of a few bytes worrying. Can you please share the hexdiff?

Also, please share the manifest differences as shown by diff -u <(sof/tools/sof_ri_info.py image1.ri) <(sof/tools/sof_ri_info.py image2.ri). Some randomness is expected because rimage uses a salt (that's why --erase_vars exists), other differences are not.

BTW, --erase_vars is used in CI on every PR to catch any __TIMESTAMP__ (in addition to checkpatch).
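
The hexdiff requested above can be approximated in a few lines of Python. This is only a sketch: `diff_ranges` is a hypothetical helper, not part of sof_ri_info.py, and it assumes both images have already been passed through --erase_vars so that only unexpected differences remain.

```python
def diff_ranges(a: bytes, b: bytes):
    """Return (start, end) byte offsets (end exclusive) of runs where a and b differ."""
    ranges = []
    start = None
    common = min(len(a), len(b))
    for i in range(common):
        if a[i] != b[i]:
            if start is None:
                start = i          # a new run of differing bytes begins here
        elif start is not None:
            ranges.append((start, i))
            start = None
    if start is not None:
        ranges.append((start, common))
    if len(a) != len(b):
        # Bytes that exist in only one of the two images are reported as a final range.
        ranges.append((common, max(len(a), len(b))))
    return ranges


# Tiny in-memory example; real use would read two .ri files opened in "rb" mode.
print(diff_ranges(b"abcdef", b"abXXef"))
```

Reporting ranges rather than individual bytes keeps the output short when a whole field (for example a signature) differs, and the offsets can then be matched against the file offsets printed by sof_ri_info.py.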

fredoh9 commented 2 years ago

Don't know the root cause, but from the build log, Source content hash is different.

This is mine,

-- GIT_TAG / GIT_LOG_HASH : v2.0-rc1-997-gb4886bebbe49 / b4886bebbe49
-- Source content hash: 1c511e77. Note: by design, source hash is broken by config changes. See thesofproject/sof#3890.

This is Pierre's,

-- GIT_TAG / GIT_LOG_HASH : v2.0-rc1-997-gb4886bebbe49 / b4886bebb
-- Source content hash: 67531a8c. Note: by design, source hash is broken by config changes. See thesofproject/sof#3890.
plbossart commented 2 years ago

@keqiaozhang reported a similar issue worked-around with the "scripts/xtensa-build-all.sh -d tgl" option. In my case the debug option doesn't solve anything.

marc-hb commented 2 years ago

Don't know the root cause, but from the build log, Source content hash is different.

This is the very first difference and the one we must focus on first. All other differences could be impacted by this. The source hash is not affected by any of these:

So there is really, absolutely no reason for @plbossart's source hash to be different. I have the same 1c511e77 source hash as @fredoh9 with any toolchain.

fredoh9 commented 2 years ago

@keqiaozhang reported a similar issue worked-around with the "scripts/xtensa-build-all.sh -d tgl" option. In my case the debug option doesn't solve anything.

If -d makes a difference, this is a more serious bug. The good thing is that both builds worked for me and both failed for Pierre. At least the results are consistent.

marc-hb commented 2 years ago

@plbossart can you please first make sure that git status --ignored is clean (sorry for asking this but we're grasping at straws now), then run the following commands and report which ones don't match

wc build_tgl_?cc/source_hash/*

  1408   1408  57728 build_tgl_gcc/source_hash/tracked_file_hash_list
  1408   1408  61688 build_tgl_gcc/source_hash/tracked_file_list
  2816   2816 119416 total

md5sum build_tgl_?cc/source_hash/*

0c24b8d8a641391d053cba84ec86f20f  build_tgl_gcc/source_hash/tracked_file_hash_list
72028df5e905505089a10e92e4d94781  build_tgl_gcc/source_hash/tracked_file_list

git ls-files src/ scripts/ | md5sum

72028df5e905505089a10e92e4d94781  -

git hash-object src/probe/probe.c

265c6fe9fea227847ba9094acd663e0a421f565a
fredoh9 commented 2 years ago

@marc-hb, I built without -d and I get exactly the same results as yours above.

plbossart commented 2 years ago

@marc-hb I removed all ignored files and rebuilt, same SHA1

-- GIT_TAG / GIT_LOG_HASH : v2.0-rc1-997-gb4886bebbe49 / b4886bebb
-- Source content hash: 67531a8c. Note: by design, source hash is broken by config changes. See thesofproject/sof#3890.
wc build_tgl_?cc/source_hash/*
  1408   1408  57728 build_tgl_xcc/source_hash/tracked_file_hash_list << DIFFERENT, I have xcc only?
  1408   1408  61688 build_tgl_xcc/source_hash/tracked_file_list
  2816   2816 119416 total

md5sum build_tgl_?cc/source_hash/*
0498f2ce4d4636c3b5ded04c22bb04f2  build_tgl_xcc/source_hash/tracked_file_hash_list <<< DIFFERENT, xcc only?
72028df5e905505089a10e92e4d94781  build_tgl_xcc/source_hash/tracked_file_list

git ls-files src/ scripts/ | md5sum
72028df5e905505089a10e92e4d94781  - << SAME

git hash-object src/probe/probe.c 
265c6fe9fea227847ba9094acd663e0a421f565a << SAME
plbossart commented 2 years ago

FWIW, I cloned a clean sof and got the same results, so this difference in source hash is not due to my local setup or to files that might have side effects.

marc-hb commented 2 years ago

Fascinating, so you have the same list of files and same git commit but git hash-object is different for some source files. Let's find which files have a different hash.

First, please run git rev-parse --show-object-format. If it says sha256 then tell us and stop reading (seems unlikely considering the git version is the same)

If it says sha1 then please run paste build_tgl_?cc/source_hash/* and diff -u the output with mine (attached) 5917_source_hashes.txt

Roughly how many files have a different hash? If a small number then which ones?

I have xcc only?

None of this depends on the toolchain, it's pure source and git.
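
For anyone who prefers a script over diff -u, the comparison can be sketched as follows. `parse_hash_list` and `differing_paths` are hypothetical helper names, and the input is assumed to be the two-column "hash, path" text produced by the paste command above.

```python
def parse_hash_list(text: str) -> dict:
    """Map each path to its hash from lines of the form '<hash><whitespace><path>'."""
    hashes = {}
    for line in text.splitlines():
        parts = line.split(None, 1)   # split on the first run of whitespace only
        if len(parts) == 2:
            digest, path = parts
            hashes[path.strip()] = digest
    return hashes


def differing_paths(mine: str, theirs: str) -> list:
    """Return the sorted paths whose hash differs or that appear in only one list."""
    a, b = parse_hash_list(mine), parse_hash_list(theirs)
    return sorted(p for p in a.keys() | b.keys() if a.get(p) != b.get(p))


mine = "aaaa\tsrc/one.c\nbbbb\tsrc/two.c\n"
theirs = "aaaa\tsrc/one.c\ncccc\tsrc/two.c\n"
print(differing_paths(mine, theirs))
```

The result directly answers "how many files and which ones", without the context lines of a unified diff.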

plbossart commented 2 years ago
diff -u ~/Downloads/5917_source_hashes.txt  plb_hash.txt 
--- /home/pbossart/Downloads/5917_source_hashes.txt 2022-06-14 16:04:19.667677436 -0500
+++ plb_hash.txt    2022-06-14 16:05:22.211575742 -0500
@@ -134,7 +134,7 @@
 cc203436ad67d5fc42f6205a8815f55c7d061418   src/arch/xtensa/hal/mp_asm.S
 bacbfc6ff0c71b0aba8bb31c16ed345feefdfacc   src/arch/xtensa/hal/mpu.c
 a2a544bd354d6cf4c30c8bb326ec1173694bc39c   src/arch/xtensa/hal/mpu_asm.S
-b1b53ed4ab216f6a0c8e7c628d93de627ac370b1   src/arch/xtensa/hal/set_region_translate.c
+27ed6b80a50b1b89f7f8b7653355f25ad0cb9932   src/arch/xtensa/hal/set_region_translate.c
 316ddb4e829827a7b1637415030a1c2c37121e07   src/arch/xtensa/hal/state.c
 108986228584696b4c6a235fecd0760f8b4c2ca7   src/arch/xtensa/hal/state_asm.S
 0716ddca17ff2586d31b94c3ab1d5bca14377355   src/arch/xtensa/hal/syscache_asm.S
@@ -158,7 +158,7 @@
 44874cd946df8d92f28241df54854a73ecdec15c   src/arch/xtensa/include/arch/spinlock.h
 1172cb488b88d4966b5f8a4b57d9deb9c3bcfddc   src/arch/xtensa/include/arch/string.h
 c6b04a250575a0f9616d4c59385db05311b35163   src/arch/xtensa/include/xtensa/board.h
-4b17987ea95c462625f792507f624e241776fa0e   src/arch/xtensa/include/xtensa/c6x-compat.h
+ca91bd7183971221923b6e882796e88d0acfb3cd   src/arch/xtensa/include/xtensa/c6x-compat.h
 9cb2c8fcc6b85f6d21d78ad0755285a8fe5d27f7   src/arch/xtensa/include/xtensa/cacheasm.h
 211803aedbf39318f912fcc291efafad970f78ef   src/arch/xtensa/include/xtensa/cacheattrasm.h
 f5bb44faf2ab30bdb3de131f86cdb420b244cec6   src/arch/xtensa/include/xtensa/config/core.h

Absolutely no idea what this is.

plbossart commented 2 years ago

one possibility is that I don't have gcc installed for TGL. I never use GCC anyways even for older hardware.

plbossart commented 2 years ago

git --version
git version 2.34.1

marc-hb commented 2 years ago

Only 2 files are different, all others the same?

Can you please run these:


git hash-object src/arch/xtensa/hal/set_region_translate.c
b1b53ed4ab216f6a0c8e7c628d93de627ac370b1

md5sum src/arch/xtensa/hal/set_region_translate.c
36cae0b29a2c1b3a65f1e9dbb3bb829b  src/arch/xtensa/hal/set_region_translate.c

git cat-file -p 27ed6b80a50b1b89f7f8b7653355f25ad0cb9932 | md5sum

git cat-file -p b1b53ed4ab216f6a0c8e7c628d93de627ac370b1 | md5sum
36cae0b29a2c1b3a65f1e9dbb3bb829b  -

wget https://raw.githubusercontent.com/thesofproject/sof/b4886bebbe49454850d59f1a49a0460e590db71c/src/arch/xtensa/hal/set_region_translate.c

diff -u set_region_translate.c src/arch/xtensa/hal/set_region_translate.c 
plbossart commented 2 years ago
git hash-object src/arch/xtensa/hal/set_region_translate.c
27ed6b80a50b1b89f7f8b7653355f25ad0cb9932

md5sum src/arch/xtensa/hal/set_region_translate.c
36cae0b29a2c1b3a65f1e9dbb3bb829b  src/arch/xtensa/hal/set_region_translate.c

git cat-file -p 27ed6b80a50b1b89f7f8b7653355f25ad0cb9932 | md5sum
fatal: Not a valid object name 27ed6b80a50b1b89f7f8b7653355f25ad0cb9932
d41d8cd98f00b204e9800998ecf8427e  -

git cat-file -p b1b53ed4ab216f6a0c8e7c628d93de627ac370b1 | md5sum
36cae0b29a2c1b3a65f1e9dbb3bb829b  -

diff -u set_region_translate.c src/arch/xtensa/hal/set_region_translate.c << no diff
plbossart commented 2 years ago

looks like git hash-object provides a different value for the same file?

plbossart commented 2 years ago

I don't know how this would impact the signature though? The sha1 used for the signature should only work with the binary itself.

plbossart commented 2 years ago

git filters issue? https://stackoverflow.com/questions/5290444/why-does-git-hash-object-return-a-different-hash-than-openssl-sha1

plbossart commented 2 years ago

Bingo!

git hash-object --no-filters src/arch/xtensa/hal/set_region_translate.c
b1b53ed4ab216f6a0c8e7c628d93de627ac370b1
marc-hb commented 2 years ago

I don't know how this would impact the signature though? The sha1 used for the signature should only work with the binary itself.

Agreed, this should not affect the rest of the build. It's still a serious bug though because: 1. it breaks the logger dictionary checksum; 2. it makes troubleshooting other issues much more complicated.

plbossart commented 2 years ago

after breaking audio since 1997, I just started a new career with crypto. Bitcoin, here I come :-)

marc-hb commented 2 years ago

OK, these two files and only these two have Windows end of lines:

find * -exec dos2unix {} \; # DONT DO THIS AT HOME
git diff --stat
 src/arch/xtensa/hal/set_region_translate.c  | 1068 +++++++++++++--------------
 src/arch/xtensa/include/xtensa/c6x-compat.h | 3516 +++++++++++++++++++++++++++++++++++++++++++--------------------------------------------
 2 files changed, 2292 insertions(+), 2292 deletions(-)

core.autocrlf is evil, never use it. Use a decent, polyglot editor instead.

Some repos have .sh and .bat files in the same repo, how does core.autocrlf support that? It does not! Don't use it, it's evil.
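
A non-destructive way to find such files, instead of running dos2unix over the tree, is to look for CRLF byte pairs. The helpers below are a hypothetical sketch, not an existing SOF script. The list of paths is assumed to come from something like git ls-files src/ scripts/.

```python
def has_crlf(data: bytes) -> bool:
    """True if the data contains at least one Windows (CRLF) line ending."""
    return b"\r\n" in data


def files_with_crlf(paths):
    """Return the subset of paths whose on-disk content contains CRLF."""
    hits = []
    for path in paths:
        with open(path, "rb") as f:   # binary mode: no newline translation by Python
            if has_crlf(f.read()):
                hits.append(path)
    return hits


print(has_crlf(b"int x;\r\nint y;\r\n"))
print(has_crlf(b"int x;\nint y;\n"))
```

Reading in binary mode matters here: text mode would silently translate line endings and hide exactly the difference being searched for.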

plbossart commented 2 years ago

I must have had this in ~/.gitconfig since forever. I have no idea why it was added, probably a wiki copy-paste.

[core]
    autocrlf = input
    filemode = false
    editor = emacs

I never asked for Windows end of lines to be supported, and our scripts should not assume anything. Can we add --no-filters to avoid such user-level variations?
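
To illustrate why the filter changes the result: a git blob ID is the SHA-1 of a short header followed by the content, so any line-ending conversion applied before hashing yields a different ID. The sketch below assumes the conversion is a plain CRLF to LF replacement. Git's real conversion has extra rules (for example binary detection), so treat `crlf_to_lf` as a simplification.

```python
import hashlib


def git_blob_sha1(content: bytes) -> str:
    """SHA-1 of 'blob <size>\\0' + content, which is how git names a blob object."""
    header = b"blob %d\0" % len(content)
    return hashlib.sha1(header + content).hexdigest()


def crlf_to_lf(content: bytes) -> bytes:
    """Simplified stand-in for the conversion done when autocrlf = input (assumption)."""
    return content.replace(b"\r\n", b"\n")


crlf_file = b"int x;\r\nint y;\r\n"
# Raw bytes, as hashed with --no-filters:
print(git_blob_sha1(crlf_file))
# Bytes after conversion, as hashed when the filter is active:
print(git_blob_sha1(crlf_to_lf(crlf_file)))
```

The two printed IDs differ even though the file on disk is the same, which matches the behaviour seen above: md5sum agrees for everyone, while git hash-object depends on the user's configuration.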

plbossart commented 2 years ago

Added --no-filters, but that doesn't help with the boot issue.

plbossart commented 2 years ago

For the record, removing autocrlf from ~/.gitconfig also solves the source hash issue:

-- GIT_TAG / GIT_LOG_HASH : v2.0-rc1-997-gb4886bebbe49 / b4886bebb
-- Source content hash: 1c511e77. Note: by design, source hash is broken by config changes. See thesofproject/sof#3890.

same as reported above by @fredoh9

marc-hb commented 2 years ago

I know what the next step should be but I just lost access to the system I was using to test for some unknown reason :-( I will share the next step as soon as I get access again.

aiChaoSONG commented 2 years ago

Hashes are aligned with @marc-hb on my side, but I got the same boot failure just like @plbossart.

I tried signing with meu instead: PRIVATE_KEY_OPTION='-DMEU_PRIVATE_KEY=my_sof_path/keys/otc_private_key_3k.pem' ./scripts/xtensa-build-all.sh -m my_meu_path tgl. With that, the firmware boots on UpExtreme i11. So I think this is an rimage issue, even though there has been no recent rimage update.

Based on my testing, I think the issue is caused by the crypto library: Ubuntu 22.04 upgraded openssl and libcrypto to version 3.x, while Ubuntu 20.04 and 18.04 use version 1.x.

[Lucky guess] CI and other developers are still on Ubuntu 20.04/18.04, so they don't see the issue, but Pierre and I upgraded to Ubuntu 22.04, so we both see it.

https://discourse.ubuntu.com/t/openssl-3-0-transition-plans/24453

// My Ubuntu22.04
➜  sof git:(main) ✗ ldd build_tgl_xcc/rimage_ep/build/rimage
        linux-vdso.so.1 (0x00007fff5eff5000)
        libcrypto.so.3 => /lib/x86_64-linux-gnu/libcrypto.so.3 (0x00007f4dad189000)
        libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f4dacf61000)
        /lib64/ld-linux-x86-64.so.2 (0x00007f4dad618000)
// CI build machine with Ubuntu20.04
$ ldd rimage 
    linux-vdso.so.1 (0x00007ffcf2be9000)
    libcrypto.so.1.1 => /lib/x86_64-linux-gnu/libcrypto.so.1.1 (0x00007fa790b09000)
    libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fa790917000)
    libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007fa790911000)
    libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007fa7908ee000)
    /lib64/ld-linux-x86-64.so.2 (0x00007fa790e10000)
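
The same check can be scripted so that a build log records which libcrypto an rimage binary links against. This is a sketch under assumptions: `libcrypto_soname` and `openssl_major` are hypothetical helpers, and the input is expected to look like the ldd output shown above.

```python
import re


def libcrypto_soname(ldd_output: str):
    """Return the libcrypto soname (e.g. 'libcrypto.so.3') found in ldd output, or None."""
    match = re.search(r"^\s*(libcrypto\.so\.[\w.]+)\s+=>", ldd_output, re.MULTILINE)
    return match.group(1) if match else None


def openssl_major(soname):
    """Extract the major version from a soname: 'libcrypto.so.1.1' -> 1."""
    if soname is None:
        return None
    return int(soname.split(".so.", 1)[1].split(".")[0])


sample = (
    "    linux-vdso.so.1 (0x00007fff5eff5000)\n"
    "    libcrypto.so.3 => /lib/x86_64-linux-gnu/libcrypto.so.3 (0x00007f4dad189000)\n"
    "    libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f4dacf61000)\n"
)
print(libcrypto_soname(sample), openssl_major(libcrypto_soname(sample)))
```

In practice the text would come from running ldd on build_tgl_xcc/rimage_ep/build/rimage. The sample string here only mirrors the Ubuntu 22.04 output above.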
marc-hb commented 2 years ago

Thanks @aiChaoSONG, very interesting openssl guess. We know rimage is quite fragile (https://github.com/thesofproject/sof/issues/8680)

If it's only an rimage problem then everyone should see the same sha256sum reproducible.ri below. Both @fredoh9 and I get these with XCC, what about you? The file attached by @plbossart above seems to have small differences in both signature/manifest and in the code.

xtensa-build-all -d tgl

sha256sum reproducible.ri
25f6c86d52dff278c60faf378639277a2d7e22ebb822d042bbd3ae428b4656bc reproducible.ri

without -d

sha256sum reproducible.ri
df01b254e818a7fed8c5c39b1cf37a6e042c09795dcd3d38e39bfc6728623a26 reproducible.ri
aiChaoSONG commented 2 years ago

@marc-hb I got the same sha256sum as yours

marc-hb commented 2 years ago

Good news, thanks! Hopefully @plbossart will also have the same reproducible.ri now that he has removed core.autocrlf and aligned hashes embedded in the binary.

@aiChaoSONG can you please share the output of ./tools/sof_ri_info/sof_ri_info.py build_tgl_xcc/sof.ri for the non -d build? The signature part will be random (just try to build twice and you will see) but most of the rest should be deterministic.

marc-hb commented 2 years ago

Also, can you please try to sign with valgrind rimage and see if there's any error?

aiChaoSONG commented 2 years ago

@marc-hb

There is an assertion failure in rimage.

sof_ri_info for the non -d build of sof.ri

```
➜  sof git:(main) ✗ ./tools/sof_ri_info/sof_ri_info.py build_tgl_xcc/sof.ri
SOF Binary build_tgl_xcc/sof.ri size 0x82300

  Extended Manifest ver 1.0.0 length 768

  CSE Manifest ver 0x102 checksum 0x0 partition name ADSP

    ADSP.man (CSS Manifest) type 0x4 file offset 0x35c hdr_len 900 ver 0x21000 date 2022/06/15
      Rsvd0 0x0
      Modulus size (dwords) 96
        6b 75 ed 58 20 08 85 95 ... 55 d1 7d c6 0d 79 12 a9 (Community 3k key)
      Exponent size (dwords) 1
        01 00 01 00
      Signature (file offset 0x560, length 0x180)
        bc 82 30 c5 09 45 2d 3a ... 6e 20 78 e8 7e 30 1a 5e

      Plat Fw Auth Extension type 0xf file offset 0x6e0 length 0x78
       name ADSP vcn 0x0 bitmap 00 00 00 00 08 00 00 00 00 00 00 00 00 00 00 00 svn 0x0

      Other Extension type 0x16 file offset 0x758 length 0x68

    cavs0015.met (ADSP Metadata File Extension) type 0x11 file offset 0x7c0 length 0x70
     ver 0x0 base offset 0x30f7628d limit offset 0xf5f479cf
      IMR type 0x3
      Attributes
        d6 6c 05 2d d1 76 5c d0 00 20 00 00 c0 3a 08 00

    cavs0015

  cavs0015 (ADSP Manifest) file offset 0x2300 name ADSPFW build ver 2.0.0.1 feature mask 0xffff image flags 0x0
    HW buffers base address 0x0 length 0x0
    Load offset 0x30000

    BRNGUP    2b79e4f3-4675-f649-89df-3bc194a91aeb
      entry point 0xb0038000 type 0x21 ( loadable LL )
      cfg offset 0 count 0 affinity 0x3 instance max count 1 stack size 0x1
      .text   0xb0038000 file offset 0x8000 flags 0x1001f ( contents alloc load readonly code type=0 pages=1 )
      .rodata 0xb0039000 file offset 0x9000 flags 0x1012f ( contents alloc load readonly data type=1 pages=1 )
      .bss    0x0 file offset 0x0 flags 0xf00 ( type=15 pages=0 )

    BASEFW    0e398c32-5ade-ba4b-93b1-c50432280ee4
      entry point 0xbe02c400 type 0x21 ( loadable LL )
      cfg offset 0 count 0 affinity 0x3 instance max count 1 stack size 0x1
      .text   0xbe02c000 file offset 0xa000 flags 0x2f001f ( contents alloc load readonly code type=0 pages=47 )
      .rodata 0xbe05b000 file offset 0x39000 flags 0x49012f ( contents alloc load readonly data type=1 pages=73 )
      .bss    0xbe0a4000 file offset 0x0 flags 0x23c0202 ( alloc type=2 pages=572 )

Memory layout undefined
```
check rimage signing process with valgrind

```
➜  sof git:(main) ✗ valgrind ./build_tgl_xcc/rimage_ep/build/rimage -o sof-tgl.ri -c rimage/config/tgl.toml -s 1344 -k keys/otc_private_key_3k.pem -i 3 -f 0.0.0 -b 0 -e build_tgl_xcc/src/arch/xtensa/bootloader-tgl build_tgl_xcc/sof
==1333350== Memcheck, a memory error detector
==1333350== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==1333350== Using Valgrind-3.18.1 and LibVEX; rerun with -h for copyright info
==1333350== Command: ./build_tgl_xcc/rimage_ep/build/rimage -o sof-tgl.ri -c rimage/config/tgl.toml -s 1344 -k keys/otc_private_key_3k.pem -i 3 -f 0.0.0 -b 0 -e build_tgl_xcc/src/arch/xtensa/bootloader-tgl build_tgl_xcc/sof
==1333350==
Module Reading build_tgl_xcc/src/arch/xtensa/bootloader-tgl
info: ignore .bss section for bootloader module
Found 18 sections, listing valid sections......
No  LMA         VMA         End         Size      Type  Name
0   0x00000000  0x00000000  0x00000000  0x0
1   0xb0038000  0xb0038000  0xb00380fe  0xfe      TEXT  .boot_entry.text
2   0xb0038120  0xb0038120  0xb0038140  0x20      TEXT  .boot_entry.literal
3   0xb0038150  0xb0038150  0xb0038cb9  0xb69     TEXT  .text
4   0xb0039000  0xb0039000  0xb0039010  0x10      DATA  .rodata
module: input size 3223 (0xc97) bytes 4 sections
module: text 3207 (0xc87) bytes data 16 (0x10) bytes bss 0 (0x0) bytes
Module Reading build_tgl_xcc/sof
Found 43 sections, listing valid sections......
No  LMA         VMA         End         Size      Type  Name
2   0xbe00c000  0xbe00c000  0xbe02c000  0x20000   HEAP  .buffer_hp_heap
3   0xbe004000  0xbe004000  0xbe006000  0x2000    HEAP  .wnd0
4   0xbe006000  0xbe006000  0xbe008000  0x2000    HEAP  .wnd1
5   0xbe008000  0xbe008000  0xbe00a000  0x2000    HEAP  .wnd2
6   0xbe00a000  0xbe00a000  0xbe00c000  0x2000    HEAP  .wnd3
7   0xbe02c000  0xbe02c000  0xbe02c16a  0x16a     TEXT  .WindowVectors.text
8   0xbe02c180  0xbe02c180  0xbe02c186  0x6       TEXT  .Level2InterruptVector.text
9   0xbe02c240  0xbe02c240  0xbe02c246  0x6       TEXT  .Level5InterruptVector.text
10  0xbe02c280  0xbe02c280  0xbe02c286  0x6       TEXT  .DebugExceptionVector.text
11  0xbe02c2c0  0xbe02c2c0  0xbe02c2c3  0x3       TEXT  .NMIExceptionVector.text
12  0xbe02c300  0xbe02c300  0xbe02c306  0x6       TEXT  .KernelExceptionVector.text
13  0xbe02c338  0xbe02c338  0xbe02c33c  0x4       TEXT  .UserExceptionVector.literal
14  0xbe02c340  0xbe02c340  0xbe02c357  0x17      TEXT  .UserExceptionVector.text
15  0xbe02c3c0  0xbe02c3c0  0xbe02c3c6  0x6       TEXT  .DoubleExceptionVector.text
16  0xbe02c400  0xbe02c400  0xbe059efc  0x2dafc   TEXT  .text
18  0xbe059f00  0xbe800000  0xbe800120  0x120     TEXT  .AlternateResetVector.text
19  0xbe05a020  0xbe800180  0xbe800190  0x10      TEXT  .AlternateResetL2IntVector.text
20  0xbe05a030  0xbe800190  0xbe800270  0xe0      TEXT  .LpsramCode.text
21  0xbe05b000  0xbe05b000  0xbe077ecc  0x1cecc   DATA  .rodata
22  0xbe077ecc  0xbe077ecc  0xbe077f08  0x3c      DATA  .module_init
23  0xbe077f40  0xbe077f40  0xbe0a2f40  0x2b000   DATA  .shared_data
24  0xbe0a2f40  0xbe0a2f40  0xbe0a3ef8  0xfb8     DATA  .data
25  0xbe0a3ef8  0xbe0a3ef8  0xbe0a3f64  0x6c      DATA  .fw_ready
26  0xbe0a3f68  0xbe0a3f68  0xbe0a3f90  0x28      DATA  .AltBootManifest
27  0xbe0a4000  0xbe0a4000  0xbe2e0000  0x23c000  BSS   .bss
module: input size 486924 (0x76e0c) bytes 27 sections
module: text 188088 (0x2deb8) bytes data 298836 (0x48f54) bytes bss 2506752 (0x264000) bytes
Module Write: build_tgl_xcc/src/arch/xtensa/bootloader-tgl
Manifest module metadata section at index 14
Entry point 0xb0038000
Totals  Start       End         Size
TEXT    0xb0038000  0xb0038cb9  0xcb9
DATA    0xb0039000  0xb0039010  0x10
BSS     0x00000000  0x00000000  0x0
No  Address     Size   File    Type
1   0xb0038000  0xfe   0x8000  TEXT
2   0xb0038120  0x20   0x8120  TEXT
3   0xb0038150  0xb69  0x8150  TEXT
4   0xb0039000  0x10   0x9000  DATA
Total pages text 1 data 1 bss 0
module file limit: 0xa000
Module Write: build_tgl_xcc/sof
warning: can't find section named '.module' in module build_tgl_xcc/sof
Firmware completing manifest v2.5
meta: completing ADSP manifest
meta: limit is 0x1ac0
rimage: /home/chao/work/sof/rimage/src/hash.c:103: ri_sha384: Assertion `(uint64_t)size + offset <= image->adsp->image_size' failed.
==1333350==
==1333350== Process terminating with default action of signal 6 (SIGABRT)
==1333350==    at 0x4D5CA7C: __pthread_kill_implementation (pthread_kill.c:44)
==1333350==    by 0x4D5CA7C: __pthread_kill_internal (pthread_kill.c:78)
==1333350==    by 0x4D5CA7C: pthread_kill@@GLIBC_2.34 (pthread_kill.c:89)
==1333350==    by 0x4D08475: raise (raise.c:26)
==1333350==    by 0x4CEE7F2: abort (abort.c:79)
==1333350==    by 0x4CEE71A: __assert_fail_base.cold (assert.c:92)
==1333350==    by 0x4CFFE95: __assert_fail (assert.c:101)
==1333350==    by 0x10CB13: ri_sha384 (hash.c:103)
==1333350==    by 0x11158A: man_write_fw_meu_v2_5 (manifest.c:1214)
==1333350==    by 0x114FB2: main (rimage.c:208)
==1333350==
==1333350== HEAP SUMMARY:
==1333350==     in use at exit: 3,362,975 bytes in 3,035 blocks
==1333350==   total heap usage: 5,071 allocs, 2,036 frees, 3,476,232 bytes allocated
==1333350==
==1333350== LEAK SUMMARY:
==1333350==    definitely lost: 0 bytes in 0 blocks
==1333350==    indirectly lost: 0 bytes in 0 blocks
==1333350==      possibly lost: 0 bytes in 0 blocks
==1333350==    still reachable: 3,362,975 bytes in 3,035 blocks
==1333350==         suppressed: 0 bytes in 0 blocks
==1333350== Rerun with --leak-check=full to see details of leaked memory
==1333350==
==1333350== For lists of detected and suppressed errors, rerun with: -s
==1333350== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
[1]    1333350 IOT instruction (core dumped)  valgrind ./build_tgl_xcc/rimage_ep/build/rimage -o sof-tgl.ri -c  -s 1344 -k
```
marc-hb commented 2 years ago

Thanks @aiChaoSONG , does this assert fail only when running with valgrind? (and only with newer OpenSSL?)

sof/rimage/src/hash.c:103: ri_sha384: Assertion `(uint64_t)size + offset <= image->adsp->image_size' failed.

==1333350==    by 0x10CB13: ri_sha384 (hash.c:103)
==1333350==    by 0x11158A: man_write_fw_meu_v2_5 (manifest.c:1214)
==1333350==    by 0x114FB2: main (rimage.c:208)
marc-hb commented 2 years ago

./tools/sof_ri_info/sof_ri_info.py build_tgl_xcc/sof.ri

Below is the diff -wb -U10 between my output and @aiChaoSONG 's. The signature randomness is expected, the other differences most likely not.

I ran the same command on @fredoh9's build and only his signature differs from mine.

@plbossart can you please run this sof_ri_info.py command and compare?

If there is some memory corruption with rimage+openssl3 then all bets are off, anything can happen.

--- mine    2022-06-15 10:44:00.308980601 -0700
+++ chao    2022-06-15 10:44:16.514391243 -0700
@@ -1,47 +1,47 @@
-SOF Binary build_tgl_xcc/sof.ri size 0x81300
+SOF Binary build_tgl_xcc/sof.ri size 0x82300

   Extended Manifest ver 1.0.0 length 768

   CSE Manifest ver 0x102 checksum 0x0 partition name ADSP

     ADSP.man (CSS Manifest) type 0x4 file offset 0x35c hdr_len 900 ver 0x21000 date 2022/06/15
       Rsvd0 0x0
       Modulus size (dwords) 96
         6b 75 ed 58 20 08 85 95 ... 55 d1 7d c6 0d 79 12 a9 (Community 3k key)
       Exponent size (dwords) 1
         01 00 01 00
       Signature (file offset 0x560, length 0x180)
-        8c 42 36 21 c1 5c 5a e6 ... 02 83 81 7f 4c af 01 5a
+        bc 82 30 c5 09 45 2d 3a ... 6e 20 78 e8 7e 30 1a 5e

       Plat Fw Auth Extension type 0xf file offset 0x6e0 length 0x78
        name ADSP vcn 0x0 bitmap 00 00 00 00 08 00 00 00 00 00 00 00 00 00 00 00 svn 0x0

       Other Extension type 0x16 file offset 0x758 length 0x68

     cavs0015.met (ADSP Metadata File Extension) type 0x11 file offset 0x7c0 length 0x70
-     ver 0x0 base offset 0xfe15179d limit offset 0x5b667f21
+     ver 0x0 base offset 0x30f7628d limit offset 0xf5f479cf
       IMR type 0x3
       Attributes
-        9c 06 84 92 54 50 c5 49 00 20 00 00 c0 2a 08 00
+        d6 6c 05 2d d1 76 5c d0 00 20 00 00 c0 3a 08 00

     cavs0015

   cavs0015 (ADSP Manifest) file offset 0x2300 name ADSPFW build ver 2.0.0.1 feature mask 0xffff image flags 0x0
     HW buffers base address 0x0 length 0x0
     Load offset 0x30000

     BRNGUP    2b79e4f3-4675-f649-89df-3bc194a91aeb
       entry point 0xb0038000 type 0x21 ( loadable LL )
       cfg offset 0 count 0 affinity 0x3 instance max count 1 stack size 0x1
       .text   0xb0038000 file offset 0x8000 flags 0x1001f ( contents alloc load readonly code type=0 pages=1 )
       .rodata 0xb0039000 file offset 0x9000 flags 0x1012f ( contents alloc load readonly data type=1 pages=1 )
       .bss    0x0 file offset 0x0 flags 0xf00 ( type=15 pages=0 )

     BASEFW    0e398c32-5ade-ba4b-93b1-c50432280ee4
       entry point 0xbe02c400 type 0x21 ( loadable LL )
       cfg offset 0 count 0 affinity 0x3 instance max count 1 stack size 0x1
-      .text   0xbe02c000 file offset 0xa000 flags 0x2e001f ( contents alloc load readonly code type=0 pages=46 )
-      .rodata 0xbe05a000 file offset 0x38000 flags 0x49012f ( contents alloc load readonly data type=1 pages=73 )
-      .bss    0xbe0a3000 file offset 0x0 flags 0x23d0202 ( alloc type=2 pages=573 )
+      .text   0xbe02c000 file offset 0xa000 flags 0x2f001f ( contents alloc load readonly code type=0 pages=47 )
+      .rodata 0xbe05b000 file offset 0x39000 flags 0x49012f ( contents alloc load readonly data type=1 pages=73 )
+      .bss    0xbe0a4000 file offset 0x0 flags 0x23c0202 ( alloc type=2 pages=572 )

 Memory layout undefined
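
A note on reading the segment lines in this diff: the two images differ in size by 0x82300 - 0x81300 = 0x1000 bytes, and the BASEFW .text page count goes from 46 to 47 while .bss goes from 573 to 572. The printed flags words appear to encode these counts. The decoder below is inferred purely from the values shown in this output (it is an assumption, not taken from the sof_ri_info.py source), and `decode_segment_flags` is a hypothetical name.

```python
# Bit meanings inferred from the printed output, e.g. 0x1f -> "contents alloc load readonly code".
FLAG_BITS = [
    (0x01, "contents"),
    (0x02, "alloc"),
    (0x04, "load"),
    (0x08, "readonly"),
    (0x10, "code"),
    (0x20, "data"),
]


def decode_segment_flags(flags: int) -> dict:
    """Split a segment flags word into attribute names, type and page count."""
    return {
        "attrs": [name for bit, name in FLAG_BITS if flags & bit],
        "type": (flags >> 8) & 0xF,   # bits 8..11
        "pages": flags >> 16,         # upper bits
    }


for flags in (0x2E001F, 0x2F001F, 0x23D0202, 0x23C0202):
    print(hex(flags), decode_segment_flags(flags))
```

Under this reading, the only structural change between the two builds is one page moving from .bss to .text, which is consistent with the .rodata file offset shifting from 0x38000 to 0x39000.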
marc-hb commented 2 years ago

@plbossart please also try this


--- a/src/arch/xtensa/CMakeLists.txt
+++ b/src/arch/xtensa/CMakeLists.txt
@@ -127,7 +127,7 @@ separate_arguments(EXTRA_CFLAGS_AS_LIST  NATIVE_COMMAND  ${EXTRA_CFLAGS})
 # de-duplication "feature"
 target_compile_options(sof_options INTERFACE
        $<$<COMPILE_LANGUAGE:C>:
-               -${optimization_flag} -g
+               -${optimization_flag} -g0
                -Wall -Werror
                -Wl,-EL
                -Wmissing-prototypes
@@ -449,11 +449,11 @@ if(MEU_PATH OR DEFINED MEU_NO_SIGN) # Don't sign with rimage

        # Passing -s ${MEU_OFFSET} disables rimage signing and produces
        # one .uns file and one .met file instead of a .ri file.
        add_custom_target(
                run_rimage
-               COMMAND ${PROJECT_BINARY_DIR}/rimage_ep/build/rimage
+               COMMAND valgrind ${PROJECT_BINARY_DIR}/rimage_ep/build/rimage
                        -o sof-${fw_name}.ri
                        -c "${PROJECT_SOURCE_DIR}/rimage/config/${fw_name}.toml"
                        -s ${MEU_OFFSET}
                        -k ${RIMAGE_PRIVATE_KEY}
                        -i ${RIMAGE_IMR_TYPE}
@@ -491,11 +491,11 @@
                )
        endif()
 else() # sign with rimage
        add_custom_target(
                run_rimage
-               COMMAND ${PROJECT_BINARY_DIR}/rimage_ep/build/rimage
+               COMMAND valgrind ${PROJECT_BINARY_DIR}/rimage_ep/build/rimage
                        -o sof-${fw_name}.ri
                        -c "${PROJECT_SOURCE_DIR}/rimage/config/${fw_name}.toml"
                        -k ${RIMAGE_PRIVATE_KEY}
                        -i ${RIMAGE_IMR_TYPE}
                        -f ${SOF_MAJOR}.${SOF_MINOR}.${SOF_MICRO}

EDIT: not using valgrind is a waste of time.

Bonus feature: adding valgrind is the easiest way to print the rimage command line whereas a verbose build is crazy verbose.

juimonen commented 2 years ago

@aiChaoSONG your last parameter to rimage should be: build_tgl_xcc/src/arch/xtensa/build_tgl_xcc/sof-tgl

it is now wrong and you are bailing out even before signing...

with last parameter fixed I can sign with openssl3 and valgrind looks clean (I think @marc-hb also found that).

so we really need to look at the signing differences in the image.

plbossart commented 2 years ago

@plbossart can you please run this sof_ri_info.py command and compare?

@marc-hb I am afraid I have another signature, not the same as @aiChaoSONG

sof_ri_info

````
SOF Binary build_tgl_xcc/sof.ri size 0x82300

  Extended Manifest ver 1.0.0 length 768

  CSE Manifest ver 0x102 checksum 0x0 partition name ADSP

    ADSP.man (CSS Manifest) type 0x4 file offset 0x35c hdr_len 900 ver 0x21000 date 2022/06/15
      Rsvd0 0x0
      Modulus size (dwords) 96
        6b 75 ed 58 20 08 85 95 ... 55 d1 7d c6 0d 79 12 a9 (Community 3k key)
      Exponent size (dwords) 1
        01 00 01 00
      Signature (file offset 0x560, length 0x180)
        bc 18 72 5e 87 53 70 06 ... 2c 6c 20 cd c9 43 ab 72

      Plat Fw Auth Extension type 0xf file offset 0x6e0 length 0x78
       name ADSP vcn 0x0 bitmap 00 00 00 00 08 00 00 00 00 00 00 00 00 00 00 00 svn 0x0

      Other Extension type 0x16 file offset 0x758 length 0x68

    cavs0015.met (ADSP Metadata File Extension) type 0x11 file offset 0x7c0 length 0x70
     ver 0x0 base offset 0x97e32029 limit offset 0xc60fedb4
      IMR type 0x3
      Attributes
        5a 9c b4 4a 6b e7 62 03 00 20 00 00 c0 3a 08 00

    cavs0015

  cavs0015 (ADSP Manifest) file offset 0x2300 name ADSPFW build ver 2.0.0.1 feature mask 0xffff image flags 0x0
    HW buffers base address 0x0 length 0x0
    Load offset 0x30000

    BRNGUP    2b79e4f3-4675-f649-89df-3bc194a91aeb
      entry point 0xb0038000 type 0x21 ( loadable LL )
      cfg offset 0 count 0 affinity 0x3 instance max count 1 stack size 0x1
      .text   0xb0038000 file offset 0x8000 flags 0x1001f ( contents alloc load readonly code type=0 pages=1 )
      .rodata 0xb0039000 file offset 0x9000 flags 0x1012f ( contents alloc load readonly data type=1 pages=1 )
      .bss    0x0 file offset 0x0 flags 0xf00 ( type=15 pages=0 )

    BASEFW    0e398c32-5ade-ba4b-93b1-c50432280ee4
      entry point 0xbe02c400 type 0x21 ( loadable LL )
      cfg offset 0 count 0 affinity 0x3 instance max count 1 stack size 0x1
      .text   0xbe02c000 file offset 0xa000 flags 0x2f001f ( contents alloc load readonly code type=0 pages=47 )
      .rodata 0xbe05b000 file offset 0x39000 flags 0x49012f ( contents alloc load readonly data type=1 pages=73 )
      .bss    0xbe0a4000 file offset 0x0 flags 0x23c0202 ( alloc type=2 pages=572 )
````

plb.log

--- chao.log    2022-06-15 13:51:02.061698055 -0500
+++ plb.log 2022-06-15 13:51:41.166699970 -0500
@@ -4,32 +4,32 @@

   CSE Manifest ver 0x102 checksum 0x0 partition name ADSP

     ADSP.man (CSS Manifest) type 0x4 file offset 0x35c hdr_len 900 ver 0x21000 date 2022/06/15
       Rsvd0 0x0
       Modulus size (dwords) 96
         6b 75 ed 58 20 08 85 95 ... 55 d1 7d c6 0d 79 12 a9 (Community 3k key)
       Exponent size (dwords) 1
         01 00 01 00
       Signature (file offset 0x560, length 0x180)
-        bc 82 30 c5 09 45 2d 3a ... 6e 20 78 e8 7e 30 1a 5e
+        bc 18 72 5e 87 53 70 06 ... 2c 6c 20 cd c9 43 ab 72

       Plat Fw Auth Extension type 0xf file offset 0x6e0 length 0x78
        name ADSP vcn 0x0 bitmap 00 00 00 00 08 00 00 00 00 00 00 00 00 00 00 00 svn 0x0

       Other Extension type 0x16 file offset 0x758 length 0x68

     cavs0015.met (ADSP Metadata File Extension) type 0x11 file offset 0x7c0 length 0x70
-     ver 0x0 base offset 0x30f7628d limit offset 0xf5f479cf
+     ver 0x0 base offset 0x97e32029 limit offset 0xc60fedb4
       IMR type 0x3
       Attributes
-        d6 6c 05 2d d1 76 5c d0 00 20 00 00 c0 3a 08 00
+        5a 9c b4 4a 6b e7 62 03 00 20 00 00 c0 3a 08 00

     cavs0015

   cavs0015 (ADSP Manifest) file offset 0x2300 name ADSPFW build ver 2.0.0.1 feature mask 0xffff image flags 0x0
     HW buffers base address 0x0 length 0x0
     Load offset 0x30000

     BRNGUP    2b79e4f3-4675-f649-89df-3bc194a91aeb
       entry point 0xb0038000 type 0x21 ( loadable LL )
       cfg offset 0 count 0 affinity 0x3 instance max count 1 stack size 0x1
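
For anyone who wants to repeat this comparison on the raw images rather than on `sof_ri_info` text, here is a rough Python sketch. The signature offset and length are copied from the dump above and are only valid for this particular TGL image; the helper names are made up for illustration and are not part of rimage or `sof_ri_info`.

```python
from pathlib import Path

# Offsets copied from the sof_ri_info dump in this thread. They are specific
# to this TGL build and are an assumption for any other image.
SIG_OFFSET = 0x560
SIG_LENGTH = 0x180


def read_region(path, offset=SIG_OFFSET, length=SIG_LENGTH):
    """Return `length` bytes starting at `offset` of the file at `path`."""
    data = Path(path).read_bytes()
    if len(data) < offset + length:
        raise ValueError(f"{path}: file too short for region at {offset:#x}")
    return data[offset:offset + length]


def differing_ranges(a, b):
    """Return half-open (start, end) byte ranges where `a` and `b` differ."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    ranges, start = [], None
    for i, (x, y) in enumerate(zip(a, b)):
        if x != y and start is None:
            start = i                      # a run of differences begins
        elif x == y and start is not None:
            ranges.append((start, i))      # the run ended at the previous byte
            start = None
    if start is not None:
        ranges.append((start, len(a)))     # run extends to the end
    return ranges
```

Calling `differing_ranges()` on the full contents of two `sof.ri` files of equal size should list the same regions the textual diff shows (signature, metadata extension), plus anything `sof_ri_info` does not print.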
plbossart commented 2 years ago

@marc-hb No real issue detected with valgrind + -g0, and no luck - same boot failure.

log.txt

plbossart commented 2 years ago

FWIW on my device I have this:

openssl version
OpenSSL 3.0.2 15 Mar 2022 (Library: OpenSSL 3.0.2 15 Mar 2022)

Edit:

ldd build_tgl_xcc/rimage_ep/build/rimage
    linux-vdso.so.1 (0x00007ffc7ddf5000)
    libcrypto.so.3 => /lib/x86_64-linux-gnu/libcrypto.so.3 (0x00007fbe20223000)
    libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fbe1fffb000)
    /lib64/ld-linux-x86-64.so.2 (0x00007fbe206a0000)
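
To check quickly which libcrypto a given rimage binary is linked against, something like the Python sketch below could be used. It only parses `ldd` output for the libcrypto soname (`libcrypto.so.3` for OpenSSL 3.x, `libcrypto.so.1.1` for OpenSSL 1.1.x). The function names are hypothetical, and `rimage_libcrypto()` assumes `ldd` is available on the build machine.

```python
import re
import subprocess


def libcrypto_soname_version(ldd_output):
    """Return the libcrypto soname version found in `ldd` output, or None."""
    match = re.search(r"libcrypto\.so\.([0-9][0-9.]*)", ldd_output)
    return match.group(1) if match else None


def rimage_libcrypto(rimage_path):
    """Run `ldd` on the binary and report the libcrypto it links against."""
    result = subprocess.run(
        ["ldd", rimage_path], capture_output=True, text=True, check=True
    )
    return libcrypto_soname_version(result.stdout)
```

On the output pasted above this returns `"3"`, i.e. the locally built rimage uses the OpenSSL 3 libcrypto.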

marc-hb commented 2 years ago

Thanks, can you make sure you have the same reproducible.ri sha256sum as everyone else? (after disabling core.autocrlf). Copying them from my comment 13 hours ago:

XCC xtensa-build-all -d tgl

sha256sum reproducible.ri
25f6c86d52dff278c60faf378639277a2d7e22ebb822d042bbd3ae428b4656bc reproducible.ri

without -d

sha256sum reproducible.ri
df01b254e818a7fed8c5c39b1cf37a6e042c09795dcd3d38e39bfc6728623a26 reproducible.ri

If they're the same then we are 100% sure this is a pure signing or manifest issue.
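
If it helps, the checksum comparison can be scripted. This is a minimal Python sketch, not an existing SOF script: the two hashes are the ones quoted above, the labels and function names are made up, and the path to `reproducible.ri` is whatever your build directory uses.

```python
import hashlib

# Hashes quoted in this thread for the XCC TGL build.
EXPECTED = {
    "debug (-d)": "25f6c86d52dff278c60faf378639277a2d7e22ebb822d042bbd3ae428b4656bc",
    "release": "df01b254e818a7fed8c5c39b1cf37a6e042c09795dcd3d38e39bfc6728623a26",
}


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of the file at `path`."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in chunks so large images do not have to fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matching_build(path):
    """Return the label of the expected hash that `path` matches, or None."""
    actual = sha256_of(path)
    for label, expected in EXPECTED.items():
        if actual == expected:
            return label
    return None
```

`matching_build("reproducible.ri")` returning `None` would mean the unsigned image itself already differs, so the problem would not be limited to signing.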

marc-hb commented 2 years ago

As a workaround please try the Docker build (without XCC)

docker pull thesofproject/sof # get coffee
./scripts/docker-run.sh openssl version
 # OpenSSL 1.1.1f  31 Mar 2020

./scripts/docker-run.sh ./scripts/xtensa-build-all.sh tgl
plbossart commented 2 years ago

yep, same reproducible -> pure signature issue.

with -d
sha256sum reproducible.ri
25f6c86d52dff278c60faf378639277a2d7e22ebb822d042bbd3ae428b4656bc reproducible.ri

without -d
sha256sum reproducible.ri
df01b254e818a7fed8c5c39b1cf37a6e042c09795dcd3d38e39bfc6728623a26 reproducible.ri
plbossart commented 2 years ago
> docker pull thesofproject/sof # get coffee

instant coffee with Google Fiber :-)

> ./scripts/docker-run.sh openssl version
>  # OpenSSL 1.1.1f  31 Mar 2020
> 
> ./scripts/docker-run.sh ./scripts/xtensa-build-all.sh tgl

works fine, the firmware boots on Up Extreme11.

aiChaoSONG commented 2 years ago

> does this assert fail only when running with valgrind?

@marc-hb I think I signed the wrong binary (build_tgl_xcc/sof), which is why I hit the assertion failure. If I use the correct binary (build_tgl_xcc/src/arch/xtensa/sof-tgl), there is no failure at all. Thanks to Jaska for correcting me.