xiph / opus

Modern audio compression for the internet.
https://opus-codec.org/

Compiling libopus 1.5.1 with Linux fails on ARM Cortex-A53 and Cortex-A55 #323

Open christiancelona opened 4 months ago

christiancelona commented 4 months ago

I ran into this problem with GCC: in function silk_NSQ_del_dec_neon (silk/arm/NSQ_del_dec_neon_intr), the compiler reports a loop that invokes undefined behavior [-Waggressive-loop-optimizations]. It occurred on a Raspberry Pi Zero (Cortex-A53) and on a Radxa Rock 3 Model C (Cortex-A55).

I have not found any problems on RV64 (StarFive VisionFive 2), or on recent Macs and PCs.

My compliments on the results achieved with ISOBMFF encapsulation, which seems to work everywhere. In my opinion, Opus is now the best choice for distributing spoken content online at low bitrates. Thank you!

jmvalin commented 4 months ago

Do you have any information that could help reproduce the problem?

christiancelona commented 4 months ago

Radxa Rock 3 Model C (Cortex-A55, Rockchip RK3566)

DietPi v9.2.1 (beta) Debian GNU/Linux trixie/sid

CC silk/arm/NSQ_del_dec_neon_intr.lo
silk/arm/NSQ_del_dec_neon_intr.c: In function ‘silk_NSQ_del_dec_neon’:
silk/arm/NSQ_del_dec_neon_intr.c:424:55: warning: iteration 80 invokes undefined behavior [-Waggressive-loop-optimizations]
  424 | NSQ->sLPC_Q14[ i ] = psDelDec->sLPC_Q14[ i ][ Winner_ind ];
      |                      ~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~
silk/arm/NSQ_del_dec_neon_intr.c:423:18: note: within this loop
  423 | for( ; i < NSQ_LPC_BUF_LENGTH; i++ ) {

christiancelona commented 4 months ago

Raspberry Pi Zero (Cortex-A53, Broadcom BCM2835)

DietPi v9.2.1 (beta) Debian GNU/Linux trixie/sid

CC silk/enc_API.lo
silk/enc_API.c: In function ‘silk_Encode’:
silk/enc_API.c:398:13: warning: ‘silk_HP_variable_cutoff’ accessing 10064 bytes in a region of size 8 [-Wstringop-overflow=]
  398 | silk_HP_variable_cutoff( psEnc->state_Fxx );
      | ^~~~~~~~~~~
silk/enc_API.c:398:13: note: referencing argument 1 of type ‘silk_encoder_state_FLP[0]’
In file included from silk/enc_API.c:41:
./silk/float/main_FLP.h:53:6: note: in a call to function ‘silk_HP_variable_cutoff’
   53 | void silk_HP_variable_cutoff(
      | ^~~~~~~

...

CC silk/arm/NSQ_del_dec_neon_intr.lo
silk/arm/NSQ_del_dec_neon_intr.c: In function ‘silk_NSQ_del_dec_neon’:
silk/arm/NSQ_del_dec_neon_intr.c:424:55: warning: iteration 80 invokes undefined behavior [-Waggressive-loop-optimizations]
  424 | NSQ->sLPC_Q14[ i ] = psDelDec->sLPC_Q14[ i ][ Winner_ind ];
      | ~~~~~^~~~
silk/arm/NSQ_del_dec_neon_intr.c:423:18: note: within this loop
  423 | for( ; i < NSQ_LPC_BUF_LENGTH; i++ ) {
In file included from ./silk/main.h:31,
                 from silk/arm/NSQ_del_dec_neon_intr.c:36:
In function ‘silk_noise_shape_quantizer_del_dec_neon’,
    inlined from ‘silk_NSQ_del_dec_neon’ at silk/arm/NSQ_del_dec_neon_intr.c:389:13:
silk/arm/NSQ_del_dec_neon_intr.c:605:48: warning: iteration 2147483647 invokes undefined behavior [-Waggressive-loop-optimizations]
  605 | AR_shp_Q28[i] = silk_LSHIFT32( AR_shp_Q13[i], 15 );
      | ^
./silk/SigProc_FIX.h:503:73: note: in definition of macro ‘silk_LSHIFT32’
  503 | #define silk_LSHIFT32(a, shift) ((opus_int32)((opus_uint32)(a)<<(shift))) /* shift >= 0, shift < 32 */
      | ^
silk/arm/NSQ_del_dec_neon_intr.c:604:14: note: within this loop
  604 | for( ; i < MAX_SHAPE_LPC_ORDER; i++ ) {

christiancelona commented 4 months ago

Raspberry Pi Zero 2 W (Cortex-A53, Broadcom BCM2710A1, aka RP3A0)

Raspberry Pi OS with desktop Release date: March 15th 2024, System: 64-bit, Kernel version: 6.6, Debian version: 12 (bookworm)

CC silk/arm/NSQ_del_dec_neon_intr.lo
silk/arm/NSQ_del_dec_neon_intr.c: In function ‘silk_NSQ_del_dec_neon’:
silk/arm/NSQ_del_dec_neon_intr.c:424:55: warning: iteration 80 invokes undefined behavior [-Waggressive-loop-optimizations]
  424 | NSQ->sLPC_Q14[ i ] = psDelDec->sLPC_Q14[ i ][ Winner_ind ];
      | ~~~~~^~~~
silk/arm/NSQ_del_dec_neon_intr.c:423:18: note: within this loop
  423 | for( ; i < NSQ_LPC_BUF_LENGTH; i++ ) {

lscpu

Architecture: aarch64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 4
On-line CPU(s) list: 0-3
Vendor ID: ARM
Model name: Cortex-A53
Model: 4
Thread(s) per core: 1
Core(s) per cluster: 4
Socket(s): -
Cluster(s): 1
Stepping: r0p4
CPU(s) scaling MHz: 60%
CPU max MHz: 1000,0000
CPU min MHz: 600,0000
BogoMIPS: 38,40
Flags: fp asimd evtstrm crc32 cpuid
Caches (sum of all):
  L1d: 128 KiB (4 instances)
  L1i: 128 KiB (4 instances)
  L2: 512 KiB (1 instance)
Vulnerabilities:
  Gather data sampling: Not affected
  Itlb multihit: Not affected
  L1tf: Not affected
  Mds: Not affected
  Meltdown: Not affected
  Mmio stale data: Not affected
  Retbleed: Not affected
  Spec rstack overflow: Not affected
  Spec store bypass: Not affected
  Spectre v1: Mitigation; __user pointer sanitization
  Spectre v2: Not affected
  Srbds: Not affected
  Tsx async abort: Not affected

jmvalin commented 4 months ago

Since Opus is known to work on many ARM environments, the issue is most likely build-related. What build system (and exact command) are you using? Also, make sure you're using the latest main (and not 1.5.1), since there have been many build fixes there already.

christiancelona commented 4 months ago

wget https://downloads.xiph.org/releases/opus/opus-1.5.1.tar.gz
tar -xf opus-1.5.1.tar.gz
cd opus-1.5.1
sudo apt install autoconf automake libtool gcc make
./configure
sudo make install

I will test the latest build on a C906 SoC.

jmvalin commented 4 months ago

Looking more closely, these all appear to be warnings, not errors. Does the build actually complete? If not, what are the actual errors?

christiancelona commented 4 months ago

No. After make install, the command opusenc -V reports that I'm still using the old library on these SoCs, and the newest version on the others.

jmvalin commented 4 months ago

I mean what's the actual error then?

christiancelona commented 4 months ago

Raspberry Pi Zero 2 W (Cortex-A53, Broadcom BCM2710A1, aka RP3A0)

Raspberry Pi OS with desktop Release date: March 15th 2024, System: 64-bit, Kernel version: 6.6, Debian version: 12 (bookworm)

wget https://github.com/xiph/opus/archive/refs/heads/main.zip
unzip main.zip
cd opus-main
./autogen.sh
./configure

Compiler support:

  C99 var arrays: ................ yes
  C99 lrintf: .................... yes
  Use alloca: .................... no (using var arrays)

General configuration:

  Floating point support: ........ yes
  Fast float approximations: ..... yes
  Fixed point debugging: ......... no
  Inline Assembly Optimizations: . No inline ASM for your platform, please send patches
  External Assembly Optimizations: 
  Intrinsics Optimizations: ...... ARM (NEON) (NEON Aarch64) (DOTPROD)
  Run-time CPU detection: ........ ARM (DOTPROD Intrinsics)
  Custom modes: .................. no
  Assertion checking: ............ no
  Hardening: ..................... yes
  Fuzzing: ....................... no
  Check ASM: ..................... no

  API documentation: ............. yes
  Extra programs: ................ yes

sudo make install

...

CC silk/arm/NSQ_del_dec_neon_intr.lo
silk/arm/NSQ_del_dec_neon_intr.c: In function ‘silk_NSQ_del_dec_neon’:
silk/arm/NSQ_del_dec_neon_intr.c:424:55: warning: iteration 80 invokes undefined behavior [-Waggressive-loop-optimizations]
  424 | NSQ->sLPC_Q14[ i ] = psDelDec->sLPC_Q14[ i ][ Winner_ind ];
      | ~~~~~^~~~
silk/arm/NSQ_del_dec_neon_intr.c:423:18: note: within this loop
  423 | for( ; i < NSQ_LPC_BUF_LENGTH; i++ ) {

...

opusenc -V
opusenc opus-tools 0.2 (using libopus 1.3.1)
Copyright (C) 2008-2018 Xiph.Org Foundation

jmvalin commented 4 months ago

Look, the question is simple: does the make succeed, or does it end with an error? I don't know what you're trying to show with opusenc. opusenc comes from a different package, so I have no idea what you expect here. Of course it won't change.
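The distinction between warnings and a failed build can be checked from the exit status of make. The following is a minimal sketch, not something posted in the thread; `run_build` and the log file name are hypothetical names chosen for illustration.

```shell
#!/bin/sh
# Run a build command, save its full output to a log file, and report
# whether it succeeded based on its exit status (0 means success).
# `run_build` is a hypothetical helper name used only in this sketch.
run_build() {
    log=$1
    shift
    if "$@" >"$log" 2>&1; then
        echo "build succeeded"
    else
        status=$?
        echo "build failed with exit status $status"
        # GCC marks real errors with "error:"; warnings use "warning:".
        grep -n 'error:' "$log"
        return "$status"
    fi
}
```

Calling `run_build build.log make` before `sudo make install` would show whether the compile step itself fails; warnings alone leave the exit status at 0.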

christiancelona commented 4 months ago

MangoPi MQ-Pro (RISC-V 64, Allwinner D1, XuanTie C906 core RV64IMAFDCVU)

DietPi v9.2.1 (beta) Debian GNU/Linux trixie/sid riscv64, Host: Allwinner D1 Nezha, Kernel: 6.1.0-rc3-d1

wget https://github.com/xiph/opus/archive/refs/heads/main.zip
unzip main.zip
cd opus-main
./autogen.sh
./configure

Compiler support:

  C99 var arrays: ................ yes
  C99 lrintf: .................... yes
  Use alloca: .................... no (using var arrays)

General configuration:

  Floating point support: ........ yes
  Fast float approximations: ..... no
  Fixed point debugging: ......... no
  Inline Assembly Optimizations: . No inline ASM for your platform, please send patches
  External Assembly Optimizations: 
  Intrinsics Optimizations: ...... no
  Run-time CPU detection: ........ no
  Custom modes: .................. no
  Assertion checking: ............ no
  Hardening: ..................... yes
  Fuzzing: ....................... no
  Check ASM: ..................... no

  API documentation: ............. yes
  Extra programs: ................ yes

sudo make install

...

CC silk/stereo_LR_to_MS.lo
In file included from /usr/include/string.h:535,
                 from silk/SigProc_FIX.h:40,
                 from silk/main.h:31,
                 from silk/stereo_LR_to_MS.c:32:
In function 'memcpy',
    inlined from 'silk_stereo_LR_to_MS' at silk/stereo_LR_to_MS.c:74:5:
/usr/include/riscv64-linux-gnu/bits/string_fortified.h:29:10: warning: '__builtin_memcpy' reading 4 bytes from a region of size 0 [-Wstringop-overread]
   29 | return __builtin___memcpy_chk (__dest, __src, __len,
      | ^~~~~~~~~~~~~
   30 | __glibc_objsize0 (__dest));
      | ~~~~~~
In file included from silk/stereo_LR_to_MS.c:33:
silk/stereo_LR_to_MS.c: In function 'silk_stereo_LR_to_MS':
silk/stereo_LR_to_MS.c:61:12: note: at offset [-4294967296, -4] into source object 'side' of size [0, 9223372036854775807]
   61 | ALLOC( side, frame_length + 2, opus_int16 );
      | ^~~~
./celt/stack_alloc.h:94:37: note: in definition of macro 'ALLOC'
   94 | #define ALLOC(var, size, type) type var[size]
      | ^~~

...

opusenc -V
opusenc opus-tools 0.2 (using libopus 1.4)
Copyright (C) 2008-2018 Xiph.Org Foundation

christiancelona commented 4 months ago

I have edited the previous tests so that they also indicate the Linux distribution used. I use opusenc -V only to show that "make install" did not update libopus.

Now I will do one last test: I will put the same DietPi distribution on the RISC-V hardware, which previously did not give any warnings during compilation, to find out whether it is an operating system problem or not.

I am aware that the last test above was done on a product that does not have adequate software support. On the Raspberry Pi I used their Debian distributions; on ARM or RISC-V hardware the choice of Linux distributions is not as wide as on PCs, for known reasons.
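The `opusenc -V` check used throughout this thread can be scripted so that results from different boards are easier to compare. This is only a sketch: `reported_libopus` is a hypothetical helper name, and it assumes the "(using libopus ...)" wording seen in the outputs posted here.

```shell
#!/bin/sh
# Extract the libopus version that opusenc reports, reading the text of
# `opusenc -V` on standard input. `reported_libopus` is a hypothetical
# helper name chosen for this sketch.
reported_libopus() {
    # Matches "... (using libopus 1.5.1)" and prints "1.5.1".
    sed -n 's/.*(using libopus \([^)]*\)).*/\1/p'
}
```

For example, `opusenc -V | reported_libopus` would print only the version field, which is the part that differs between the boards tested.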

jmvalin commented 4 months ago

First, is your opusenc even dynamically linked. Second, did you check where it's looking for libopus? If it's linking with /usr/lib/libopus.so.x and you're installing to /usr/local/libopus.so.x then it's not going to help. The dynamic linker isn't magic.
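Both checks can be done with `ldd`, which lists the shared libraries a dynamically linked binary resolves at load time. The sketch below filters that output for one library; `resolved_path` is a hypothetical helper name, and the example assumes the glibc `ldd` line format `name => path (address)`.

```shell
#!/bin/sh
# Print the path that the dynamic linker resolves for a given library
# name, reading `ldd` output on standard input.
# `resolved_path` is a hypothetical helper name, not part of any tool.
resolved_path() {
    # ldd lines look like:
    #   libopus.so.0 => /usr/local/lib/libopus.so.0 (0x...)
    awk -v lib="$1" '$1 == lib && $2 == "=>" { print $3 }'
}

# Example (assumes opusenc is on PATH and is dynamically linked):
#   ldd "$(command -v opusenc)" | resolved_path libopus.so.0
```

If the printed path is not under the prefix that `make install` wrote to, the newly built library is not the one being loaded. On glibc systems, `ldconfig -p` lists what is currently in the linker cache.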

christiancelona commented 4 months ago

───────────────────────────────────────────────────── DietPi v9.2.1 : 01:28 - Tue 03/19/24 ─────────────────────────────────────────────────────

uname -a
Linux DietPi 6.1.81 #1 SMP Sat Mar 9 21:40:38 UTC 2024 riscv64 GNU/Linux

wget https://github.com/xiph/opus/archive/refs/heads/main.zip
unzip main.zip
cd opus-main
./autogen.sh
./configure

Compiler support:

  C99 var arrays: ................ yes
  C99 lrintf: .................... yes
  Use alloca: .................... no (using var arrays)

General configuration:

  Floating point support: ........ yes
  Fast float approximations: ..... no
  Fixed point debugging: ......... no
  Inline Assembly Optimizations: . No inline ASM for your platform, please send patches
  External Assembly Optimizations: 
  Intrinsics Optimizations: ...... no
  Run-time CPU detection: ........ no
  Custom modes: .................. no
  Assertion checking: ............ no
  Hardening: ..................... yes
  Fuzzing: ....................... no
  Check ASM: ..................... no

  API documentation: ............. yes
  Extra programs: ................ yes

sudo make install

...

CC silk/stereo_LR_to_MS.lo
In file included from /usr/include/string.h:535,
                 from silk/SigProc_FIX.h:40,
                 from silk/main.h:31,
                 from silk/stereo_LR_to_MS.c:32:
In function 'memcpy',
    inlined from 'silk_stereo_LR_to_MS' at silk/stereo_LR_to_MS.c:74:5:
/usr/include/riscv64-linux-gnu/bits/string_fortified.h:29:10: warning: '__builtin_memcpy' reading 4 bytes from a region of size 0 [-Wstringop-overread]
   29 | return __builtin___memcpy_chk (__dest, __src, __len,
      | ^~~~~~~~~~~~~
   30 | __glibc_objsize0 (__dest));
      | ~~~~~~
In file included from silk/stereo_LR_to_MS.c:33:
silk/stereo_LR_to_MS.c: In function 'silk_stereo_LR_to_MS':
silk/stereo_LR_to_MS.c:61:12: note: at offset [-4294967296, -4] into source object 'side' of size [0, 9223372036854775807]
   61 | ALLOC( side, frame_length + 2, opus_int16 );
      | ^~~~
./celt/stack_alloc.h:94:37: note: in definition of macro 'ALLOC'
   94 | #define ALLOC(var, size, type) type var[size]
      | ^~~

...

opusenc -V
opusenc opus-tools 0.2 (using libopus unknown)
Copyright (C) 2008-2018 Xiph.Org Foundation

christiancelona commented 4 months ago

The previous test produced a warning and now opusenc reports the libopus version as unknown. In the next test I will try to recompile the version indicated on the opus site which is actually different from github and which had not caused any problems. I'll find out when I'm done if I'm stupid enough to have wasted all this time for nothing (or to find out that the problem is the unofficial Debian distribution).

christiancelona commented 4 months ago

───────────────────────────────────────────────────── DietPi v9.2.1 : 21 APT updates available ─────────────────────────────────────────────────────

uname -a
Linux DietPi 6.1.81 #1 SMP Sat Mar 9 21:40:38 UTC 2024 riscv64 GNU/Linux

wget https://downloads.xiph.org/releases/opus/opus-1.5.1.tar.gz
tar -xf opus-1.5.1.tar.gz
cd opus-1.5.1
./configure

Compiler support:

  C99 var arrays: ................ yes
  C99 lrintf: .................... yes
  Use alloca: .................... no (using var arrays)

General configuration:

  Floating point support: ........ yes
  Fast float approximations: ..... no
  Fixed point debugging: ......... no
  Inline Assembly Optimizations: . No inline ASM for your platform, please send patches
  External Assembly Optimizations: 
  Intrinsics Optimizations: ...... no
  Run-time CPU detection: ........ no
  Custom modes: .................. no
  Assertion checking: ............ no
  Hardening: ..................... yes
  Fuzzing: ....................... no
  Check ASM: ..................... no

  API documentation: ............. yes
  Extra programs: ................ yes

sudo make install

...

CC silk/stereo_LR_to_MS.lo
In file included from /usr/include/string.h:535,
                 from silk/SigProc_FIX.h:40,
                 from silk/main.h:31,
                 from silk/stereo_LR_to_MS.c:32:
In function ‘memcpy’,
    inlined from ‘silk_stereo_LR_to_MS’ at silk/stereo_LR_to_MS.c:74:5:
/usr/include/riscv64-linux-gnu/bits/string_fortified.h:29:10: warning: ‘__builtin_memcpy’ reading 4 bytes from a region of size 0 [-Wstringop-overread]
   29 | return __builtin___memcpy_chk (__dest, __src, __len,
      | ^~~~~~~~~~~~~
   30 | __glibc_objsize0 (__dest));
      | ~~~~~~
In file included from silk/stereo_LR_to_MS.c:33:
silk/stereo_LR_to_MS.c: In function ‘silk_stereo_LR_to_MS’:
silk/stereo_LR_to_MS.c:61:12: note: at offset [-4294967296, -4] into source object ‘side’ of size [0, 9223372036854775807]
   61 | ALLOC( side, frame_length + 2, opus_int16 );
      | ^~~~
./celt/stack_alloc.h:94:37: note: in definition of macro ‘ALLOC’
   94 | #define ALLOC(var, size, type) type var[size]
      | ^~~

...

opusenc -V
opusenc opus-tools 0.2 (using libopus 1.5.1)
Copyright (C) 2008-2018 Xiph.Org Foundation

christiancelona commented 4 months ago

Conclusion: I hope I haven't taken up too much of your time. Compilation on RISC-V succeeds despite the warnings, even with DietPi. With the tarball from the Opus site the version is reported correctly; with main from GitHub it is reported as unknown.

The fact remains that on the Raspberry Pi and other ARM SBCs, the library in use is still the previous one even after "make install".

Now a true masochist would install Debian as provided by StarFive, but the point is that on RISC-V it works even with DietPi, whereas on the Raspberry Pi the library is not updated even with the official distribution. At this point I will write on their forum.

christiancelona commented 4 months ago

First, is your opusenc even dynamically linked. Second, did you check where it's looking for libopus? If it's linking with /usr/lib/libopus.so.x and you're installing to /usr/local/libopus.so.x then it's not going to help. The dynamic linker isn't magic.

ls /usr/local/lib/
libopus.a libopus.so libopus.so.0.10.0 python3.11
libopus.la libopus.so.0 pkgconfig

It is not present in /usr/lib/.

ldd /usr/bin/opusenc
linux-vdso.so.1 (0x0000003fa3b11000)
libopusenc.so.0 => /lib/riscv64-linux-gnu/libopusenc.so.0 (0x0000003fa3aee000)
libopus.so.0 => /usr/local/lib/libopus.so.0 (0x0000003fa3aa1000)
libFLAC.so.12 => /lib/riscv64-linux-gnu/libFLAC.so.12 (0x0000003fa3a57000)
libm.so.6 => /lib/riscv64-linux-gnu/libm.so.6 (0x0000003fa39ea000)
libc.so.6 => /lib/riscv64-linux-gnu/libc.so.6 (0x0000003fa38b8000)
/lib/ld-linux-riscv64-lp64d.so.1 (0x0000003fa3b13000)
libogg.so.0 => /lib/riscv64-linux-gnu/libogg.so.0 (0x0000003fa38af000)

ericoporto commented 4 months ago

Also, make sure you're using the latest main (and not 1.5.1) since there's been many build fixes there already

Any chance of a new tagged release that would contain such improvements?

christiancelona commented 3 months ago

If you need to build libopus, you can do this (opus-tools 0.2 then reports "using libopus unknown"):

wget https://github.com/xiph/opus/archive/refs/heads/main.zip
unzip main.zip
cd opus-main/
./autogen.sh
./configure
sudo make install

or, like before (from: https://opus-codec.org/):

wget https://downloads.xiph.org/releases/opus/opus-1.5.1.tar.gz
tar -xf opus-1.5.1.tar.gz
cd opus-1.5.1
./configure
sudo make install

Both work on Debian GNU/Linux 12 (bookworm) x86_64.