Closed mniestroj closed 2 months ago
FYI @keith-packard @cfriedt @aescolar
It might be possible to fix the issue in native_sim by setting the default cpu to a "dummy" cpu at build time to avoid the null pointer dereference.
Would that solve the issue for you @mniestroj ?
If you mean:
modified kernel/init.c
@@ -44,7 +44,9 @@ BUILD_ASSERT(CONFIG_MP_NUM_CPUS == CONFIG_MP_MAX_NUM_CPUS,
/* the only struct z_kernel instance */
__pinned_bss
-struct z_kernel _kernel;
+struct z_kernel _kernel = {
+ .cpus[0].current = &_thread_dummy,
+};
#ifdef CONFIG_PM
__pinned_bss atomic_t _cpus_active;
then yes, it helps to fix/workaround the crash.
@mniestroj I pressume that if you pass -DEXTRA_CPPFLAGS=-fno-stack-protector
in your Arch build the issue goes away, correct? (I don't have Arch here to test)
@mniestroj I pressume that if you pass
-DEXTRA_CPPFLAGS=-fno-stack-protector
in your Arch build the issue goes away, correct? (I don't have Arch here to test)
@aescolar Yes, that is right.
@aescolar Yes, that is right.
Thanks. The issue would seem to be that we are getting the picolibC/embedded stack protector code being used early in the boot process. Which I very much thing we should not. A quick "fix" may be to just say that we do not support enabling the stack-protector feature when building with picolibc for the posix arch, which I doubt many will miss. That could mean an if in the posix arch cmake file to set -fno-stack-protector if the compiler supports it. Anyhow, I'm looking into this. Thanks for the very detailed report :)
Describe the bug Combining
CONFIG_POSIX_C_LIB_EXT=y
and-fstack-protector
results innative_sim
platform to crash during boot. For distributions that have -fstack-protectoras default (e.g. Arch Linux) it results in many integration tests and sample apps in Zephyr to fail (e.g.
west twister` does not pass).To Reproduce Steps to reproduce the behavior:
west build -p -b native_sim zephyr/tests/posix/getentropy/ -T portability.posix.getentropy.picolibc -- -DEXTRA_CPPFLAGS=-fstack-protector
-fstack-protector
is disabled by default on Ubuntu / https://hub.docker.com/r/zephyrprojectrtos/ci CI image-fstack-protector
is enabled by default on Arch Linux, so passing-DEXTRA_CPPFLAGS=-fstack-protector
is not needed to reproduce this issuewest build -t run
Segmentation fault (core dumped)
error occurs due to NULL pointer dereference__stack_chk_init
is called automatically by host libc, before even Zephyr init kicks in (since it is marked as__attribute__((__constructor__))
getentropy()
is called, which is provided by Zephyr in caseCONFIG_POSIX_C_LIB_EXT=y
device_is_ready(entropy) == false
, soerrno = EIO
assignment is attemptederrno = ...
is actually_kernel.cpus[0].current.errno_var = ...
in that case; this results with segmentation fault since_kernel.cpus[0].current == 0x0
at this stageExpected behavior Running
west build -p -b native_sim zephyr/tests/posix/getentropy/ -T portability.posix.getentropy.picolibc -t run
should pass, even on systems where-fstack-protector
is the default (like on Arch Linux).Impact Applications that use
CONFIG_POSIX_C_LIB_EXT=y
(e.g. using thread-safegetopt()
shell extension, crc shell command,fnmatch()
function) fail to work withnative_sim
. Also "standard" Zephyr integration tests fail on Linux distributions with-fstack-protector
being in the default host compiler flags.Logs and console output
GDB session logs
``` Breakpoint 1, __stack_chk_init () at /modules/lib/picolibc/newlib/libc/ssp/stack_protector.c:23 23 if (__stack_chk_guard != 0) (gdb) bt #0 __stack_chk_init () at /modules/lib/picolibc/newlib/libc/ssp/stack_protector.c:23 #1 0xf7c52bc7 in call_init (argc=1, argv=0xffffa044, env=`gcc -v` (Zephyr CI Docker v0.26.13)
``` Using built-in specs. COLLECT_GCC=gcc COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/11/lto-wrapper OFFLOAD_TARGET_NAMES=nvptx-none:amdgcn-amdhsa OFFLOAD_TARGET_DEFAULT=1 Target: x86_64-linux-gnu Configured with: ../src/configure -v --with-pkgversion='Ubuntu 11.4.0-1ubuntu1~22.04' --with-bugurl=file:///usr/share/doc/gcc-11/README.Bugs --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --prefix=/usr --with-gcc-major-version-only --program-suffix=-11 --program-prefix=x86_64-linux-gnu- --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --libdir=/usr/lib --enable-nls --enable-bootstrap --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new --enable-gnu-unique-object --disable-vtable-verify --enable-plugin --enable-default-pie --with-system-zlib --enable-libphobos-checking=release --with-target-system-zlib=auto --enable-objc-gc=auto --enable-multiarch --disable-werror --enable-cet --with-arch-32=i686 --with-abi=m64 --with-multilib-list=m32,m64,mx32 --enable-multilib --with-tune=generic --enable-offload-targets=nvptx-none=/build/gcc-11-XeT9lY/gcc-11-11.4.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-11-XeT9lY/gcc-11-11.4.0/debian/tmp-gcn/usr --without-cuda-driver --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu --with-build-config=bootstrap-lto-lean --enable-link-serialization=2 Thread model: posix Supported LTO compression algorithms: zlib zstd gcc version 11.4.0 (Ubuntu 11.4.0-1ubuntu1~22.04) ````gcc -v` (Arch Linux)
``` Using built-in specs. COLLECT_GCC=gcc COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-pc-linux-gnu/14.1.1/lto-wrapper Target: x86_64-pc-linux-gnu Configured with: /build/gcc/src/gcc/configure --enable-languages=ada,c,c++,d,fortran,go,lto,m2,objc,obj-c++,rust --enable-bootstrap --prefix=/usr --libdir=/usr/lib --libexecdir=/usr/lib --mandir=/usr/share/man --infodir=/usr/share/info --with-bugurl=https://gitlab.archlinux.org/archlinux/packaging/packages/gcc/-/issues --with-build-config=bootstrap-lto --with-linker-hash-style=gnu --with-system-zlib --enable-__cxa_atexit --enable-cet=auto --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-default-ssp --enable-gnu-indirect-function --enable-gnu-unique-object --enable-libstdcxx-backtrace --enable-link-serialization=1 --enable-linker-build-id --enable-lto --enable-multilib --enable-plugin --enable-shared --enable-threads=posix --disable-libssp --disable-libstdcxx-pch --disable-werror Thread model: posix Supported LTO compression algorithms: zlib zstd gcc version 14.1.1 20240522 (GCC) ```Environment (please complete the following information):
-fstack-protector
flag3.7.0-rc1
Additional context Noticed that issue with downstream applications recently, after bumping to
3.7.0-rc1
.git bisect
showed 2bc722a97ed6 ("posix: deprecate POSIX_FNMATCH GETOPT GETENTROPY") as the first failing commit. The reason was thatCONFIG_POSIX_C_LIB_EXT
is now enabled for more applications thanCONFIG_GETENTROPY
was. As a result downstream app withgetopt()
use is now failing to boot withnative_sim
, due togetentropy()
being implemented by Zephyr POSIX subsystem (which wasn't the case before, sinceCONFIG_GETENTROPY=n
before Zephyr version bump).