void-linux / void-packages

The Void source packages collection
https://voidlinux.org

Direwolf on Rpi Zero W fails to run due to missing NEON support. #48299

Closed by CtrlC-Root 10 months ago

CtrlC-Root commented 10 months ago

Is this a new report?

Yes

System Info

Void 6.1.69_1 armv6l Unknown uptodate rFF

Package(s) Affected

direwolf-1.7_1

Does a report exist for this bug with the project's home (upstream) and/or another distro?

https://github.com/RPi-Distro/repo/issues/278 https://bugs.launchpad.net/raspbian/+bug/1980899 https://groups.io/g/direwolf/topic/87406630#5856

Expected behaviour

You should be able to run the binaries in the direwolf package on the ARM boards that the Void Handbook lists as supported, including the Raspberry Pi Zero W.

Actual behaviour

The direwolf binary crashes with an "Illegal instruction" error, as shown below:

[root@void-live ~]# direwolf
Dire Wolf version 1.7
Includes optional support for:  hamlib cm108-ptt

Dire Wolf requires only privileges available to ordinary users.
Running this as root is an unnecessary security risk.

Illegal instruction
[root@void-live ~]#

I've done some digging, and this is likely because the binary was compiled for an ARM system that has the NEON instruction set, which the Raspberry Pi Zero W does not support. The readelf output shows the NEON build attribute, while the lscpu flags do not list neon:

[root@void-live ~]# readelf -A /usr/bin/direwolf | grep NEON
  Tag_Advanced_SIMD_arch: NEONv1
[root@void-live ~]# lscpu
Architecture:           armv6l
  Byte Order:           Little Endian
CPU(s):                 1
  On-line CPU(s) list:  0
Vendor ID:              ARM
  Model name:           ARM1176
    Model:              7
    Thread(s) per core: 1
    Core(s) per socket: 1
    Socket(s):          1
    Stepping:           r0p7
    CPU(s) scaling MHz: 100%
    CPU max MHz:        1000.0000
    CPU min MHz:        700.0000
    BogoMIPS:           997.08
    Flags:              half thumb fastmult vfp edsp java tls

Of course, the fact that the binary is tagged as using NEON does not necessarily mean it actually executes NEON instructions, but given the linked bug reports above I suspect this is the issue. I am currently building this package from source on the board to see whether that solves it, but that is going to take a while.
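
The comparison made above (the NEON attribute in the readelf output versus the CPU flags reported by lscpu) can be scripted. This is only a minimal sketch and not part of direwolf or Void: the function names elf_tagged_neon and cpu_has_neon are made up for illustration, and it assumes a 32-bit ARM Linux system where lscpu reports NEON support as the flag "neon".

```sh
# Hypothetical helpers; both read command output from stdin.

# Succeeds if `readelf -A` output carries a NEON build attribute.
elf_tagged_neon() {
    grep -Eq 'Tag_Advanced_SIMD_arch:[[:space:]]*NEON'
}

# Succeeds if the "Flags:" line of `lscpu` output lists "neon"
# (assumption: 32-bit ARM, where the flag is spelled this way).
cpu_has_neon() {
    grep -E '^[[:space:]]*Flags:' |
        grep -Eq '(^|[[:space:]])neon([[:space:]]|$)'
}

# Usage on the target board:
#   if readelf -A /usr/bin/direwolf | elf_tagged_neon &&
#      ! lscpu | cpu_has_neon; then
#       echo "binary is tagged for NEON but the CPU does not report it"
#   fi
```

As noted, a match only shows that the binary was built with NEON enabled; it does not prove that a NEON instruction is what triggers the crash.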

Steps to reproduce

  1. Follow the Void Handbook instructions to download an ARMv6l image (I used void-rpi-armv6l-20230628.img) and prepare a microSD card using it.
  2. Boot the image on an ARMv6l board without support for NEON instructions (I used a Raspberry Pi Zero W).
  3. Install the direwolf package: xbps-install direwolf
  4. Run the direwolf binary in a way where it attempts to actually process data: direwolf -c /usr/share/doc/direwolf/conf/direwolf.conf
  5. Observe the crash documented above in the Actual behaviour section.

Tagging @classabbyamp as the package maintainer.

classabbyamp commented 10 months ago

Looks like the build-time detection is broken; it should be fixed in https://github.com/void-linux/void-packages/commit/f10e78710bea56c60b89a4571372ce95e3b4d07e

CtrlC-Root commented 10 months ago

Thank you for getting to this so quickly! Unfortunately, I don't believe this fixes the issue. I installed the updated package and I still see the same error as before, and the NEONv1 flag is still present in the readelf output for the binary. I looked at your commit and the build scripts for direwolf, and I think I understand why. If you look at the section of FindCPUflags.cmake that you are editing with vsed, you'll see that although HAS_NEON is now forced to OFF, the script still configures the preprocessor and compiler to use NEON (it still adds -mfpu=neon and -DUSE_NEON): https://github.com/wb2osz/direwolf/blob/master/cmake/modules/FindCPUflags.cmake#L366-L370

I can confirm that this is the issue, though, because my source build on the Raspberry Pi Zero finally finished and that binary works correctly. The NEON flag no longer appears in the readelf output:

[alex@digirig void-packages]$ readelf -A /usr/bin/direwolf 
Attribute Section: aeabi
File Attributes
  Tag_CPU_name: "6"
  Tag_CPU_arch: v6
  Tag_ARM_ISA_use: Yes
  Tag_THUMB_ISA_use: Thumb-1
  Tag_FP_arch: VFPv2
  Tag_ABI_PCS_wchar_t: 4
  Tag_ABI_FP_rounding: Needed
  Tag_ABI_FP_denormal: Needed
  Tag_ABI_FP_exceptions: Needed
  Tag_ABI_FP_number_model: IEEE 754
  Tag_ABI_align_needed: 8-byte
  Tag_ABI_align_preserved: 8-byte, except leaf SP
  Tag_ABI_enum_size: int
  Tag_ABI_VFP_args: VFP registers
  Tag_CPU_unaligned_access: v6

I spent quite a bit of time trying to find a sed command to fix this but ultimately gave up. Is there a way to conditionally apply a patch to a package based on the architecture? If so, this should do the trick:

[... ]$ cat srcpkgs/direwolf/patches/disable_arm_neon.patch
disable ARM NEON instructions

--- a/cmake/modules/FindCPUflags.cmake
+++ b/cmake/modules/FindCPUflags.cmake
@@ -350,27 +350,7 @@ else ()
        set(HAS_AVX512 OFF CACHE BOOL "Architecture does not have AVX512 SIMD enabled")
     endif()
 elseif(ARCHITECTURE_ARM)
-    if(C_MSVC)
-        try_run(RUN_NEON COMPILE_NEON "${CMAKE_BINARY_DIR}/tmp" "${TEST_DIR}/test_arm_neon.cxx" COMPILE_DEFINITIONS /O0)
-    else()
-        if(${CMAKE_HOST_SYSTEM_PROCESSOR} STREQUAL ${CMAKE_SYSTEM_PROCESSOR})
-          try_run(RUN_NEON COMPILE_NEON "${CMAKE_BINARY_DIR}/tmp" "${TEST_DIR}/test_arm_neon.cxx" COMPILE_DEFINITIONS -mfpu=neon -O0)
-        else()  
-          try_compile(COMPILE_NEON "${CMAKE_BINARY_DIR}/tmp" "${TEST_DIR}/test_arm_neon.cxx" COMPILE_DEFINITIONS -mfpu=neon -O0)
-          set(RUN_NEON  0)
-        endif()
-    endif()
-    if(COMPILE_NEON AND RUN_NEON EQUAL 0)
-       set(HAS_NEON ON CACHE BOOL "Architecture has NEON SIMD enabled")
-       message(STATUS "Use NEON SIMD instructions")
-       if(C_GCC OR C_CLANG)
-           set( CMAKE_C_FLAGS "${CMAKE_C_FLAGS} -mfpu=neon" )
-           set( CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -mfpu=neon" )
-           add_definitions(-DUSE_NEON)
-       endif()
-    else()
-       set(HAS_NEON OFF CACHE BOOL "Architecture does not have NEON SIMD enabled")
-    endif()
+   set(HAS_NEON OFF CACHE BOOL "Architecture does not have NEON SIMD enabled")
 elseif(ARCHITECTURE_ARM64)
     # Advanced SIMD (aka NEON) is mandatory for AArch64
     set(HAS_NEON ON CACHE BOOL "Architecture has NEON SIMD enabled")
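
On the question of applying a patch only for some architectures, one possible shape is sketched below. It is an untested assumption, not something confirmed in this thread: it relies on xbps-src templates being shell scripts that can define a post_patch hook and read XBPS_TARGET_MACHINE and FILESDIR, and it assumes the patch would be kept in the package's files/ directory so that it is not applied automatically. The helper name is_armv6_target is hypothetical, and which targets to match is a judgment call.

```sh
# Hypothetical helper: succeeds for armv6l targets (armv6l, armv6l-musl).
is_armv6_target() {
    case "$1" in
        armv6l*) return 0 ;;
        *) return 1 ;;
    esac
}

# Sketch of how a template hook might use it (assumption, not verified):
#
# post_patch() {
#     if is_armv6_target "$XBPS_TARGET_MACHINE"; then
#         patch -Np1 -i "${FILESDIR}/disable_arm_neon.patch"
#     fi
# }
```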

After cross-compiling with the patch above using ./xbps-src -a armv6l build direwolf, I can see that NEON has been disabled:

[... ]$ readelf -A masterdir/builddir/direwolf-1.7/build/src/direwolf
Attribute Section: aeabi
File Attributes
  Tag_CPU_name: "6"
  Tag_CPU_arch: v6
  Tag_ARM_ISA_use: Yes
  Tag_THUMB_ISA_use: Thumb-1
  Tag_FP_arch: VFPv2
  Tag_ABI_PCS_wchar_t: 4
  Tag_ABI_FP_rounding: Needed
  Tag_ABI_FP_denormal: Needed
  Tag_ABI_FP_exceptions: Needed
  Tag_ABI_FP_number_model: IEEE 754
  Tag_ABI_align_needed: 8-byte
  Tag_ABI_align_preserved: 8-byte, except leaf SP
  Tag_ABI_enum_size: int
  Tag_ABI_VFP_args: VFP registers
  Tag_CPU_unaligned_access: v6
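
To check a single attribute of a freshly built binary without reading the whole listing, a small filter over the readelf -A output is enough. This is a minimal sketch: elf_attr is a hypothetical helper name, and it assumes the "Tag_name: value" layout shown in the output above.

```sh
# Hypothetical helper: print the value of one build attribute from
# `readelf -A` output on stdin; prints nothing if the tag is absent.
elf_attr() {
    sed -n "s/^[[:space:]]*$1:[[:space:]]*//p"
}

# Usage:
#   readelf -A masterdir/builddir/direwolf-1.7/build/src/direwolf | elf_attr Tag_FP_arch
#   readelf -A masterdir/builddir/direwolf-1.7/build/src/direwolf | elf_attr Tag_Advanced_SIMD_arch
```

For the listing above, the first command would print VFPv2 and the second would print nothing, since no Tag_Advanced_SIMD_arch line is present.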