android / ndk

The Android Native Development Kit

[BUG] neon intrinsics fail to compile #1657

Closed: zchrissirhcz closed this issue 2 years ago

zchrissirhcz commented 2 years ago

Description

The following code fails to compile when the template argument nc is 4:

#include <stdint.h>
#include <arm_neon.h>

template<int32_t nc>
void test()
{
    float32x4_t tcurr = {1.f, 2.f, 3.f, 4.f};
    float32x4_t tnext = {5.f, 6.f, 7.f, 8.f};
    float32x4_t t_right;
    if (nc == 4) {
        t_right = tnext;
    } else {
        t_right = vextq_f32(tcurr, tnext, nc);
    }
}

int main()
{
    test<1>(); // ok
    test<3>(); // ok
    test<4>(); // compile error: argument value 4 is outside the valid range [0, 3]

    return 0;
}

Compile script

#!/bin/bash

ANDROID_NDK=~/soft/android-ndk-r24-beta2
TOOLCHAIN=$ANDROID_NDK/build/cmake/android.toolchain.cmake

BUILD_DIR=android-arm64
mkdir -p $BUILD_DIR
cd $BUILD_DIR

cmake -G Ninja \
    -DCMAKE_TOOLCHAIN_FILE=$TOOLCHAIN \
    -DANDROID_LD=lld \
    -DANDROID_ABI="arm64-v8a" \
    -DANDROID_PLATFORM=android-24 \
    -DCMAKE_BUILD_TYPE=Release \
    ../..

#ninja
#cmake --build . --verbose
cmake --build .

cd ..

Error message

(base) zz@arcsoft-43% ./android-arm64-build.sh 
-- Android: Targeting API '24' with architecture 'arm64', ABI 'arm64-v8a', and processor 'aarch64'
-- Android: Selected unified Clang toolchain
-- The C compiler identification is Clang 14.0.0
-- The CXX compiler identification is Clang 14.0.0
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /home/zz/soft/android-ndk-r24-beta2/toolchains/llvm/prebuilt/linux-x86_64/bin/clang - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /home/zz/soft/android-ndk-r24-beta2/toolchains/llvm/prebuilt/linux-x86_64/bin/clang++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Configuring done
-- Generating done
CMake Warning:
  Manually-specified variables were not used by the project:

    ANDROID_LD

-- Build files have been written to: /home/zz/work/test/ndk_bug/build/android-arm64
[1/2] Building CXX object CMakeFiles/testbed.dir/main.cpp.o
FAILED: CMakeFiles/testbed.dir/main.cpp.o 
/home/zz/soft/android-ndk-r24-beta2/toolchains/llvm/prebuilt/linux-x86_64/bin/clang++ --target=aarch64-none-linux-android24 --sysroot=/home/zz/soft/android-ndk-r24-beta2/toolchains/llvm/prebuilt/linux-x86_64/sysroot   -DANDROID -fdata-sections -ffunction-sections -funwind-tables -fstack-protector-strong -no-canonical-prefixes -D_FORTIFY_SOURCE=2 -Wformat -Werror=format-security -fexceptions -frtti -stdlib=libc++ -O3 -DNDEBUG -fPIE -std=gnu++11 -MD -MT CMakeFiles/testbed.dir/main.cpp.o -MF CMakeFiles/testbed.dir/main.cpp.o.d -o CMakeFiles/testbed.dir/main.cpp.o -c /home/zz/work/test/ndk_bug/main.cpp
/home/zz/work/test/ndk_bug/main.cpp:13:19: error: argument value 4 is outside the valid range [0, 3]
        t_right = vextq_f32(tcurr, tnext, nc);
                  ^                       ~~
/home/zz/soft/android-ndk-r24-beta2/toolchains/llvm/prebuilt/linux-x86_64/lib64/clang/14.0.0/include/arm_neon.h:7103:25: note: expanded from macro 'vextq_f32'
  __ret = (float32x4_t) __builtin_neon_vextq_v((int8x16_t)__s0, (int8x16_t)__s1, __p2, 41); \
                        ^                                                        ~~~~
/home/zz/work/test/ndk_bug/main.cpp:21:5: note: in instantiation of function template specialization 'test<4>' requested here
    test<4>(); // compile error: argument value 4 is outside the valid range [0, 3]
    ^
1 error generated.
ninja: build stopped: subcommand failed.

Affected versions

r23, r24

Canary version

No response

Host OS

Linux

Host OS version

Ubuntu 20.04

Affected ABIs

arm64-v8a

Build system

CMake

Other build system

No response

minSdkVersion

24

Device API level

Not related to device API level.

zchrissirhcz commented 2 years ago

Tried the following Android NDK versions, all failed:

Over17 commented 2 years ago

Instantiating your template with <4> causes this code to be emitted:

t_right = vextq_f32(tcurr, tnext, 4);

which is illegal (the last f32 lane index in a 128-bit vector is 3) and produces the error above. The index range check happens before dead code elimination kicks in (later on, in the opt stage), so the untaken else branch is still checked.

Based on the above, this looks like intended behaviour to me.
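
One possible way to sidestep the range check, sketched under assumptions: move the nc == 4 decision into a template specialization, so the out-of-range call is never instantiated at all. The sketch below is portable C++11 rather than real NEON code. Vec4, ext4, ShiftRight and shift_right are hypothetical names introduced for illustration, and ext4<N> only imitates vextq_f32, with a static_assert standing in for Clang's immediate-range check.

```cpp
#include <array>
#include <stdint.h>

// Hypothetical stand-in for float32x4_t so the sketch builds without arm_neon.h.
using Vec4 = std::array<float, 4>;

// Hypothetical stand-in for vextq_f32: returns lanes N..N+3 of the
// concatenation a:b. The static_assert imitates the compile-time range
// check that rejected index 4 in the issue.
template <int N>
Vec4 ext4(const Vec4& a, const Vec4& b) {
    static_assert(N >= 0 && N <= 3, "index outside the valid range [0, 3]");
    Vec4 r{};
    for (int i = 0; i < 4; ++i) {
        const int j = i + N;
        r[i] = (j < 4) ? a[j] : b[j - 4];
    }
    return r;
}

// Primary template: used for nc in [0, 3].
template <int32_t nc>
struct ShiftRight {
    static Vec4 apply(const Vec4& curr, const Vec4& next) {
        return ext4<nc>(curr, next);
    }
};

// Full specialization: for nc == 4 the body never mentions ext4,
// so ext4<4> is never instantiated and never range-checked.
template <>
struct ShiftRight<4> {
    static Vec4 apply(const Vec4& /*curr*/, const Vec4& next) {
        return next;
    }
};

// Convenience wrapper that dispatches on the template argument.
template <int32_t nc>
Vec4 shift_right(const Vec4& curr, const Vec4& next) {
    return ShiftRight<nc>::apply(curr, next);
}
```

With real NEON types the same shape should carry over, with ext4<nc>(curr, next) replaced by vextq_f32(curr, next, nc) in the primary template. On C++17 or later, if constexpr (nc == 4) inside the original template is another option, since the discarded branch of a template is not instantiated; the build in the log above uses -std=gnu++11, so that route would need a newer standard flag.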

DanAlbert commented 2 years ago

Makes sense to me. Thanks for the diagnosis :)