tensorflow / tflite-micro

Infrastructure to enable deployment of ML models to low-power resource-constrained embedded targets (including microcontrollers and digital signal processors).
Apache License 2.0

src/third_party/cmsis_nn/Include/arm_nnsupportfunctions.h:847: undefined reference to `__sxtb16' #2331

Closed (Gostas closed this issue 11 months ago)

Gostas commented 11 months ago

Hi,

I am building the tflite-micro-arduino-examples for the Nano 33 BLE. I have updated the source files to the latest version from this repo and also updated the examples (my fork, with updated scripts and examples, is at https://github.com/Gostas/tflite-micro-arduino-examples).

I got the following error for 'hello_world', which I fixed by substituting SXTB16 for __sxtb16:

/tmp/build/third_party/cmsis_nn/Source/NNSupportFunctions/objs.a(arm_nn_vec_mat_mult_t_s4.c.o): In function `read_and_pad_s4':
/home/tflite/src/third_party/cmsis_nn/Include/arm_nnsupportfunctions.h:847: undefined reference to `__sxtb16'
/home/tflite/src/third_party/cmsis_nn/Include/arm_nnsupportfunctions.h:848: undefined reference to `__sxtb16'

/tmp/build/third_party/cmsis_nn/Source/NNSupportFunctions/objs.a(arm_nn_vec_mat_mult_t_s4.c.o):
/home/tflite/src/third_party/cmsis_nn/Include/arm_nnsupportfunctions.h:848: more undefined references to `__sxtb16' follow
collect2: error: ld returned 1 exit status

With that change, this example compiles, but the next one, 'micro_speech', fails with

/tmp/build/third_party/cmsis_nn/Source/ConvolutionFunctions/objs.a(arm_depthwise_conv_3x3_s8.c.o): In function `arm_depthwise_conv_3x3_s8':
/home/tflite/src/third_party/cmsis_nn/Source/ConvolutionFunctions/arm_depthwise_conv_3x3_s8.c:137: undefined reference to `__smlabb'
/home/tflite/src/third_party/cmsis_nn/Source/ConvolutionFunctions/arm_depthwise_conv_3x3_s8.c:139: undefined reference to `__smlatt'
(more occurrences)
collect2: error: ld returned 1 exit status

Am I missing any files? Or is the Mbed CMSIS library missing some files?

I am using the arduino:mbed_nano 4.0.10 core and compiling with -mcpu=cortex-m4 -mfloat-abi=softfp -mfpu=fpv4-sp-d16 -mthumb
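Since the choice of intrinsic names depends on predefined compiler macros, one way to narrow this down is to check which macros the toolchain actually sets for these flags. The following is only a minimal sketch: the helper names (has_dsp_feature, is_armclang_or_iar, intrinsic_path, report_intrinsic_path) are hypothetical and not part of CMSIS-NN, and the two conditions are copied from the arm_nn_compiler.h excerpt quoted later in this thread.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: 1 if the compiler advertises the ACLE DSP feature macro. */
static inline int has_dsp_feature(void)
{
#if defined(__ARM_FEATURE_DSP) && (__ARM_FEATURE_DSP == 1)
    return 1;
#else
    return 0;
#endif
}

/* Hypothetical helper: 1 if the build is Arm Compiler 6 (>= 6.10.50) or IAR. */
static inline int is_armclang_or_iar(void)
{
#if (defined(__ARMCC_VERSION) && (__ARMCC_VERSION >= 6010050)) || defined(__ICCARM__)
    return 1;
#else
    return 0;
#endif
}

/* Describe which branch of the quoted header excerpt this build would take. */
static inline const char *intrinsic_path(void)
{
    if (!has_dsp_feature()) {
        return "__ARM_FEATURE_DSP not set";
    }
    if (is_armclang_or_iar()) {
        return "__ARM_FEATURE_DSP set, Arm Compiler 6 / IAR branch (SXTB16 -> __sxtb16)";
    }
    return "__ARM_FEATURE_DSP set, other compiler (branch elided in the excerpt)";
}

/* Print the result; call this from a test sketch or from main(). */
static inline void report_intrinsic_path(void)
{
    const char *path = intrinsic_path();
    assert(path != NULL);
    printf("intrinsic path: %s\n", path);
}
```

Building this with the same -mcpu/-mfpu flags as the failing sketch and calling report_intrinsic_path() should show whether the DSP branch is active for the Arduino toolchain. It does not tell you whether the compiler actually provides __smlabb or __sxtb16; that still has to be confirmed at link time.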

Thanks

mansnils commented 11 months ago

It looks like https://github.com/tensorflow/tflite-micro-arduino-examples (at least src/third_party/cmsis_nn/) has not been updated in a long while. Are you able to run your example if you don't update it? I'm not sure how to reproduce this, but perhaps it is related to https://github.com/ARM-software/CMSIS-NN/pull/39

Gostas commented 11 months ago

I'm using the latest version of CMSIS_NN, so src/third_party/cmsis_nn/Include/Internal/arm_nn_compiler.h contains the necessary macros and definitions:

// ACLE intrinsics under groups __ARM_FEATURE_QBIT, __ARM_FEATURE_DSP , __ARM_FEATURE_SAT, __ARM_FEATURE_SIMD32

// Note: Just __ARM_FEATURE_DSP is checked to collect all intrinsics from the above mentioned groups

#if (defined(__ARM_FEATURE_DSP) && (__ARM_FEATURE_DSP == 1))

    // Common intrinsics
    #define SMLABB __smlabb
    #define SMLATT __smlatt
    #define QADD __qadd
    #define QSUB8 __qsub8
    #define QSUB16 __qsub16
    #define SADD16 __sadd16

    // Compiler specifc variants of intrinsics. Create a new section or file for IAR if needed
    #if defined(__ARMCC_VERSION) && (__ARMCC_VERSION >= 6010050) || defined(__ICCARM__)

        #define SMULBB __smulbb
        #define SMULTT __smultt
        #define ROR __ror
        #define SXTB16 __sxtb16
        #define SXTAB16 __sxtab16
        #define SXTB16_RORn(ARG1, ARG2) SXTB16(ROR(ARG1, ARG2))
        #define SXTAB16_RORn(ARG1, ARG2, ARG3) SXTAB16(ARG1, ROR(ARG2, ARG3))
        #define SMLAD __smlad

...

__STATIC_FORCEINLINE uint32_t SXTB16(uint32_t op1)
{
    uint32_t result;

    __ASM("sxtb16 %0, %1" : "=r"(result) : "r"(op1));
    return (result);
}

and src/third_party/cmsis_nn/Include/arm_nnsupportfunctions.h includes them

#include "third_party/cmsis_nn/Include/Internal/arm_nn_compiler.h"
#include "third_party/cmsis_nn/Include/arm_nn_math_types.h"
#include "third_party/cmsis_nn/Include/arm_nn_types.h"
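For readers unfamiliar with the instruction behind the SXTB16 macro, here is a plain-C reference of what it computes: bytes 0 and 2 of the input are sign-extended into the low and high 16-bit halves of the result. This is only an illustrative sketch for checking results on a host machine; the names sxtb16_ref, ror32_ref and sxtb16_ror_ref are hypothetical and are not CMSIS-NN functions.

```c
#include <assert.h>
#include <stdint.h>

/* Sign-extend the low 8 bits of b to a 32-bit signed value (no
   implementation-defined narrowing casts). */
static inline int32_t sext8_ref(uint32_t b)
{
    return (int32_t)((b & 0xFFu) ^ 0x80u) - 0x80;
}

/* Reference for SXTB16: byte 0 -> bits [15:0], byte 2 -> bits [31:16],
   each sign-extended to 16 bits. */
static inline uint32_t sxtb16_ref(uint32_t x)
{
    uint32_t lo = (uint32_t)sext8_ref(x) & 0xFFFFu;
    uint32_t hi = (uint32_t)sext8_ref(x >> 16) & 0xFFFFu;
    return (hi << 16) | lo;
}

/* Rotate right by n bits (n is taken modulo 32). */
static inline uint32_t ror32_ref(uint32_t x, unsigned n)
{
    n &= 31u;
    return (n == 0u) ? x : ((x >> n) | (x << (32u - n)));
}

/* Mirrors the SXTB16_RORn(ARG1, ARG2) macro quoted above:
   rotate first, then sign-extend bytes 0 and 2 of the rotated word. */
static inline uint32_t sxtb16_ror_ref(uint32_t x, unsigned n)
{
    return sxtb16_ref(ror32_ref(x, n));
}

/* Small usage example: with a rotation of 8, bytes 1 and 3 are selected. */
static inline void sxtb16_ref_selfcheck(void)
{
    assert(sxtb16_ref(0x12F03480u) == 0xFFF0FF80u);
    assert(sxtb16_ror_ref(0x12F03480u, 8) == 0x00120034u);
}
```

On the target, the inline-assembly or intrinsic version is what you want for speed; a reference like this is mainly useful for unit tests or for confirming that a replacement macro behaves the same way.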

I will try using the base arduino_examples files. I saw that the arduino version of arm_depthwise_conv_3x3_s8.c does not have any of the DSP-enabled code (strange choice as the M4 supports DSP instructions like SMLABB and SMLATT).

If you want to reproduce, you can create an arduino library by running scripts/sync_from_tflite_micro.sh from the repo I've linked above. I've made several changes to the base scripts, since they are old. Most importantly I updated the MANIFEST.ini file, updated source code for the hello_world example, removed the "magic_wand" argument to the python scripts and moved the signal/ dir under src/ in the final library.

ddavis-2015 commented 11 months ago

@Gostas @mansnils

Just adding my two cents here, based on my hazy recollection of March 2023:

  1. The downloaded version of CMSIS and CMSIS_NN changed after Feb. 27 2023
  2. After Feb. 27 2023, the intrinsics defined in the CMSIS header files no longer included all of the intrinsics used by the arduino-examples code base.
  3. The Arduino supplied compiler for the NRF52840 (Nano 33) does not supply these missing intrinsics.
  4. The arduino-examples repo was frozen at this point.

Note: The arduino-examples repo. was frozen in a working state, such that the Nano 33 BLE compiles would succeed.
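To illustrate what the missing intrinsics from the linker errors do, here is a plain-C sketch of the arithmetic behind SMLABB and SMLATT: a signed 16x16-bit multiply of the bottom (or top) halfwords, added to a 32-bit accumulator. The names smlabb_ref and smlatt_ref are hypothetical; this is not how CMSIS or the Arduino core defines them, and it does not model the Q flag that the hardware sets when the accumulation overflows.

```c
#include <assert.h>
#include <stdint.h>

/* Sign-extend the low 16 bits of h to a 32-bit signed value. */
static inline int32_t sext16_ref(uint32_t h)
{
    return (int32_t)((h & 0xFFFFu) ^ 0x8000u) - 0x8000;
}

/* Add with 32-bit wraparound. The final cast assumes the usual
   two's-complement conversion performed by mainstream compilers. */
static inline int32_t wrap_add_ref(int32_t a, int32_t b)
{
    return (int32_t)((uint32_t)a + (uint32_t)b);
}

/* Reference for SMLABB: acc + (bottom half of x) * (bottom half of y). */
static inline int32_t smlabb_ref(int32_t x, int32_t y, int32_t acc)
{
    int32_t prod = sext16_ref((uint32_t)x) * sext16_ref((uint32_t)y);
    return wrap_add_ref(acc, prod);
}

/* Reference for SMLATT: acc + (top half of x) * (top half of y). */
static inline int32_t smlatt_ref(int32_t x, int32_t y, int32_t acc)
{
    int32_t prod = sext16_ref((uint32_t)x >> 16) * sext16_ref((uint32_t)y >> 16);
    return wrap_add_ref(acc, prod);
}

/* Small usage example: x packs (3, 2) and y packs (5, 4) as (top, bottom). */
static inline void smla_ref_selfcheck(void)
{
    assert(smlabb_ref(0x00030002, 0x00050004, 10) == 18); /* 10 + 2*4 */
    assert(smlatt_ref(0x00030002, 0x00050004, 10) == 25); /* 10 + 3*5 */
}
```

A fallback like this would let the code link with a compiler that lacks the intrinsics, at the cost of the single-instruction speed-up on the Cortex-M4. Whether that trade-off is acceptable depends on the model being run.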

Gostas commented 11 months ago

Thanks for your response @ddavis-2015 .

I tried compiling with a newer compiler, gcc-arm-none-eabi-10.3-2021.10, but got the same results.

It looks like it's the same issue as https://github.com/tensorflow/tensorflow/issues/48741.

For now I'll just use the arduino-examples library as-is.

Appreciate your help guys.

Gostas commented 10 months ago

For reference, I was able to fix the error above by including in the TFLM library the version of CMSIS that ships with Mbed OS for the Arduino Mbed Nano boards (.arduino15/packages/arduino/hardware/mbed_nano/4.0.10/cores/arduino/mbed/cmsis/CMSIS_5) AND using gcc-arm-none-eabi-10.3-2021.10.

All the examples compile and work correctly, with minor modifications that reflect the updates in the library.

I've also included a modified version of the makefile and supporting files to generate a static Arduino library.

You can view my fork here: https://github.com/Gostas/tflite-micro-arduino-examples .

I guess there wouldn't be any interest in updating the Arduino library, since a newer version of the toolchain is needed?