tensorflow / tflite-micro

Infrastructure to enable deployment of ML models to low-power resource-constrained embedded targets (including microcontrollers and digital signal processors).
Apache License 2.0

TFLite Micro wrong Input/Output types on Xtensa hifi3 #1975

Closed oniigiirii closed 1 year ago

oniigiirii commented 1 year ago

Hello

I am trying to run TFLite Micro on a development board with an Xtensa HiFi3 DSP. I compiled TFLite Micro with the Xtensa-optimized kernels using my Xtensa toolchain and the following make command:

make -f tensorflow/lite/micro/tools/make/Makefile TARGET=xtensa TARGET_ARCH=hifi4 OPTIMIZED_KERNEL_DIR=xtensa XTENSA_TOOLS_VERSION=RG-2018.9-linux

For testing purposes I trained a very simple model that approximates sine values in the range [0, 2*pi]. Since the DSP I am using does not support floating-point operations, I quantized the model with the TFLite converter so that it uses only int8 types for all computations.

Running the TFLite analyzer on my model with tf.lite.experimental.Analyzer.analyze(q_sine_model) yields the following:

=== TFLite ModelAnalyzer ===

Your TFLite model has '1' subgraph(s). In the subgraph description below,
T# represents the Tensor numbers. For example, in Subgraph#0, the FULLY_CONNECTED op takes
tensor #0 and tensor #6 and tensor #5 as input and produces tensor #7 as output.

Subgraph#0 main(T#0) -> [T#9]
  Op#0 FULLY_CONNECTED(T#0, T#6, T#5[-5175, 0, 0, 0, 0, ...]) -> [T#7]
  Op#1 FULLY_CONNECTED(T#7, T#4, T#3[-638, 9177, -10399, 0, 6489, ...]) -> [T#8]
  Op#2 FULLY_CONNECTED(T#8, T#2, T#1[-4285]) -> [T#9]

Tensors of Subgraph#0
  T#0(serving_default_input_1:0) shape_signature:[-1, 1], type:INT8
  T#1(sequential/dense_2/BiasAdd/ReadVariableOp) shape:[1], type:INT32 RO 4 bytes, buffer: 2, data:[-4285]
  T#2(sequential/dense_2/MatMul) shape:[1, 16], type:INT8 RO 16 bytes, buffer: 3, data:[., ., ., ., g, ...]
  T#3(sequential/dense_1/BiasAdd/ReadVariableOp) shape:[16], type:INT32 RO 64 bytes, buffer: 4, data:[-638, 9177, -10399, 0, 6489, ...]
  T#4(sequential/dense_1/MatMul) shape:[16, 16], type:INT8 RO 256 bytes, buffer: 5, data:[., %, ., ., ., ...]
  T#5(sequential/dense/BiasAdd/ReadVariableOp) shape:[16], type:INT32 RO 64 bytes, buffer: 6, data:[-5175, 0, 0, 0, 0, ...]
  T#6(sequential/dense/MatMul) shape:[16, 1], type:INT8 RO 16 bytes, buffer: 7, data:[., ., ., ., ., ...]
  T#7(sequential/dense/MatMul;sequential/re_lu/Relu;sequential/dense/BiasAdd) shape_signature:[-1, 16], type:INT8
  T#8(sequential/dense_1/MatMul;sequential/re_lu_1/Relu;sequential/dense_1/BiasAdd) shape_signature:[-1, 16], type:INT8
  T#9(StatefulPartitionedCall:0) shape_signature:[-1, 1], type:INT8

---------------------------------------------------------------
Your TFLite model has '1' signature_def(s).

Signature#0 key: 'serving_default'
- Subgraph: Subgraph#0
- Inputs: 
    'input_1' : T#0
- Outputs: 
    'dense_2' : T#9

---------------------------------------------------------------
              Model size:       2648 bytes
    Non-data buffer size:       2124 bytes (80.21 %)
  Total data buffer size:        524 bytes (19.79 %)
    (Zero value buffers):          0 bytes (00.00 %)

indicating that the quantized model does indeed expect int8 inputs and outputs.

I then used the Unix tool xxd to convert the .tflite file into the attached C header file, model.h.
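The conversion step was along these lines (filenames illustrative; xxd -i derives the C symbol names from the input filename):

xxd -i q_sine_model.tflite > model.h

which emits an unsigned char array plus a matching unsigned int length symbol that the application includes.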

When I run the model on my microcontroller, however, I get results that do not match the quantized TFLite model. Furthermore, inspecting interpreter->input(0)->type shows that it is equal to kTfLiteFloat32 and not kTfLiteInt8 as would be expected.
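For reference, my test code looks roughly like the usual TFLM setup below (the model array name, op resolver size and tensor arena size are illustrative, and older TFLM versions also pass an ErrorReporter to the MicroInterpreter constructor):

#include <cmath>
#include <cstdint>

#include "tensorflow/lite/micro/micro_interpreter.h"
#include "tensorflow/lite/micro/micro_mutable_op_resolver.h"
#include "tensorflow/lite/schema/schema_generated.h"

#include "model.h"  // xxd-generated array; the name below is illustrative

namespace {
constexpr int kTensorArenaSize = 4 * 1024;
uint8_t tensor_arena[kTensorArenaSize];
}  // namespace

int main() {
  const tflite::Model* model = tflite::GetModel(q_sine_model_tflite);

  // The graph only contains FULLY_CONNECTED ops (the ReLUs are fused).
  tflite::MicroMutableOpResolver<1> resolver;
  resolver.AddFullyConnected();

  tflite::MicroInterpreter interpreter(model, resolver, tensor_arena,
                                       kTensorArenaSize);
  interpreter.AllocateTensors();

  TfLiteTensor* input = interpreter.input(0);
  // Expected kTfLiteInt8 for the fully quantized model,
  // but on the target this reports kTfLiteFloat32.
  if (input->type != kTfLiteInt8) {
    return 1;
  }

  // Quantize a test angle into int8 using the input tensor's scale/zero point.
  const float x = 1.57f;
  input->data.int8[0] = static_cast<int8_t>(
      std::round(x / input->params.scale) + input->params.zero_point);

  interpreter.Invoke();

  // Dequantize the int8 output back to float; should be roughly sin(x).
  TfLiteTensor* output = interpreter.output(0);
  const float y = (output->data.int8[0] - output->params.zero_point) *
                  output->params.scale;
  (void)y;
  return 0;
}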

Can someone explain why this is happening? Thanks in advance!

oniigiirii commented 1 year ago

It was a mistake on my part. I built the static library with the -DTF_LITE_STATIC_MEMORY flag but forgot to pass the same flag when building my project. As a result, the wrong TfLiteTensor struct was being used at runtime, which caused the reported types and outputs to be completely wrong.
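For anyone else hitting this: TF_LITE_STATIC_MEMORY selects a different (much smaller) TfLiteTensor definition in the TFLite headers, so the define must be set consistently for the prebuilt library and for every application file that includes those headers. A small guard along these lines in the application code turns the mismatch into a compile-time error instead of silently wrong tensor types:

// Fails the application build early if the define used for the library is missing.
#ifndef TF_LITE_STATIC_MEMORY
#error "Compile the application with -DTF_LITE_STATIC_MEMORY to match the tflite-micro library"
#endif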