ARM-software / armnn

Arm NN ML Software. The code here is a read-only mirror of https://review.mlplatform.org/admin/repos/ml/armnn
https://developer.arm.com/products/processors/machine-learning/arm-nn
MIT License

Failed to apply EXTERNAL delegate or Failed to apply XNNPACK delegate. #716

Closed jlamperez closed 1 year ago

jlamperez commented 1 year ago

Hi,

I have compiled a custom MobileNet model to TFLite, and when I try to benchmark the model I get the following errors.

 ./benchmark_model --graph=mobilenet.tflite --num_threads=6 --use_xnnpack=true
STARTING!
Log parameter values verbosely: [0]
Num threads: [6]
Graph: [mobilenet.tflite]
#threads used for CPU inference: [6]
Use xnnpack: [1]
Loaded model mobilenet.tflite
INFO: Created TensorFlow Lite XNNPACK delegate for CPU.
XNNPACK delegate created.
Failed to apply XNNPACK delegate.
Benchmarking failed.

or

LD_LIBRARY_PATH=/mnt/sd/aarch64_build ./benchmark_model \
--graph=mobilenet.tflite --external_delegate_path="/mnt/sd/aarch64_build/delegate/libarmnnDelegate.so" \
--external_delegate_options="backends:CpuAcc" \
--num_threads=6 
STARTING!
Log parameter values verbosely: [0]
Num threads: [6]
Graph: [mobilenet.tflite]
#threads used for CPU inference: [6]
External delegate path: [/mnt/sd/aarch64_build/delegate/libarmnnDelegate.so]
External delegate options: [backends:CpuAcc]
Loaded model mobilenet.tflite
INFO: TfLiteArmnnDelegate: Created TfLite ArmNN delegate.
EXTERNAL delegate created.
Failed to apply EXTERNAL delegate.
Benchmarking failed.

When using ./benchmark_model --graph=mobilenet.tflite --num_threads=6 --use_xnnpack=false everything works well and I can launch the benchmark.

I have been looking at the benchmark_tflite_model.cc code, and the problem seems to be in interpreter_->ModifyGraphWithDelegate.

How can I debug this problem, or find out more about why it is happening?
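(Editor's note: one cheap sanity check before digging into the delegate itself is to confirm that the file really is a TFLite flatbuffer. FlatBuffer file identifiers sit at byte offset 4, and TFLite uses "TFL3". A stdlib-only Python sketch; the path is whatever model file you are benchmarking:)

```python
def looks_like_tflite(path):
    """Return True if the file carries the TFLite flatbuffer identifier."""
    with open(path, "rb") as f:
        header = f.read(8)
    # FlatBuffer file identifiers occupy bytes 4..8; TFLite uses b"TFL3".
    return header[4:8] == b"TFL3"
```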

Thank you and regards.

matthewsloyanARM commented 1 year ago

Hi @jlamperez,

Thank you for reaching out. Your commands look fine. Failures in ModifyGraphWithDelegate are quite vague, but they can be related to the model, as seen here: https://github.com/tensorflow/tensorflow/issues/56166. Have you tried any other models to rule this out?

I don't think there is a way to enable more useful output from benchmark_model, but you should be able to run it in GDB. Also, if you would be able to share your model, I can try to reproduce this for you. Thank you!

Kind regards,

Matthew

jlamperez commented 1 year ago

Hi @matthewsloyanARM thank you for your answer,

I have seen that, for the first case (--use_xnnpack=true), TFLite is compiled by the setup-armnn.sh script like this:

  cmake -DTFLITE_ENABLE_XNNPACK=OFF \
        "$target_arch_cmd" \
        "$TFLITE_SRC"
  cmake --build . -j "$NUM_THREADS"

So XNNPACK is not available in TFLite.
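(Editor's note: one way to double-check how a TFLite build was configured is to read the generated CMakeCache.txt in the build directory; entries have the form NAME:TYPE=VALUE. A minimal stdlib-only sketch of such a check:)

```python
def cmake_cache_option(cache_text, name):
    """Extract the value of a NAME:TYPE=VALUE entry from CMakeCache.txt contents."""
    for line in cache_text.splitlines():
        line = line.strip()
        # Skip comments and anything that is not a cache entry.
        if line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # The key is "NAME:TYPE"; compare only the NAME part.
        if key.split(":")[0] == name:
            return value
    return None
```

For example, cmake_cache_option(open("CMakeCache.txt").read(), "TFLITE_ENABLE_XNNPACK") returning "OFF" would confirm the build above.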

For the second one, I have added some log statements and saw that my model has some dynamic_tensor_index_ entries.

Are the Where and SparseToDense operators supported in the TfLite delegate? I don't see them in Supported operations. Is that documentation up to date?

Thanks!

matthewsloyanARM commented 1 year ago

Hi @jlamperez,

You are correct, thanks for pointing that out. It looks like the XNNPACK delegate will be automatically applied when running the benchmark_model, unless --use_xnnpack=false is specified, which is why it works when added to your command. Hopefully this makes sense?

The following entry from benchmark_model --help shows this: --use_xnnpack=false: "explicitly apply the XNNPACK delegate. Note the XNNPACK delegate could be implicitly applied by the TF Lite runtime regardless of the value of this parameter. To disable this implicit application, set the value to false explicitly."

Regarding your second query, the issue you are having is likely due to the dynamic tensors, as our support for them is quite basic; it's hard to tell without seeing your model, though. Would you know which operators have dynamic tensors in your model?
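(Editor's note: in the TFLite schema, a dynamic dimension is recorded as -1 in a tensor's shape_signature, so suspect tensors can be listed quickly. A sketch, assuming a hypothetical mapping of tensor names to shape signatures that you would pull from your model inspection tool:)

```python
def find_dynamic_tensors(tensor_shapes):
    """Return names of tensors whose shape_signature contains a -1 (dynamic) dim."""
    return [name for name, shape in tensor_shapes.items()
            if any(dim == -1 for dim in shape)]
```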

Lastly, Where and SparseToDense are currently not supported in Arm NN. I have added these two operators to our backlog, but there is no estimate yet for when they will be added.

Kind regards,

Matthew

jlamperez commented 1 year ago

Hi @matthewsloyanARM,

I see from the benchmarking documentation that use_xnnpack: bool (default=false), so when launching ./benchmark_model --graph=model.tflite --num_threads=6 it makes sense to me that XNNPACK is not used.

But then, if the TfLiteArmnnDelegate EXTERNAL delegate fails, why is the XNNPACK delegate for CPU created, and why does the benchmark work, when I compiled TFLite with -DTFLITE_ENABLE_XNNPACK=OFF? Shouldn't that fail, since I don't have XNNPACK?

I have two models, and the dynamic tensor problems are located in these parts of the models:

Model 1

[image: model_one]

Model 2

[image: model_two]

For the first one, I don't know what the problem could be, because I see that Reshape and Shape are supported.

But for the second one, I see the problems in the Where and SparseToDense operations.

Thank you for being so supportive! Much appreciated.

matthewsloyanARM commented 1 year ago

Hi @jlamperez,

It seems like benchmark_model will try to fall back to XNNPACK regardless of whether it was enabled with TFLITE_ENABLE_XNNPACK. I think the best thing to do is to always disable it explicitly when running benchmark_model with --use_xnnpack=false; I noticed that the XNNPACK delegate would still be created for me unless I explicitly set this flag, even though the default is false. Can you add this to your Arm NN TfLite Delegate command to eliminate this as an issue?

Thank you for supplying the images; it looks like this should be supported, as nothing jumps out at me. As you mentioned, Where and SparseToDense aren't supported, but these two operators should fall back to the TfLite runtime.
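(Editor's note: conceptually, delegate fallback works by partitioning the graph — contiguous runs of delegate-supported operators are handed to the delegate, and the remaining operators stay on the TFLite runtime. A toy Python sketch of that partitioning idea, not the actual Arm NN or TFLite implementation:)

```python
def partition_ops(ops, supported):
    """Split an op sequence into (backend, [ops]) runs, alternating between
    the delegate and the TFLite runtime fallback."""
    runs = []
    for op in ops:
        backend = "delegate" if op in supported else "tflite"
        if runs and runs[-1][0] == backend:
            # Same backend as the previous op: extend the current run.
            runs[-1][1].append(op)
        else:
            # Backend changed: start a new partition.
            runs.append((backend, [op]))
    return runs
```

With Where and SparseToDense outside the supported set, a model would be split into delegate partitions separated by a TFLite-runtime partition that executes those two ops.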

Would it be possible for you to try running these models with our own model runner tool, ExecuteNetwork? It might give a better error message and will eliminate any issues with benchmark_model. It might be built by default when using the build-tool, in which case it can be found in build/(debug/release)/armnn/tests/ExecuteNetwork. If it isn't, you should be able to build it by adding --armnn-cmake-args='-DBUILD_TESTS=1' to your build-armnn.sh command. You can then run ExecuteNetwork with the delegate like this: ./ExecuteNetwork --compute CpuAcc --model-path model.tflite. I would also advise trying --compute CpuRef to see if you get a different result.

Lastly, have you tried any other models with the Arm NN TfLite Delegate just to see if they run? Thanks!

Kind regards,

Matthew

matthewsloyanARM commented 1 year ago

Hi @jlamperez,

I am going to close this issue due to inactivity. However, feel free to reopen it anytime if the issue still persists or if you have any questions. Thanks again for getting in touch!

Kind regards,

Matthew

jlamperez commented 1 year ago

Hi @matthewsloyanARM

No problem.

I tried some other models with the Arm NN TfLite Delegate and they worked.

Do you know when Where and SparseToDense are going to be supported?

Thanks!

TeresaARM commented 1 year ago

Hi Jorge,

thank you very much for your feedback.

Those 2 operators are not part of our roadmap at the moment.

When using the Arm NN TfLite Delegate, operators that are not supported in Arm NN should fall back to TFLite. Which error are you seeing?

Kind Regards

jlamperez commented 1 year ago

Hi @TeresaARM,

I have two models, and in both I get this error:

Attempting to use a delegate that only supports static-sized tensors with a graph that has dynamic-sized tensors

I have converted the models with static-sized tensors, but I'm still having the same problems I had in https://github.com/ARM-software/armnn/issues/716#issuecomment-1431695672. Arm NN is not falling back to TFLite, which is strange, because the model works for me when I don't use the Arm NN delegate.

How can I improve the debugging, and how should I contribute support for the unsupported Where and SparseToDense layers?

TeresaARM commented 1 year ago

Hi @jlamperez,

thank you for your message. If you are willing to contribute, you can create a Gerrit review account at mlplatform.org and make your contribution there. You can use your GitHub credentials when creating your account. The process for contributing to Arm NN is outlined in this Contributor Guide.

The Where and SparseToDense layers would not be very straightforward.

However, when using the Arm NN TFLite delegate, unsupported layers should fall back to TFLite. Would it be possible for you to attach the model in a message in this thread?

Kindest Regards