Closed liamsun2019 closed 1 month ago
Are you using TFLite 2.5? @MikeJKelly recently pointed out in my issue (https://github.com/ARM-software/armnn/issues/656) that 2.7 introduces some errors in ArmNN (x86_64 for sure) which need to be resolved.
Big thanks for your quick reply. I tried with TensorFlow 2.3 and 2.5; both cases seemed abnormal. One thing I am curious about is that I only used the prebuilt libraries and the pure C++ API to conduct the tests. Is that related to the TF Lite version? You can reproduce the issue with the archives I uploaded. It looks like the produced y and x values are both smaller than expected. I am not sure whether there are some incorrect computations in the library.
I also conducted similar tests using the TF Lite delegate approach. The inference results were correct. My test code is below:
```python
import numpy as np
import tflite_runtime.interpreter as tflite
import cv2 as cv

armnn_delegate = tflite.load_delegate(
    library="libarmnnDelegate.so",
    options={"backends": "CpuAcc,GpuAcc,CpuRef", "logging-severity": "info"})

interpreter = tflite.Interpreter(
    model_path="lite-model_movenet_singlepose_thunder_tflite_int8_4.tflite",
    experimental_delegates=[armnn_delegate])
interpreter.allocate_tensors()
input_details = interpreter.get_input_details()

image = cv.imread("test.jpg")
input_image = cv.cvtColor(image, cv.COLOR_BGR2RGB)
input_image = cv.resize(input_image, dsize=(256, 256))
input_image = input_image.reshape(-1, 256, 256, 3)

interpreter.set_tensor(input_details[0]['index'], input_image.astype(np.uint8))
interpreter.invoke()

output_details = interpreter.get_output_details()
keypoints_with_scores = interpreter.get_tensor(output_details[0]['index'])
keypoints_with_scores = np.squeeze(keypoints_with_scores)
print(keypoints_with_scores)
```
The results are shown below (one keypoint per row):

```
0.1766229  0.6021235  0.92727023
0.16056627 0.6021235  0.95135516
0.16056627 0.5820527  0.88311446
0.16056627 0.5579678  0.81888795
0.16056627 0.5218404  0.81888795
0.2609202  0.57402444 0.92727023
0.24887772 0.46162802 0.92727023
0.3532458  0.5619819  0.6262084
0.36127412 0.33718917 0.97142595
0.47367048 0.57402444 0.81888795
0.47768465 0.4174723  0.92727023
0.5178262  0.57402444 0.95135516
0.5258545  0.4816988  0.92727023
0.61416596 0.7506473  0.95135516
0.71853405 0.40141568 0.92727023
0.8349446  0.7265624  0.95135516
0.7466332  0.20070784 0.92727023
```
In contrast, the results are incorrect when I use the C++ API to run the same inference on the same input image (one keypoint per row):

```
0.132467 0.561982 0.734591
0.116411 0.549939 0.883114
0.120425 0.541911 0.927270
0.112396 0.501770 0.883114
0.108382 0.473670 0.626208
0.216764 0.537897 0.818888
0.204722 0.409444 0.927270
0.305076 0.473670 0.734591
0.329161 0.301062 0.971426
0.413458 0.545925 0.818888
0.449586 0.393387 0.883114
0.453600 0.521840 0.927270
0.461628 0.433529 0.927270
0.586067 0.722548 0.927270
0.690435 0.361274 0.927270
0.802831 0.694449 0.734591
0.710506 0.148524 0.983468
```
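To make the mismatch concrete, here is a small NumPy sketch comparing the first three keypoints from the two runs above; the `(y, x, score)` column labels follow MoveNet's usual output convention and are my assumption:

```python
import numpy as np

# First three keypoints from the two runs above (Python delegate vs. C++),
# each row assumed to be (y, x, score) per MoveNet's output convention.
delegate = np.array([
    [0.1766229,  0.6021235, 0.92727023],
    [0.16056627, 0.6021235, 0.95135516],
    [0.16056627, 0.5820527, 0.88311446],
])
parser = np.array([
    [0.132467, 0.561982, 0.734591],
    [0.116411, 0.549939, 0.883114],
    [0.120425, 0.541911, 0.927270],
])

# Maximum absolute deviation per column, to quantify the mismatch.
max_err = np.abs(delegate - parser).max(axis=0)
print(dict(zip(("y", "x", "score"), max_err)))
```

The deviation is well above int8 quantization noise, which is why this looks like a computation bug rather than rounding.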
Do you mean ArmNN's C++ API? AFAIK the delegate approach is the one currently supported; the 'standalone' one is being deprecated. You can still use the delegate mechanism from C++.
@MrSherish Yes, I wrote test.cpp with ArmNN's C++ API and got inaccurate results. What do you mean by 'standalone'? Also, a naive question: what does AFAIK stand for? I am a beginner and not familiar with ArmNN yet. Thanks for your time.
Hi @liamsun2019 ,
It's nice to hear that you have achieved correct output through the use of our Arm NN TF Lite Delegate. This is currently the preferred way to use Arm NN, whether through the Python or C++ APIs (you can use the TF Lite Delegate in C++ as well). When using the TF Lite Delegate with the CpuAcc or GpuAcc backends, I'd recommend not using CpuRef as a third-choice backend, as falling back to Google's reference TF Lite runtime is likely to be faster if an operator is not supported on CpuAcc/GpuAcc.
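Following that advice, the delegate options from the earlier Python example would drop CpuRef from the backend list (a sketch of just the options dict; the `tflite.load_delegate` call itself is unchanged):

```python
# Backend list without CpuRef: any op unsupported on CpuAcc/GpuAcc then
# falls back to Google's TF Lite runtime rather than Arm NN's slow
# reference backend.
delegate_options = {"backends": "CpuAcc,GpuAcc", "logging-severity": "info"}
# armnn_delegate = tflite.load_delegate(library="libarmnnDelegate.so",
#                                       options=delegate_options)
```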
Your zipped code uses our TF Lite Parser, which, unlike the TF Lite Delegate, does not fall back to the TF Lite runtime when an operator is unsupported in Arm NN. This means you may want to keep CpuRef as a fallback when using the TF Lite Parser. With respect to the issue you are having here, we will have to investigate.
Let us know more about how we can help.
Cheers, James
@MrSherish Also, a naive question, what does AFAIK represent?
AFAIK is short for 'as far as I know' :)
Hi @james-conroy-arm , My understanding is that, whether using the TF Lite Parser or the Delegate, both approaches should produce similar inference results. According to your comment, in the TF Lite Parser case, what happens if some ops are not supported at inference time? Are they ignored, is an exception thrown, or something else? I got no error message when using the TF Lite Parser approach.
Hi @MrSherish, Yes, I got it. ^_^
@liamsun2019 You're correct, the TF Lite Delegate and TF Lite Parser should produce identical results. This suggests there may be a bug in the TF Lite Parser, or that it is not being used appropriately. We will need to look into this for you...
If an operator is not supported when using the TF Lite Parser API, an error is shown to the user and inference will not execute. With the TF Lite Delegate, unsupported ops fall back (i.e. delegate) to Google's TF Lite runtime implementation. Based on what you said (inference executed successfully), all ops in your model are supported in Arm NN through the TF Lite Parser.
For more info on the TF Lite Delegate in general: https://www.tensorflow.org/lite/performance/implementing_delegate
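The two behaviours can be sketched schematically in Python (all class and function names here are hypothetical illustrations, not Arm NN API):

```python
class Backend:
    """Toy stand-in for an Arm NN backend with a fixed set of supported ops."""

    def __init__(self, name, supported_ops):
        self.name = name
        self.supported = set(supported_ops)

    def supports(self, op):
        return op in self.supported


def parser_style(ops, backend):
    # TF Lite Parser behaviour: any unsupported op aborts the whole inference.
    for op in ops:
        if not backend.supports(op):
            raise RuntimeError(f"unsupported op: {op}")
    return "ran on " + backend.name


def delegate_style(ops, backend):
    # TF Lite Delegate behaviour: unsupported ops fall back to the
    # TF Lite runtime instead of failing.
    return {op: backend.name if backend.supports(op) else "TfLiteRuntime"
            for op in ops}
```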
James
Hi James, Thanks, I've learned a lot. Until this issue is resolved, I will keep trying the TF Lite delegate in C++. As I posted in https://github.com/ARM-software/armnn/issues/659, I am being tortured by building the delegate library with the Android NDK. ^_^
Hi @james-conroy-arm, @catcor01, I have conducted some tests over the past few days against several models. In particular, with a few local modifications, I built the sample code for pose estimation. My tests showed that the inference results were incorrect in both C++ delegate mode and C++ parser mode. Here is my summary:
C++ parser mode
a. x86_64: As I already pointed out in this issue, the x and y outputs were both smaller than expected.
b. cortex-a55: x looked normal, while y and score were incorrect.

C++ delegate mode
a. x86_64: x looked normal, while y and score were incorrect.
b. cortex-a55: x looked normal, while y and score were incorrect.

Python delegate mode
The inference results are correct.
Please refer to the attachment for model and inference results. My guess is that there may be something wrong in the computation. Thanks for your time. pose_test.zip
Hi author, I ran some tests based on ArmNN-linux-x86_64 to perform inference with an int8 QAT TF Lite model. I wrote a sample program and ran the executable under Ubuntu 18.04. The results looked incorrect compared to the results from running in Python.
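For reference, one common pitfall with int8 models is forgetting to map the raw int8 outputs back to real values using the scale and zero point reported in `get_output_details()[0]['quantization']`. A minimal sketch of that mapping, with hypothetical quantization parameters:

```python
import numpy as np

# Hypothetical (scale, zero_point), as returned by
# interpreter.get_output_details()[0]['quantization'].
scale, zero_point = 0.00390625, -128

def dequantize(q, scale, zero_point):
    """Map raw int8 tensor values back to real-valued outputs."""
    return (q.astype(np.float32) - zero_point) * scale

raw = np.array([-128, 0, 127], dtype=np.int8)
real = dequantize(raw, scale, zero_point)
print(real)  # maps the int8 range onto [0.0, ~1.0] for these parameters
```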
Any suggestions are appreciated. Thanks.