Closed: michaelnguyen11 closed this issue 2 years ago
@michaelnguyen11 would you mind sharing your model?
Hi @sunshinemyson ,
Here is my model : https://drive.google.com/file/d/18MrSmLv1T5rKEknBj2C5In6jmZ7bhAuW/view?usp=sharing . You can refer to this repository to do post-processing : https://github.com/david8862/keras-YOLOv3-model-set/tree/master/inference/tflite
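For anyone reproducing this, the post-processing in that repository essentially decodes the raw YOLO output tensors and then runs non-max suppression. A minimal NumPy sketch of the NMS step (the box layout `[x1, y1, x2, y2]` and the IoU threshold are illustrative assumptions, not taken from that repo):

```python
import numpy as np

def iou(box, boxes):
    # Vectorized IoU of one box [x1, y1, x2, y2] against many boxes.
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area = lambda b: (b[..., 2] - b[..., 0]) * (b[..., 3] - b[..., 1])
    return inter / (area(box) + area(boxes) - inter + 1e-9)

def nms(boxes, scores, iou_thresh=0.45):
    # Greedy non-max suppression; returns indices of kept boxes,
    # highest-scoring first.
    order = np.argsort(scores)[::-1]
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(int(i))
        rest = order[1:]
        order = rest[iou(boxes[i], boxes[rest]) < iou_thresh]
    return keep
```

Comparing the kept boxes between CPU and delegate runs (on identical raw tensors) helps separate post-processing bugs from delegate bugs.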
Hi @sunshinemyson ,
I've installed the latest TIM-VX code with i.MX BSP 5_10_52-2_1_0, but the result is still the same: with the VX Delegate the model can't detect objects, while with NNAPI it can.
@michaelnguyen11 ,
We observe similar issue as you. The team is checking it right now.
@michaelnguyen11 , after checking the model, we found lots of dequantize ops with constant inputs. We do not support constant folding so far, so such constant inputs cannot be handled properly. Could you try to refine the model to remove those dequantize operations?
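For background, the single `quantize`/`dequantize` pair left at the graph edges after refinement just applies the standard TFLite affine mapping `q = round(x / scale) + zero_point`. A small sketch of that arithmetic (the scale and zero-point values below are made up for illustration):

```python
import numpy as np

def quantize(x, scale, zero_point):
    # real -> uint8, per the TFLite affine scheme: q = round(x / scale) + zp
    q = np.round(x / scale) + zero_point
    return np.clip(q, 0, 255).astype(np.uint8)

def dequantize(q, scale, zero_point):
    # uint8 -> real: x = (q - zp) * scale
    return (q.astype(np.float32) - zero_point) * scale

scale, zp = 1.0 / 127.5, 128          # example params covering roughly [-1, 1]
x = np.array([-1.0, 0.0, 0.5], dtype=np.float32)
roundtrip = dequantize(quantize(x, scale, zp), scale, zp)
```

The round trip loses at most about one quantization step (`scale`), which is why a model with only edge quantize/dequantize ops should still produce accurate detections.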
Hi @sunshinemyson ,
Sorry for late response.
I refined the model, so now it contains only one `quantize` op at the input and one `dequantize` op at the output.
However, the result is still the same: with the VX Delegate the model can't detect objects, while with NNAPI it can.
Please help to check. Thanks in advance !!!
https://drive.google.com/drive/u/0/folders/1RVIRvLA7FN8iLfHAFqJiRedJYBLTdbk3
Hi, could this be related?
@bkovalenkocomp ,
I didn't see any constant input in the graph of your model. You can try a layer dump for debugging:
https://github.com/VeriSilicon/TIM-VX/issues#issuecomment-986138283
Yes, we will check it ASAP.
Hi @michaelnguyen11
I'm not 100% sure but I might know what is going on.
I'm having a similar issue with my YOLOv4-Tiny model which has two output tensors. I've quantized it and deployed to the i.MX8MPlus.
I can tell you that, when using this delegate, one of the output tensors does not deliver consistent results. The other output tensor gives good results. In other words, the model only detects "big objects" and misses the small ones when using the NPU.
I've analyzed the model and I found out there is a "Resize"/"Upsample" layer that is not working properly. This layer is located in parallel after the first output which is the one that delivers valid results. I believe other YOLO models follow this pattern.
In my case, the Resize/Upsample layer simply transforms a 1x13x13x128 tensor into a 1x26x26x128 tensor. For some reason, this delegate converts the layer into two sequential Deconvolution layers (1x13x13x128 to 1x13x13x512 to 1x26x26x128). I suspect this is causing the issue.
I dumped the profiling info of these two layers: vx_debug.txt
Unfortunately, I have no idea how to fix this.
Thanks.
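For what it's worth, a 2x nearest-neighbor upsample can be expressed exactly as a stride-2 transposed convolution with a per-channel 2x2 kernel of ones, which is presumably the kind of rewrite the delegate attempts (though the intermediate 1x13x13x512 shape suggests its actual decomposition differs). A NumPy sketch of the float-domain equivalence:

```python
import numpy as np

def upsample_nearest_2x(x):
    # x: (N, H, W, C) -> (N, 2H, 2W, C), nearest-neighbor
    return x.repeat(2, axis=1).repeat(2, axis=2)

def transposed_conv_2x_ones(x):
    # Stride-2 transposed conv with a per-channel 2x2 kernel of ones:
    # each input pixel is "stamped" onto a 2x2 block of the output.
    n, h, w, c = x.shape
    out = np.zeros((n, 2 * h, 2 * w, c), dtype=x.dtype)
    for dy in range(2):
        for dx in range(2):
            out[:, dy::2, dx::2, :] += x   # kernel weight 1.0 at (dy, dx)
    return out

x = np.random.rand(1, 13, 13, 4).astype(np.float32)
```

The two paths agree bit-exactly in float; in a quantized graph, however, the deconvolution path goes through requantization, which is where divergence could creep in.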
That's interesting, I have upsample layers in my model too.
Hi @dmartinez-quercus @bkovalenkocomp,
The vx-delegate transforms resize into deconvolution in some cases; you can turn off this feature and try again.
The trigger is at https://github.com/VeriSilicon/tflite-vx-delegate/blob/c862e75266bcccae1fdd2d6b91c9017c5d04a918/op_map.cc#L1020. Set it to false.
I hope this solves your problem.
Hi @liyuenan2333
I've tried what you suggested. The vx-delegate now interprets this layer as "resize" instead of two "deconvolution" layers.
I'm seeing an important improvement. Smaller objects are now detected the same way NNAPI does. Performance is good too.
However, now VIV_VX_DEBUG_LEVEL is printing the following message:
Kernel "com.vivantecorp.extension.evis.resize_nearest_U8toU8_op" does not exist
Not sure if this is relevant but so far it's working better.
Thank you very much.
@dmartinez-quercus You don't have to worry about this log; it doesn't matter at all.
Hi @dmartinez-quercus , @liyuenan2333 ,
Sorry for the late response, I've just come back from vacation.
I changed `bool can_resize_to_transposeconv = false` and updated TIM-VX to the latest commit, which includes the #250 MR.
The VX Delegate can now detect very large objects, but it still misses small objects compared to NNAPI. For example: 1/ with the NNAPI Delegate: 2/ with the VX Delegate:
The VX Delegate works better, but I think the problem has not been completely solved yet.
Hmmm... I did not try the latest commit. I just set `bool can_resize_to_transposeconv = false` in the TIM-VX version I already had (early Dec 21) and it worked for YOLOv4-Tiny (2 outputs). However, you may be using a different YOLO model. If so, you might need to set the VX debugging/profiling environment variables before running the tests, in order to carefully check for any layer inconsistency when TIM-VX builds the model graph, as I did.
Just for completeness, I'll mention my findings here:
Test for my model (INT8 graph): 2 faces in the image, a big one and a small one.
On x86 the probability scores were 0.99 for the big face and 0.98 for the small one; on the A311D NPU the scores are 0.74 for the big face and 0.98 for the small one.
The other outputs look fine (landmarks, features), but the difference in the scores is suspicious. Maybe there is a bug in the softmax layer?
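One way to sanity-check the softmax suspicion offline is to compare a float softmax against its uint8-quantized counterpart. TFLite conventionally fixes the uint8 softmax output to scale 1/256 with zero point 0, so quantization of the output alone can only introduce sub-1/256 error and cannot explain a 0.99 vs 0.74 gap. A hedged NumPy sketch (this models only the output quantization, not the NPU's actual kernel):

```python
import numpy as np

def softmax(x):
    # Numerically stable float softmax.
    e = np.exp(x - x.max())
    return e / e.sum()

def quantize_softmax_out(p, scale=1.0 / 256, zero_point=0):
    # TFLite commonly fixes uint8 softmax output to scale 1/256, zero point 0.
    q = np.clip(np.round(p / scale) + zero_point, 0, 255).astype(np.uint8)
    return q * scale

logits = np.array([5.0, 0.5], dtype=np.float32)   # illustrative values
p = softmax(logits)
q = quantize_softmax_out(p)
err = np.abs(q - p).max()
```

If the NPU's score differs from the x86 score by far more than `1/256`, the error is upstream of (or inside) the softmax kernel, not an artifact of output quantization.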
Update: in my case, changing `can_resize_to_transposeconv` makes no difference. I bet on the softmax bug ;-)
The softmax bug has been fixed in TIM-VX.
Dear Supporters,
I'm using the i.MX 8M Plus EVK board with BSP 5_10_52-2_1_0, which uses TIM-VX version 1.1.32 (the BSP Yocto build clones TIM-VX from here: https://github.com/NXPmicro/tim-vx-imx).
When I use the VX Delegate to run a YOLOv3 TFLite model, it produces wrong output. However, with the TFLite NNAPI Delegate the model works correctly.
For example, with the same model and the same pre-processing and post-processing:
with the NNAPI delegate:
with the VX delegate:
With the VX Delegate, the FPS is 3 times higher than with the NNAPI Delegate, so it would be great if the VX Delegate could produce correct results.
Have you met this kind of issue before? Could you guide me in fixing it?
Many thanks in advance!
Regards, Hiep
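A practical way to narrow down divergence like the one in this thread is to run the same input through both delegates and compare the raw output tensors before any post-processing. A small, pure-NumPy comparison helper (the tolerance values are arbitrary starting points, not a recommendation):

```python
import numpy as np

def compare_outputs(ref, test, rtol=0.0, atol=0.05, name="output"):
    # Report elementwise divergence between a reference output tensor
    # (e.g. from NNAPI) and a test output tensor (e.g. from the VX delegate).
    ref = np.asarray(ref, dtype=np.float32)
    test = np.asarray(test, dtype=np.float32)
    diff = np.abs(ref - test)
    frac_bad = float((diff > atol + rtol * np.abs(ref)).mean())
    return {
        "name": name,
        "max_abs_diff": float(diff.max()),
        "mean_abs_diff": float(diff.mean()),
        "frac_mismatched": frac_bad,
    }

# Example with synthetic data: a tensor that diverges in one element.
ref = np.zeros((4, 4), dtype=np.float32)
test = ref.copy()
test[0, 0] = 0.3
report = compare_outputs(ref, test)
```

Running this per output tensor quickly shows whether the failure is global (wrong layout or quantization parameters) or localized to one branch of the graph, as with the resize/upsample branch discussed above.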