The Performance of the Deployed Model on Android is Far from What on the PC

pytorch / android-demo-app

PyTorch android examples of usage in applications


The Performance of the Deployed Model on Android is Far from What on the PC #195

Closed joeshow79 closed 2 years ago

joeshow79 commented 2 years ago

Hi, I trained a model to detect steel rebar, based on the yolov5x model. The test results are good on PC. I followed the guide (https://github.com/pytorch/android-demo-app/pull/185) to convert the model to a TorchScript model (ptl) and integrate it into the demo app. The demo app runs and outputs results, but there is a huge gap between the results on PC and in the app; see the comparison below.

Result on PC (confidence threshold is 0.25)

Result on App (confidence threshold is 0.2)

I also tried the --optimize option when exporting the model:

python3 /workspace/src/github/yolov5/export.py --weight runs/train/exp32/weights/best.pt --include torchscript --optimize

But it made no significant difference. So far I have no more clues to figure this out ...

Any tips or suggestions are appreciated, thanks!

joeshow79 commented 2 years ago

I found that the problem is partly related to the code:

1. mNmsLimit in PrePostProcess.java defaults to 15, which is too low for detecting a large number of objects.
2. The confidence filter and the NMS filter share the same threshold (mThreshold in PrePostProcess.java), which makes it hard to tune performance.
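The two points above can be sketched in a few lines. This is a minimal, language-neutral illustration in Python, not the demo app's Java code: the function names, the tuple layout, and the default values (0.25, 0.45, 1000) are assumptions chosen for the example. It shows a confidence filter and a greedy NMS that each take their own threshold, plus a separate cap on the number of kept boxes.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def filter_and_nms(detections, conf_threshold=0.25, iou_threshold=0.45, limit=1000):
    """Filter by confidence, then run greedy NMS.

    detections: list of (score, (x1, y1, x2, y2)) tuples.
    The three parameters are independent, so each can be tuned on its own.
    """
    # Step 1: confidence filter with its own threshold.
    candidates = [d for d in detections if d[0] >= conf_threshold]
    # Step 2: greedy NMS, highest score first, with a separate IoU threshold.
    candidates.sort(key=lambda d: d[0], reverse=True)
    kept = []
    for score, box in candidates:
        if len(kept) >= limit:  # cap on the number of output boxes
            break
        if all(iou(box, k[1]) <= iou_threshold for k in kept):
            kept.append((score, box))
    return kept
```

With a low cap such as 15, the loop stops after 15 boxes no matter how many valid objects remain, which is why a dense scene needs a larger limit.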

After fixing the above 2 issues, the output got better, but it is still not as good as the inference result on PC. 3 significant issues remain:

1. The confidence of the proposed bounding boxes is very small (distributed from 0.) compared to the inference result on PC.
2. Many false positives with high confidence (>0.9) and small rects (I suspect the rect size is negative), marked in the figure with green circles.
3. Still many false negatives (recall is not good).

joeshow79 commented 2 years ago

Finally worked out the reason:

1. The dimension of the output prediction tensor was wrong. There is 1 class, so each prediction has 6 values, not 5 (which was wrong).
2. Increase the maximum number of detections to 1000 (from 15).

And one minor change: compute the final confidence as conf = object_conf * cls_conf.
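To make the row-size fix concrete, here is a minimal Python sketch of decoding a flat output buffer. It assumes the usual YOLOv5 row layout of [cx, cy, w, h, object_conf, one score per class], so a row has 5 + num_classes values (6 for a single class). The function names are hypothetical and this is not the demo app's Java code.

```python
def decode_row(row, num_classes):
    """Decode one prediction row laid out as [cx, cy, w, h, obj_conf, cls_0, ...]."""
    expected = 5 + num_classes
    if len(row) != expected:
        raise ValueError(f"expected {expected} values per prediction, got {len(row)}")
    cx, cy, w, h, obj_conf = row[:5]
    class_scores = row[5:]
    # Pick the best class, then combine it with the objectness score.
    cls_id = max(range(num_classes), key=lambda i: class_scores[i])
    conf = obj_conf * class_scores[cls_id]
    # Convert center/size to corner coordinates.
    box = (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)
    return box, conf, cls_id


def decode_output(flat, num_classes):
    """Split a flat output buffer into rows of 5 + num_classes values and decode each."""
    stride = 5 + num_classes
    if len(flat) % stride != 0:
        raise ValueError("output length is not a multiple of the row size")
    return [decode_row(flat[i:i + stride], num_classes)
            for i in range(0, len(flat), stride)]
```

Reading the same buffer with a stride of 5 instead of 6 shifts every row after the first, so widths, heights, and confidences are taken from the wrong positions. That is consistent with the symptoms reported earlier in the thread, such as tiny or negative box sizes and implausible confidences.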

Karl8 commented 2 years ago

Thank you joeshow79. I had the same issue, and I solved it by:

1. Increasing mNmsLimit to 300.
2. Padding the input image with rgb=(114, 114, 114) before resizing it, like the yolov5 PC version in dataset.py:222. The function letterbox() resizes and pads the image while meeting stride-multiple constraints.
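The letterbox geometry can be sketched as follows. This is a simplified Python illustration, not yolov5's letterbox() itself: it pads to a fixed target size (640x640 is an assumption for the example) rather than to the nearest stride multiple, and the function names are hypothetical. It computes the aspect-preserving scale and the padding offsets, and maps a detected box back to source-image coordinates. The padded border would be filled with the gray value (114, 114, 114) mentioned above.

```python
def letterbox_params(src_w, src_h, dst_w=640, dst_h=640):
    """Return (scale, new_w, new_h, pad_left, pad_top) for an aspect-preserving resize."""
    # Use the smaller ratio so the whole image fits inside the target.
    scale = min(dst_w / src_w, dst_h / src_h)
    new_w, new_h = round(src_w * scale), round(src_h * scale)
    # Center the resized image; the remaining border is padding.
    pad_left = (dst_w - new_w) // 2
    pad_top = (dst_h - new_h) // 2
    return scale, new_w, new_h, pad_left, pad_top


def box_to_source(box, scale, pad_left, pad_top):
    """Map a box (x1, y1, x2, y2) from letterboxed coordinates back to the source image."""
    x1, y1, x2, y2 = box
    return ((x1 - pad_left) / scale, (y1 - pad_top) / scale,
            (x2 - pad_left) / scale, (y2 - pad_top) / scale)
```

For a 1280x720 source and a 640x640 target, the image is scaled by 0.5 to 640x360 and padded by 140 pixels at the top and bottom. Stretching the image to 640x640 without padding would instead distort object shapes relative to what the model saw during training.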