ppogg / YOLOv5-Lite

🍅🍅🍅YOLOv5-Lite: Evolved from yolov5 and the size of model is only 900+kb (int8) and 1.7M (fp16). Reach 15 FPS on the Raspberry Pi 4B~
GNU General Public License v3.0
2.27k stars, 407 forks

unexpected result #92

Closed natxopedreira closed 2 years ago

natxopedreira commented 2 years ago

Hello

Sorry if this is more a question than an issue; I'm not sure if I'm doing something wrong.

I trained a new model from the v5lite-s weights for the single class "person", using the COCO + VOC datasets with the 1k background images as negative samples and an input size of 320. I did not change any hyperparameters, and I am using the ncnn framework.

I get: mAP@0.5:0.95 of 0.498, mAP@0.5 of 0.795, precision 0.84, recall 0.68.

I was expecting a better result than the provided model trained on all COCO classes, but as you can see there are at least two persons not detected.

Maybe it is a problem with the model conversion?

Thanks

Standard coco model

result-coco

New coco_voc model

result-coco-voc

ppogg commented 2 years ago

Hi, you are welcome to use YOLOv5-Lite. I don't understand what you mean: is your COCO + VOC dataset 70,000 images in total? Or did you just use 1,000 VOC images to fine-tune the COCO-pretrained model?

natxopedreira commented 2 years ago

Hi, yes, my dataset is the full COCO annotations plus the full VOC set for the class "person", so around 70,000 images.

I mean I included 1,000 "non-person" images as negative samples; the COCO dataset has around 1k images without annotations that contain no person.

I also see another problem: when I change the input to a webcam, I get this weird result.

Screenshot-2021-12-28-10-11-23

ppogg commented 2 years ago

See https://github.com/ppogg/YOLOv5-Lite/issues/89 for the usual causes of a detection box that fills the whole screen. In addition, I noticed that your recall is very low. You can try lowering the threshold, or use 640×640 for training and 352×352 for inference.
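To illustrate the threshold suggestion, here is a minimal sketch of how lowering a confidence threshold can trade precision for recall. The detection scores, the ground-truth count, and the helper name `precision_recall` are all hypothetical and are not taken from this issue or from the YOLOv5-Lite code.

```python
# Hypothetical detections: (confidence, is_true_person). Illustrative values only.
detections = [
    (0.91, True), (0.72, True), (0.48, True),
    (0.33, True), (0.30, False), (0.12, False),
]
TOTAL_PERSONS = 5  # assumed ground-truth count; one person is never detected


def precision_recall(dets, total_gt, conf_thres):
    """Keep detections scoring at least conf_thres and compute precision/recall."""
    kept = [is_tp for conf, is_tp in dets if conf >= conf_thres]
    tp = sum(kept)
    precision = tp / len(kept) if kept else 0.0
    recall = tp / total_gt
    return precision, recall


for thres in (0.5, 0.25):
    p, r = precision_recall(detections, TOTAL_PERSONS, thres)
    print(f"conf_thres={thres}: precision={p:.2f} recall={r:.2f}")
```

With these made-up numbers, dropping the threshold from 0.5 to 0.25 recovers two more true detections at the cost of one false positive.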

natxopedreira commented 2 years ago

Thank you, I will train at 640x480.

I don't understand what is wrong with the param file, as you pointed out in #89. Can you please explain? I think part of the information is lost when translating the comments.

Thanks a lot

The last lines of my param file are:

`Permute                  Transpose_469            1 1 650 output 0=1
Convolution              Conv_470                 1 1 611_splitncnn_0 652 0=18 1=1 5=1 6=2304
Reshape                  Reshape_484              1 1 652 670 0=400 1=6 2=3
Permute                  Transpose_485            1 1 670 671 0=1
Convolution              Conv_486                 1 1 631 672 0=18 1=1 5=1 6=4608
Reshape                  Reshape_500              1 1 672 690 0=100 1=6 2=3
Permute                  Transpose_501            1 1 690 691 0=1`

And opening the ONNX file in Netron, I get this as output:

Screenshot-2021-12-28-10-38-49
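One way to read the numbers in the param lines above is sketched below, assuming the usual YOLOv5-style head strides of 8, 16 and 32 and 3 anchors per scale (both are assumptions, not confirmed in this thread). Under those assumptions, `0=400` and `0=100` in the Reshape layers match the 20×20 and 10×10 grids of a 320×320 input, `1=6` matches 5 box/objectness values plus 1 class, and `0=18` in the Convolution layers matches 3 × 6 output channels. The helper names are hypothetical.

```python
STRIDES = (8, 16, 32)  # assumed YOLOv5-style head strides


def grid_cells(width, height, strides=STRIDES):
    """Number of grid cells per detection scale for a given input size."""
    return [(width // s) * (height // s) for s in strides]


def head_channels(num_classes, num_anchors=3):
    """Output channels of each detection conv: anchors * (x, y, w, h, obj + classes)."""
    return num_anchors * (5 + num_classes)


print(grid_cells(320, 320))  # grid cells for the 320x320 export
print(grid_cells(352, 352))  # grid cells if inference runs at 352x352
print(head_channels(1))      # channels for a single "person" class
```

If the Reshape layers hard-code 400 and 100 while the input is fed at a different size such as 352×352, the cell counts no longer agree. Whether that is the cause discussed in #89 is not confirmed here; ncnn's Reshape layer does accept -1 for a dimension that should be inferred, which may be worth checking against the linked tutorial.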

ppogg commented 2 years ago

Hi, this is a tutorial: https://blog.csdn.net/weixin_45829462/article/details/119787840

natxopedreira commented 2 years ago

Thanks!