david8862 / keras-YOLOv3-model-set

end-to-end YOLOv4/v3/v2 object detection pipeline, implemented on tf.keras with different technologies

Question about result of yolov4-cspdarknet53 model #167

Open · ermubuzhiming opened this issue 3 years ago

ermubuzhiming commented 3 years ago

I ran

python tools/model_converter/convert.py --yolo4_reorder cfg/yolov4.cfg weights/yolov4.weights weights/yolov4.h5

to convert the pretrained weights and then started training. Today I found that the weights could not be loaded. The training command was:

python train.py --model_type=yolo4_darknet --anchors_path=configs/yolo4_anchors.txt --weights_path=./weights/cspdarknet53.h5 --annotation_file=2007_train_test.txt --val_annotation_file=2007_val.txt --classes_path=./configs/car_classes.txt --batch_size=4 --freeze_level=1 --total_epoch=115 --enhance_augment=mosaic --label_smoothing=0.01 --save_eval_checkpoint --data_shuffle --transfer_epoch=5

The error is shown in the screenshot below. [image] The first run trained normally, but the second run failed. When I tried loading other weights saved during the first run, training ran normally. I don't know the reason. I am using my own dataset and class count; for the second run I did not change any code, only the weight save path and the NMS parameter values. The command line was unchanged, but restarting training raised the error. Thanks for your reply!
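
Since the error screenshot is not preserved here, one quick way to narrow down a weight-loading failure is to look at what the .h5 file actually contains. The sketch below is not part of the repo; it simply lists the layer names stored in a Keras HDF5 file with h5py, so they can be compared against the layers of the model being built:

```python
# Hypothetical sanity check (not part of the repo): list the layer names
# stored in a Keras .h5 file. A mismatch between these names and the model
# you are building is a common cause of "cannot load weights" errors.
import h5py

def list_layer_names(h5_path):
    with h5py.File(h5_path, "r") as f:
        # A full-model save keeps weights under "model_weights";
        # a weights-only save keeps "layer_names" at the top level.
        group = f["model_weights"] if "model_weights" in f else f
        return [n.decode("utf8") if isinstance(n, bytes) else n
                for n in group.attrs["layer_names"]]

if __name__ == "__main__":
    for name in list_layer_names("weights/cspdarknet53.h5")[:10]:
        print(name)
```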

ermubuzhiming commented 3 years ago

I solved the problem by removing the --weights_path=./weights/cspdarknet53.h5 option, but another issue occurred: the inference results differ between the two tools below. The test image is from BDD100K.

1. yolo.py

python yolo.py --image --model_type=yolo4_darknet --weights_path=logs/002-1/dump_trained_final.h5 --anchors_path=configs/yolo4_anchors.txt --classes_path=configs/car_classes.txt

Result: no boxes in the image. [image]

2. Inference demo run with OpenVINO

python3 yolov4-inference-demo.py -m=../david-cspdarknet53-fp16.xml -at yolov4 -i ../1.jpg -d CPU --label=car_classes.txt -pc -r

Result: far too many boxes in the image. [image]

I want to know why yolo.py returns zero detections.

david8862 commented 3 years ago

@ermubuzhiming did you try to evaluate the model during training? What was the mAP from the training-time evaluation?

ermubuzhiming commented 3 years ago

I did not evaluate during training because Xshell is not installed, so I could not view the image results. I used eval.py with the same weights file; my dataset is in VOC format.

python3 eval.py --model_path=./logs/002-1/dump_trained_final.h5 --anchors_path=configs/yolo4_anchors.txt --classes_path=./configs/car_classes.txt --annotation_file=2007_val.txt --save_result

The results look like this: too many boxes and wrong detections. [image] Did I use the wrong command? Can you help me check whether the parameters are suitable? Thanks for explaining in detail, since I am not familiar with the model. Waiting for your early reply!

david8862 commented 3 years ago

By default eval.py uses a very low conf_threshold (0.001) for bbox filtering in order to get a better mAP result, which leads to many FP predictions. You can try changing it to a more practical value such as --conf_threshold=0.1.
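
To see the effect concretely, here is a small self-contained sketch (with made-up predictions, not the repo's postprocess code) showing how many candidate boxes survive at different conf_threshold values:

```python
# Minimal illustration of why the conf_threshold used for mAP evaluation and
# the one used for visualization should differ: mAP needs almost every
# prediction kept, while a drawn image only needs the confident ones.
import numpy as np

def filter_by_confidence(boxes, scores, conf_threshold):
    """Keep only predictions whose score is at least conf_threshold."""
    keep = scores >= conf_threshold
    return boxes[keep], scores[keep]

# Fake predictions: 1000 candidate boxes with mostly low random scores.
rng = np.random.default_rng(0)
boxes = rng.uniform(0, 416, size=(1000, 4))
scores = rng.beta(0.5, 5.0, size=1000)

for thr in (0.001, 0.1, 0.6):
    _, kept = filter_by_confidence(boxes, scores, thr)
    print(f"conf_threshold={thr}: {kept.size} boxes kept")
```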

ermubuzhiming commented 3 years ago

I added --conf_threshold=0.6 and the result looks better, although there are still wrong detections. I also want to know why the result from yolo.py is bad. PS: before training and running image inference (train.py and yolo.py), I had already changed the values (max_boxes=20, confidence=0.6, iou_threshold=0.5) in yolo3/postprocess.py and postprocess_np.py. [image] If the epoch count is too small, are transfer_epoch=20 and total_epoch=100 OK? Thanks for your reply!

david8862 commented 3 years ago

Sounds weird if you can get a reasonable result with eval.py but not with yolo.py on the same model and the same image; the two use the same preprocess/postprocess implementation. Maybe you need to dump and compare the raw output tensors.
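
A minimal sketch of such a dump, assuming the saved .h5 can be loaded directly with tf.keras (the repo's custom layers may need to be passed via custom_objects, which is omitted here) and that the input image is simply resized and scaled to 0-1:

```python
# Hypothetical sketch: run one preprocessed image through the trained .h5
# model and print statistics of the raw head outputs, so the same dump can
# be produced from both the eval.py path and the yolo.py path and compared.
import numpy as np
from PIL import Image
import tensorflow as tf

def dump_raw_outputs(model_path, image_path, input_size=(416, 416)):
    # Loading may require custom_objects for the repo's custom layers.
    model = tf.keras.models.load_model(model_path, compile=False)
    img = Image.open(image_path).convert("RGB").resize(input_size)
    x = np.expand_dims(np.asarray(img, dtype=np.float32) / 255.0, axis=0)
    outputs = model.predict(x)
    outputs = outputs if isinstance(outputs, list) else [outputs]
    for i, out in enumerate(outputs):
        print(f"output {i}: shape={out.shape}, min={out.min():.4f}, "
              f"max={out.max():.4f}, mean={out.mean():.4f}")
    return outputs

# dump_raw_outputs("logs/002-1/dump_trained_final.h5", "1.jpg")
```

Running this on the same image from both code paths and comparing the printed statistics would show whether the difference comes from the forward pass or from the postprocess step.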

ermubuzhiming commented 3 years ago

Which output tensor do you mean? Should I print the output tensors from both eval.py and yolo.py? Also, I can see the model structure during training; could that be useful? [image] Thanks a lot!

ermubuzhiming commented 3 years ago

I may know the reason.

  1. eval.py draws the ground truth as black bboxes, but yolo.py does not.
  2. When I train the model for more epochs, the multi-bbox error gets better. The test image and the test method (eval.py) are the same as above, but the result is better, as shown below. [image]

If these reasons are true, why does training with too few epochs cause errors like multiple overlapping bboxes when testing with eval.py? Thanks for your reply!

david8862 commented 3 years ago

Great news. But the model's behavior before full convergence is not easy to analyze. Generally, you can check the loss trend for each part (box/confidence/class) during training.
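
For example, a hypothetical sketch that plots the per-part loss trend, assuming the individual loss terms have been recorded to a CSV file with columns named box_loss, confidence_loss and class_loss (adjust the file path and column names to match however you log them):

```python
# Hypothetical sketch: plot per-part training losses from a CSV log so the
# box / confidence / class trends can be inspected separately.
import csv
import matplotlib.pyplot as plt

def plot_loss_parts(csv_path, parts=("box_loss", "confidence_loss", "class_loss")):
    history = {p: [] for p in parts}
    with open(csv_path, newline="") as f:
        for row in csv.DictReader(f):
            for p in parts:
                history[p].append(float(row[p]))
    for p in parts:
        plt.plot(history[p], label=p)
    plt.xlabel("epoch")
    plt.ylabel("loss")
    plt.legend()
    plt.savefig("loss_parts.png")

# plot_loss_parts("logs/002-1/training_log.csv")  # assumed log location
```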