The training results are not good

zzh8829 / yolov3-tf2

YoloV3 Implemented in Tensorflow 2.0

MIT License

The training results are not good #221

Open SunshineJZJ opened 4 years ago

SunshineJZJ commented 4 years ago

Hi, I just followed the tutorial. After about 20 epochs, training stopped early. Then I ran detect.py, but the result was not good. Then I used k-means to find the anchors: [(14, 24), (24, 65), (49, 41), (57, 108), (87, 212), (134, 118), (166, 295), (284, 193), (355, 366)]. I trained again for about 19 epochs, but the result was also bad: [image: output (copy)]
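For readers who want to reproduce the anchor search, here is a minimal sketch of k-means over box widths and heights using 1 - IoU as the distance, which is the approach described in the YOLO papers. The function names `iou_wh` and `kmeans_anchors` and the synthetic boxes are hypothetical, not part of yolov3-tf2, and this is not necessarily the script used in the comment above.

```python
import numpy as np


def iou_wh(boxes, centroids):
    """IoU between (N, 2) box sizes and (K, 2) centroid sizes, all aligned at the origin."""
    inter = (np.minimum(boxes[:, None, 0], centroids[None, :, 0]) *
             np.minimum(boxes[:, None, 1], centroids[None, :, 1]))
    area_boxes = boxes[:, 0] * boxes[:, 1]
    area_centroids = centroids[:, 0] * centroids[:, 1]
    return inter / (area_boxes[:, None] + area_centroids[None, :] - inter)


def kmeans_anchors(boxes, k=9, iters=100, seed=0):
    """Cluster (width, height) pairs into k anchors, sorted by area."""
    rng = np.random.default_rng(seed)
    boxes = np.asarray(boxes, dtype=np.float64)
    # Start from k distinct boxes picked at random.
    centroids = boxes[rng.choice(len(boxes), size=k, replace=False)]
    for _ in range(iters):
        # Highest IoU is the same as smallest (1 - IoU) distance.
        assign = np.argmax(iou_wh(boxes, centroids), axis=1)
        updated = centroids.copy()
        for j in range(k):
            members = boxes[assign == j]
            if len(members):  # keep the old centroid if a cluster is empty
                updated[j] = members.mean(axis=0)
        if np.allclose(updated, centroids):
            break
        centroids = updated
    return centroids[np.argsort(centroids[:, 0] * centroids[:, 1])]


# Synthetic (width, height) pairs in pixels, only for illustration.
demo_boxes = np.random.default_rng(1).uniform(10, 400, size=(200, 2))
print(np.round(kmeans_anchors(demo_boxes, k=9)).astype(int))
```

Before swapping new anchors into the model, it is worth checking how the code expects them to be scaled (pixels versus fractions of the input size), since a mismatch there could hurt results on its own.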

I noticed this in the loss function:

xy_loss = obj_mask * box_loss_scale * tf.reduce_sum(tf.square(true_xy - pred_xy), axis=-1)
wh_loss = obj_mask * box_loss_scale * tf.reduce_sum(tf.square(true_wh - pred_wh), axis=-1)
obj_loss = binary_crossentropy(true_obj, pred_obj)
obj_loss = obj_mask * obj_loss + (1 - obj_mask) * negative_mask * obj_loss
class_loss = obj_mask * sparse_categorical_crossentropy(true_class_idx, pred_class)

YOLOv3 loss calculation formula: [image: TIM screenshot 20200401112521]

The code does not have lambda_obj, lambda_noobj, or lambda_class. Why is that?

Is there any way to get better results? Could the author provide a trained model? Thanks

zzh8829 commented 4 years ago

Hi, I would say the result is pretty good from your image. I believe the lambda_obj, lambda_noobj, and lambda_class parameters were present in previous versions of YOLO but were removed in YOLOv3. At least I didn't see them in the paper or the original darknet implementation. To get the best training result you need to fine-tune the hyperparameters and the transfer learning process. I did not train the model myself, and the examples I gave were definitely not the best way to train. I recommend incorporating hyperparameter search as well as image augmentation in your training process.
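For anyone who still wants to experiment with those weights, here is a minimal NumPy sketch of how they could be applied to the loss terms quoted earlier in the thread. The functions `weighted_obj_loss` and `total_loss` are hypothetical and not part of yolov3-tf2, and the sketch assumes the four terms are simply summed. With every weight left at 1.0 it reduces to the unweighted form.

```python
import numpy as np


def weighted_obj_loss(bce, obj_mask, negative_mask,
                      lambda_obj=1.0, lambda_noobj=1.0):
    """Objectness loss with separate weights for object and no-object cells."""
    positive = lambda_obj * obj_mask * bce
    negative = lambda_noobj * (1 - obj_mask) * negative_mask * bce
    return positive + negative


def total_loss(xy_loss, wh_loss, obj_loss, class_loss, lambda_class=1.0):
    """Sum the four terms, scaling only the class term (an assumption of this sketch)."""
    return xy_loss + wh_loss + obj_loss + lambda_class * class_loss


# Toy per-cell values: one object cell, one counted background cell,
# and one background cell masked out by negative_mask.
bce = np.array([0.2, 0.4, 0.6])
obj_mask = np.array([1.0, 0.0, 0.0])
negative_mask = np.array([1.0, 1.0, 0.0])

print(weighted_obj_loss(bce, obj_mask, negative_mask))
print(weighted_obj_loss(bce, obj_mask, negative_mask, lambda_noobj=0.5))
```

Lowering lambda_noobj reduces how much the many background cells contribute relative to the few object cells. Whether that helps on a given dataset is something to test, for example as part of a hyperparameter search.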

SunshineJZJ commented 4 years ago

I see! Thank you for your reply!

tehtea commented 3 years ago

Hi @SunshineJZJ, did you manage to improve the training results in the end? I followed the tutorial too and ran the following command:

python train.py \
    --dataset ./data/voc2012_train.tfrecord \
    --val_dataset ./data/voc2012_val.tfrecord \
    --classes ./data/voc2012.names \
    --num_classes 20 \
    --mode fit --transfer none \
    --batch_size 16 \
    --epochs 100

But training stopped early at epoch 19, and I ran the following command to evaluate the model:

python detect.py \
    --classes ./data/voc2012.names \
    --num_classes 20 \
    --weights ./checkpoints/yolov3_train_19.tf \
    --image ./data/meme.jpg \
    --yolo_iou_threshold 0.1 \
    --yolo_score_threshold 0.1

And this was the result:

I have not tried changing the anchors to the ones you found with k-means, though.
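On the early stop around epoch 19 reported in both comments: a minimal pure-Python sketch of patience-based early stopping may help explain it. This assumes train.py stops through a patience-style callback that watches validation loss, which is worth confirming in the script. `early_stop_epoch` and the loss values are hypothetical.

```python
def early_stop_epoch(val_losses, patience=3, min_delta=0.0):
    """Return the 1-based epoch at which training would stop, or None if it never stops."""
    best = float("inf")
    wait = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best - min_delta:
            # Validation loss improved: remember it and reset the counter.
            best = loss
            wait = 0
        else:
            # No improvement this epoch.
            wait += 1
            if wait >= patience:
                return epoch
    return None


# Made-up validation losses that plateau after the third epoch.
losses = [5.0, 4.0, 3.0, 3.1, 3.2, 3.05]
print(early_stop_epoch(losses, patience=3))
```

If that is the mechanism at work, a larger patience would let training run longer, though it would not by itself guarantee better detections.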