Bad training results after yolov3 transfer learning with reduced coco2017 dataset

Bienqq commented 5 years ago

Hello,

I was trying to perform yolov3 transfer learning, starting from the COCO weights as described in the README ("train from COCO weights"). My coco2017 dataset contains only vehicles (5 classes), whereas the original model has 80 classes. I used the scripts from this repository for training and model export, with the following configuration:

config.py

from easydict import EasyDict as edict

__C                             = edict()
# Consumers can get config by: from config import cfg

cfg                             = __C

# YOLO options
__C.YOLO                        = edict()

# Set the class name
__C.YOLO.CLASSES                = "./data/classes/my_classes.names"
__C.YOLO.ANCHORS                = "./data/anchors/coco_anchors.txt"
__C.YOLO.MOVING_AVE_DECAY       = 0.9995
__C.YOLO.STRIDES                = [8, 16, 32]
__C.YOLO.ANCHOR_PER_SCALE       = 3
__C.YOLO.IOU_LOSS_THRESH        = 0.5
__C.YOLO.UPSAMPLE_METHOD        = "resize"
__C.YOLO.ORIGINAL_WEIGHT        = "./checkpoint/my_best_training_checkpoint.ckpt"
__C.YOLO.DEMO_WEIGHT            = "./checkpoint/my_output.ckpt"

# Train options
__C.TRAIN                       = edict()

__C.TRAIN.ANNOT_PATH            = "./data/dataset/train_labels.txt"
__C.TRAIN.BATCH_SIZE            = 6
__C.TRAIN.INPUT_SIZE            = [320, 352, 384, 416, 448, 480, 512, 544, 576, 608]
__C.TRAIN.DATA_AUG              = True
__C.TRAIN.LEARN_RATE_INIT       = 1e-4
__C.TRAIN.LEARN_RATE_END        = 1e-6
__C.TRAIN.WARMUP_EPOCHS         = 2
__C.TRAIN.FISRT_STAGE_EPOCHS    = 20
__C.TRAIN.SECOND_STAGE_EPOCHS   = 30
__C.TRAIN.INITIAL_WEIGHT        = "./checkpoint/yolov3_coco_demo.ckpt"

# TEST options
__C.TEST                        = edict()

__C.TEST.ANNOT_PATH             = "./data/dataset/test_labels.txt" 
__C.TEST.BATCH_SIZE             = 2
__C.TEST.INPUT_SIZE             = 544
__C.TEST.DATA_AUG               = False
__C.TEST.WRITE_IMAGE            = True
__C.TEST.WRITE_IMAGE_PATH       = "./data/detection/"
__C.TEST.WRITE_IMAGE_SHOW_LABEL = False
__C.TEST.WEIGHT_FILE            = "./checkpoint/yolov3_test_loss=9.2099.ckpt-5"
__C.TEST.SHOW_LABEL             = False
__C.TEST.SCORE_THRESHOLD        = 0.3
__C.TEST.IOU_THRESHOLD          = 0.45

my_classes.names

car
truck
motorcycle
bicycle
bus

Training, convert_weights, and the freeze_graph export were all successful, and I received a my_yolov3.pb file, which I then used in my video detector.

But the detection results are really bad, definitely worse than with the initial pretrained model. The detector cannot find any objects.

Is it possible that such poor results are caused by reducing the number of classes from 80 to 5? Does the configuration need some changes? Or are there any ideas about what went wrong?
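For context on the class-count question: in a standard YOLOv3 head, each anchor predicts 4 box coordinates, 1 objectness score, and one score per class, so the number of output channels per scale depends on the number of classes. The sketch below only illustrates that arithmetic; `head_channels` and `read_class_names` are hypothetical helper names, not functions from this repository, and the sketch assumes the standard head layout rather than anything verified against this code base.

```python
# Sketch: output channels of one YOLOv3 detection scale (assumes the standard head layout).
ANCHOR_PER_SCALE = 3  # same value as __C.YOLO.ANCHOR_PER_SCALE in the config above


def head_channels(num_classes, anchors_per_scale=ANCHOR_PER_SCALE):
    """Each anchor predicts 4 box coordinates, 1 objectness score, and one score per class."""
    return anchors_per_scale * (5 + num_classes)


def read_class_names(path):
    """Read one class name per line, skipping blank lines (hypothetical helper)."""
    with open(path) as f:
        return [line.strip() for line in f if line.strip()]


coco_channels = head_channels(80)   # 80-class COCO model
custom_channels = head_channels(5)  # 5-class vehicle model
print(coco_channels, custom_channels)
```

Because the two values differ, the final prediction layers of an 80-class checkpoint do not have the same shape as those of a 5-class model, so those layers generally cannot be restored from the pretrained weights as they are. It is also worth confirming that `len(read_class_names("./data/classes/my_classes.names"))` matches the class indices used in the annotation file.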

shoutOutYangJie commented 5 years ago

During training, has the test loss ever been NaN?

Bienqq commented 5 years ago

No, the training itself ran without problems. I tried again and the results are quite good, so it's possible that I made some mistake earlier.

After completing the training, I had many checkpoint files, one generated after each epoch. Which one is the best? The one where total_loss is smallest, or the latest one?
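One common heuristic is to pick the checkpoint with the lowest loss on the test set rather than simply the latest one, although a lower loss does not guarantee better detection quality. The sketch below assumes checkpoint files are named like the TEST.WEIGHT_FILE example in the config ("yolov3_test_loss=9.2099.ckpt-5", i.e. test loss followed by the epoch number); `best_checkpoint` is a hypothetical helper, not part of this repository.

```python
import re

# Assumed filename pattern, based on the TEST.WEIGHT_FILE example in the config:
# "yolov3_test_loss=9.2099.ckpt-5" -> test loss 9.2099, epoch 5.
CKPT_PATTERN = re.compile(
    r"yolov3_test_loss=(?P<loss>\d+(?:\.\d+)?)\.ckpt-(?P<epoch>\d+)"
)


def best_checkpoint(filenames):
    """Return (prefix, loss, epoch) for the lowest test loss, or None if nothing matches."""
    candidates = {}
    for name in filenames:
        match = CKPT_PATTERN.search(name)
        if match:
            # The .index/.meta/.data-* files of one checkpoint share a prefix,
            # so keying on the matched prefix deduplicates them.
            candidates[match.group(0)] = (
                float(match.group("loss")),
                int(match.group("epoch")),
            )
    if not candidates:
        return None
    # Tuples compare by loss first, then by epoch, so ties go to the earlier epoch.
    prefix, (loss, epoch) = min(candidates.items(), key=lambda item: item[1])
    return prefix, loss, epoch
```

In practice the list of names could come from `os.listdir("./checkpoint")`, and the returned prefix is the path that would be given to TEST.WEIGHT_FILE. Evaluating a few of the lowest-loss checkpoints on held-out images is a more reliable check than the loss value alone.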

CNUyue commented 5 years ago

I used my trained model to generate .pb files (using convert_weights). The following error occurs. Do you know what is going on? Thanks.

Traceback (most recent call last):
  File "/home/lmy/Downloads/tensorflow-yolov3-master-kitti-dense/convert_weight.py", line 55, in <module>
    raise RuntimeError
RuntimeError

16534165 commented 4 years ago

I used my trained model to generate .pb files (using convert_weights). The following error occurs. Do you know what is going on? Thanks.

Traceback (most recent call last):
  File "/home/lmy/Downloads/tensorflow-yolov3-master-kitti-dense/convert_weight.py", line 55, in <module>
    raise RuntimeError
RuntimeError

Same here. How can this be fixed?

YunYang1994 / tensorflow-yolov3

Bad training results after yolov3 transfer learning with reduced coco2017 dataset #202