aloyschen / tensorflow-yolo3

tensorflow implementation of yolov3
143 stars, 58 forks

Converged YOLO loss #13

Closed (viplix3 closed this issue 5 years ago)

viplix3 commented 5 years ago

@aloyschen I am trying to implement YOLOv3 using some ideas and modules from your code, but I can't get the loss of my model to converge. My training loss hovers around 10 even after 100 epochs on a dataset of 200 raccoon images. I have dissected the model so that it contains only 2 scales, and I am using the pre-trained darknet-53 weights with no optimization running over the feature extractor.

I was wondering which dataset you trained the model on, how many epochs you ran, what the training loss looked like, and any other details of the run in which your model converged and started giving reasonable predictions.

All the training details are provided in the following cfg:

num_parallel_calls = 4
input_shape = 416
max_boxes = 20
jitter = 0.3
hue = 0.1
sat = 1.0
cont = 0.8
bri = 0.1
norm_decay = 0.99
weight_decay = 5e-4
norm_epsilon = 1e-4
pre_train = True
train_last_layers_only = False
num_anchors = 6
num_classes = 1
training = True
disect = True
disect_scale = 1
ignore_thresh = .5
learning_rate = 1e-4
train_batch_size = 10
val_batch_size = 4
# train_num = 4761
# val_num = 250
train_num = 190
val_num = 10
Epoch = 200
obj_threshold = 0.3
nms_threshold = 0.5
gpu_index = "0"
log_dir = './logs'
data_dir = './dataset/'
model_dir = './converted/'
yolov3_cfg_path = './darknet_data/yolov3.cfg'
yolov3_weights_path = './darknet_data/yolov3.weights'
darknet53_weights_path = './darknet_data/darknet53.weights'
anchors_path = './yolo_anchors.txt'
classes_path = './model_data/raccoon_classes.txt'
train_annotations_file = './train.txt'
val_annotations_file = './val.txt'
output_dir = './tfrecords/'
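For a rough sense of scale, the cfg above implies the following training budget. This is only a sketch in plain Python: the helper name `training_budget` is hypothetical, the values are copied from the cfg, `num_scales=2` comes from the description of the dissected model, and the even split of anchors across scales and the standard YOLOv3 head layout (4 box coordinates + 1 objectness + class scores per anchor) are assumptions rather than something the cfg states.

```python
import math

# Values copied from the cfg above.
cfg = {
    "num_anchors": 6,
    "num_classes": 1,
    "train_num": 190,
    "val_num": 10,
    "train_batch_size": 10,
    "val_batch_size": 4,
    "Epoch": 200,
}


def training_budget(cfg, num_scales=2):
    """Derive step counts and head sizes implied by the cfg (hypothetical helper)."""
    # Number of optimizer updates in one pass over the training set.
    steps_per_epoch = math.ceil(cfg["train_num"] / cfg["train_batch_size"])
    # Number of batches needed to cover the validation set once.
    val_steps = math.ceil(cfg["val_num"] / cfg["val_batch_size"])
    # Assumption: anchors are divided evenly between the detection scales.
    anchors_per_scale = cfg["num_anchors"] // num_scales
    # Assumption: standard YOLOv3 head, each anchor predicts
    # 4 box coordinates + 1 objectness score + one score per class.
    channels_per_scale = anchors_per_scale * (5 + cfg["num_classes"])
    return {
        "steps_per_epoch": steps_per_epoch,
        "val_steps": val_steps,
        "total_steps": steps_per_epoch * cfg["Epoch"],
        "anchors_per_scale": anchors_per_scale,
        "channels_per_scale": channels_per_scale,
    }


if __name__ == "__main__":
    for name, value in training_budget(cfg).items():
        print(name, "=", value)
```

With 190 training images and a batch size of 10, one epoch is only 19 updates, so 100 epochs amounts to about 1,900 updates at a learning rate of 1e-4.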

TensorBoard screenshots are attached below:

screenshot_20181129_144324

screenshot_20181129_144318

viplix3 commented 5 years ago

Solved it myself; see https://github.com/viplix3/YOLO for the result. Closing.