Peilin-Yang opened this issue 7 years ago
Hi, thanks :) Took me a bit of time to make it, but I guess it is worth it!
The training script is not entirely stable. You may want to try changing the loss function in `ssd_vgg_300.py` a bit, in particular the `alpha` and `negative_ratio` parameters.
Did you use a pre-trained checkpoint for training or did you train from scratch? I guess in the latter case, it may be quite hard to converge. I just added to the readme a small description of how to fine-tune using VGG weights. I hope it can help you.
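The key idea when fine-tuning from VGG-16 weights is to restore only the backbone variables and leave the new SSD box/class layers randomly initialized. Here is a minimal sketch of that name filtering, with hypothetical variable names and exclusion tokens (not the repo's actual restore code, which uses the training script's checkpoint flags):

```python
def restorable_variables(var_names, checkpoint_scope='vgg_16',
                         model_scope='ssd_300_vgg',
                         exclude=('_box', 'block8', 'block9', 'block10', 'block11')):
    """Map model variable names to checkpoint names, skipping the
    SSD-specific layers that do not exist in the VGG-16 checkpoint."""
    mapping = {}
    for name in var_names:
        if any(token in name for token in exclude):
            continue  # new SSD layer: left to train from scratch
        mapping[name.replace(model_scope, checkpoint_scope)] = name
    return mapping

names = ['ssd_300_vgg/conv1/conv1_1/weights',
         'ssd_300_vgg/block10_box/conv_cls/weights',
         'ssd_300_vgg/block8/conv1x1/weights']
print(restorable_variables(names))
# {'vgg_16/conv1/conv1_1/weights': 'ssd_300_vgg/conv1/conv1_1/weights'}
```

Only the backbone variable survives the filter; the SSD extra blocks and box-prediction layers are excluded, which is exactly why their shapes never have to match the VGG checkpoint.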
Thanks for the additional info!
I guess for the checkpoint file I need to name it as `ssd_vgg_300`?
Do you mean rename it? You just need to download the checkpoint of the VGG-16 model and use it in the training command. Hopefully it should work!
Hey, thanks for sharing this implementation!
I have a few questions.
Have you trained SSD300 on VOC data initialized from the VGG16 feature extractor in your implementation? The parameters in the checkpoint look like a converted version of the Caffe parameters (correct me if I'm wrong). I'm running a simple experiment: SSD300 on the VOC2012 dataset, initialized from `ssd_300_vgg.ckpt`, and after 100 training steps the prediction results are worse than those I get from the `ssd_300_vgg.ckpt` checkpoint.
Thanks.
Hello,
Yes, the checkpoints are directly converted from the Caffe implementation. The training script is not yet as advanced as the latter, which explains your results (I got that too). I'll try to investigate a bit more how to improve it.
Tell me if you have any ideas on how to make it better!
Hi, thanks for the answer.
I have a few questions about your code. In `nets.ssd_common.tf_ssd_bboxes_encode_layer`:

```python
feat_cy = (feat_cy - yref) / href / prior_scaling[0]
feat_cx = (feat_cx - xref) / wref / prior_scaling[1]
feat_h = tf.log(feat_h / href) / prior_scaling[2]
feat_w = tf.log(feat_w / wref) / prior_scaling[3]
```
Why is it necessary to scale the values in this way?
And the `feat_scores > -0.5` condition below filters out the `no_label` class, right?
```python
mask = tf.greater(jaccard, feat_scores)
# mask = tf.logical_and(mask, tf.greater(jaccard, matching_threshold))
mask = tf.logical_and(mask, feat_scores > -0.5)
mask = tf.logical_and(mask, label < num_classes)
imask = tf.cast(mask, tf.int64)
fmask = tf.cast(mask, dtype)
```
Thanks for the thoughts and advice.
Hi! For the scaling, the idea is to scale the values so that all error terms (classification + position + size) have roughly the same magnitude. Otherwise, the training would tend to over-optimise one component and neglect the others.
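As a concrete sketch of that encoding (assuming the common SSD default `prior_scaling = [0.1, 0.1, 0.2, 0.2]`; the anchor reference values here are made up), the encode/decode pair is an exact inverse:

```python
import numpy as np

def encode_box(cy, cx, h, w, yref, xref, href, wref,
               prior_scaling=(0.1, 0.1, 0.2, 0.2)):
    # Offsets are normalized by the anchor size, then divided by the
    # prior scaling so each regression target has a similar magnitude.
    return np.array([
        (cy - yref) / href / prior_scaling[0],
        (cx - xref) / wref / prior_scaling[1],
        np.log(h / href) / prior_scaling[2],
        np.log(w / wref) / prior_scaling[3],
    ])

def decode_box(t, yref, xref, href, wref,
               prior_scaling=(0.1, 0.1, 0.2, 0.2)):
    # Exact inverse of encode_box, as done at prediction time.
    cy = t[0] * prior_scaling[0] * href + yref
    cx = t[1] * prior_scaling[1] * wref + xref
    h = href * np.exp(t[2] * prior_scaling[2])
    w = wref * np.exp(t[3] * prior_scaling[3])
    return cy, cx, h, w

t = encode_box(0.52, 0.48, 0.30, 0.20, 0.50, 0.50, 0.25, 0.25)
print(np.allclose(decode_box(t, 0.50, 0.50, 0.25, 0.25),
                  (0.52, 0.48, 0.30, 0.20)))  # True
```

Without the division by `prior_scaling`, the raw centre offsets would typically be an order of magnitude smaller than the log-size terms, and the localization loss would be dominated by the size components.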
Exactly, the negative values are used to mark the anchors with no annotations. The idea comes from the KITTI dataset, where some parts of the images are marked as not labelled: there may be a car/person/... in these parts, but it has not been segmented. If you don't keep track of these parts, you may end up with the SSD model detecting objects that are not annotated, and the loss function treating them as false positives and pushing the model not to detect them. Which is not really what we want! So basically, I set up a mask such that the loss function ignores the anchors which overlap too much with non-annotated parts of the images. Hope it is a bit clearer! I guess I should add a bit of documentation about that!
Hey! Thanks for the explanation!
The no-annotation label is used to avoid false positives on those image regions during training. It's clear now.
I'm working on training SSD with empty frames, containing background only. As far as I understand, it should be sufficient to supply empty lists of labels and bboxes, which should result in a contribution to the loss only from the negative cross-entropy part.
Have you tried training in this setting?
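That reasoning can be sketched numerically. The following is not the repo's actual loss code, just a toy version of the three SSD loss terms (the `negative_ratio` default and the minimum negative count are assumptions): with an empty frame, every anchor is background, so the positive cross-entropy and localization terms vanish and only the hard-negative term remains.

```python
import numpy as np

def ssd_loss_terms(logits, localisations, gclasses, glocalisations,
                   negative_ratio=3.0):
    # Positive mask: anchors matched to a ground-truth object (class > 0).
    pmask = gclasses > 0
    n_pos = pmask.sum()
    probs = np.exp(logits) / np.exp(logits).sum(-1, keepdims=True)
    xent = -np.log(probs[np.arange(len(gclasses)), gclasses])
    loss_pos = xent[pmask].sum()
    # Hard negative mining: keep the worst negatives, up to ratio * n_pos
    # (a small floor so empty frames still contribute some negatives).
    n_neg = int(max(negative_ratio * n_pos, 4))
    loss_neg = np.sort(xent[~pmask])[::-1][:n_neg].sum()
    # L1 localization loss, only over positive anchors.
    loss_loc = np.abs((localisations - glocalisations)[pmask]).sum()
    return loss_pos, loss_neg, loss_loc

# An "empty" frame: every anchor is background (class 0).
rng = np.random.default_rng(0)
logits = rng.normal(size=(8, 3))
locs = rng.normal(size=(8, 4))
pos, neg, loc = ssd_loss_terms(logits, locs,
                               np.zeros(8, dtype=int), np.zeros((8, 4)))
print(pos, loc)  # 0.0 0.0 -- only the negative term is non-zero
```

So yes, empty label/bbox lists should push gradients only through the background cross-entropy, provided the pipeline tolerates zero-length annotation lists.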
@Peilin-Yang I trained my own model on the WIDER FACE dataset. It only contains 2 classes: background (0) and the face object (1). But I have a big problem and I don't know why. How did you modify the code? Thanks!
@chenweiqian Hi, honestly I had a similar result to yours :(
@Peilin-Yang what's your loss? My loss is always above 5 and mAP is close to zero. Is my dataset wrong?
@chenweiqian Yeah...I think mine was similar to yours. Sorry I do not know how to help on this issue since I am not an expert.
@Peilin-Yang Thanks for your answer!
I have similar errors to @chenweiqian and @Peilin-Yang when training on the KITTI dataset (importing the KITTI interface written by @balancap from SDC-Vehicle-Detection/datasets/kitti*).
The mAP I get evaluating with `eval_ssd_network.py` on KITTI is ~31%.
Low prediction scores and poor localization can be seen in this sample image:
How did you come up with the prior scaling values?
@Peilin-Yang I have an issue similar to yours:
1. Only two classes: background (0) and my object (1); my object is very small (on average 5% of the whole image).
2. My loss cannot converge; it stays above 4.0.
Have you solved it? Can you tell me how?
@seasonyang @Peilin-Yang @balancap This SSD project is so good, I have learned a lot from it. However, I met some problems when training on my own dataset. I also have 2 classes: one is "ID number", the other is "Name". I changed the tags in `ssd_common` and wrote my own dataset description Python script in /dataset. Everything goes smoothly if I don't change `num_classes`; at least the code can run. However, if I change `num_classes` to 2, an error pops up claiming "something wrong about the tensor shape". Here is part of what I get:
```
Caused by op u'save_1/Assign_4', defined at:
  File "train_ssd_network.py", line 402, in
InvalidArgumentError (see above for traceback): Assign requires shapes of both tensors to match. lhs shape= [16] rhs shape= [84]
[[Node: save_1/Assign_4 = Assign[T=DT_FLOAT, _class=["loc:@ssd_300_vgg/block10_box/conv_cls/biases"], use_locking=true, validate_shape=true, _device="/job:localhost/replica:0/task:0/cpu:0"](ssd_300_vgg/block10_box/conv_cls/biases, save_1/RestoreV2_4)]]
```
Would you please tell me what is going on? Or just tell me how you made it work. Thank you so much!
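For what it's worth, the two shapes in that error are consistent with the class-prediction layer sizes. Assuming the standard SSD300 anchor layout, `block10` has 4 anchor boxes per feature-map cell, so its `conv_cls` layer has `4 * num_classes` output channels (and biases):

```python
# block10 of ssd_300_vgg: 4 anchor boxes per feature-map cell.
anchors_block10 = 4
voc_classes = 21   # 20 VOC object classes + background
my_classes = 4     # a model rebuilt with num_classes = 4

print(anchors_block10 * voc_classes)  # 84 -> rhs shape (VOC checkpoint)
print(anchors_block10 * my_classes)   # 16 -> lhs shape (rebuilt model)
```

So the restore fails because the VOC-trained checkpoint's class layers (21 classes) are being loaded into a graph built with a different `num_classes`; those layers need to be excluded from the restore or retrained.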
@HoracceFeng I have the same problem... Have you resolved it already?
@oowe @HoracceFeng i have the same problem too. have you found a solution?
@mosab-r @oowe Hi, the problem has been solved. In this code, the background should also be counted as a class, which means that if you have 4 object classes in total, you should set num_classes = 5, not 4.
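A tiny helper capturing that convention (illustrative only; the class names are made up):

```python
def ssd_num_classes(object_class_names):
    # SSD reserves class 0 for the background, so the network's
    # num_classes is the number of real object classes plus one.
    return len(object_class_names) + 1

print(ssd_num_classes(['ID number', 'Name']))                  # 3
print(ssd_num_classes(['car', 'person', 'cyclist', 'truck']))  # 5
```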
@HoracceFeng @oowe @mosab-r @seasonyang I have the same issue when training only one class. When I set num_classes to 2, the model does not seem to be learning. I trained on two of my own datasets and on VOC07, from which I selected only the person class.
At first I set lr = 0.001; the loss drops rapidly in the first 2k steps, then literally stops. So I tried raising lr to 0.01, 0.1, 0.5, 0.9, and 0.99. The loss then starts dropping again, but looking at the histograms, the weights tend to 0, and mAP stays at 0.
Please let me know if you guys solve the problem.
I am currently training an adapted version. I rewrote SSD_tensorflow_VOC by @LevinJ, which he adapted from this repo, into modular_SSD_tensorflow. I am currently training it on VOC0712 trainval and hope to fine-tune it for my purpose. The current problem is that it only utilizes one GPU for training. I haven't tried training directly on one class. Please try it out and tell me your results.
@wangsihfu @balancap I have the same question as you. I am new to detection and have read the YOLO and SSD papers. I have understood some parts of this code, but understanding all of it is still difficult for me. For now I just want to try running the code on my own dataset with 2 classes and shape (512, 512).
I want to train the network; how should I do it? Which parts of the code should I change? And how do I make my own tfrecord?
Thanks a lot!
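On the tfrecord question: the repo's own converter scripts are the reference to adapt, but as a rough sketch, a detection example is usually a `tf.train.Example` with the encoded image plus parallel lists of labels and box coordinates. The feature keys below follow the common TF-Slim naming convention and are assumptions, as are the fake bytes and boxes:

```python
import tensorflow as tf

def make_example(image_bytes, labels, bboxes):
    # bboxes: list of (ymin, xmin, ymax, xmax), normalized to [0, 1].
    ymin = [b[0] for b in bboxes]
    xmin = [b[1] for b in bboxes]
    ymax = [b[2] for b in bboxes]
    xmax = [b[3] for b in bboxes]
    feature = {
        'image/encoded': tf.train.Feature(
            bytes_list=tf.train.BytesList(value=[image_bytes])),
        'image/object/bbox/label': tf.train.Feature(
            int64_list=tf.train.Int64List(value=labels)),
        'image/object/bbox/ymin': tf.train.Feature(
            float_list=tf.train.FloatList(value=ymin)),
        'image/object/bbox/xmin': tf.train.Feature(
            float_list=tf.train.FloatList(value=xmin)),
        'image/object/bbox/ymax': tf.train.Feature(
            float_list=tf.train.FloatList(value=ymax)),
        'image/object/bbox/xmax': tf.train.Feature(
            float_list=tf.train.FloatList(value=xmax)),
    }
    return tf.train.Example(features=tf.train.Features(feature=feature))

example = make_example(b'fake-jpeg-bytes', [1, 2],
                       [(0.1, 0.1, 0.5, 0.5), (0.2, 0.3, 0.9, 0.8)])
with tf.io.TFRecordWriter('train_000.tfrecord') as writer:
    writer.write(example.SerializeToString())
```

Note that an empty `labels`/`bboxes` pair is also valid here, which matches the empty-frame discussion earlier in the thread.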
Hi @Peilin-Yang, I would like to apply SSD to detect small objects in images. I am just wondering if you have gotten the code working on detecting small objects? I have tried the method in the following:
https://github.com/balancap/SSD-Tensorflow/issues/222
However, it doesn't seem to work for me. When I visualize the loss in TensorBoard, both the "cross_entropy_positive loss" and "localization loss" stay at zero during the training process. Do you have any suggestions? Appreciate it!
Thank you!
Attached is the loss during my training process:
@hbdong77 , I don't have the solution. Do you have any ideas? Thanks.
@HoracceFeng I find that if you delete the already existing logs, the problem of lhs shape not matching rhs shape is solved.
I have used the SSD MobileNet v1 model for training on a new dataset (oranges). After training, the results are good for oranges, but the model detects apples as oranges. How do I free the other classes? How can I get the accuracy graph and loss graph? And one more thing: I continually get a loss of 2 or 3.
> I have similar errors to @chenweiqian and @Peilin-Yang when training on KITTI dataset (importing the KITTI interface written by @balancap from SDC-Vehicle-Detection/datasets/kitti*). The mAP I get evaluating using eval_ssd_network.py on KITTI is ~31%. Low prediction scores and poor localization can be seen in this sample image:

Can you tell me the versions of Python and TensorFlow? I need a correct and proper environment, thanks.
> Hey! Thanks for explanation! No annotation label used to avoid false positive on the image regions while training. It's clear. I'm working on training SSD with empty frames, covered with background only. And as far as I understand, it would be sufficient to supply empty lists of labels and bboxes. Which should result in contribution to loss only for negative xentropy part. Have you tried to training in this setting?

What environment are you using? Python and TensorFlow versions?
Hi, first, thanks for your hard work implementing SSD on TensorFlow, BIG CREDIT! Now I want to train my own model on another dataset.