cocoza4 opened this issue 7 years ago
@cocoza4 hi Peeranat, have you finally figured it out?
Hi @AloshkaD,
Sorry it took me some time to reply. I haven't figured it out. Unfortunately, I've been allocated to a different project now :( so I haven't worked on this for a while. Have you figured it out? I would love to hear from you if you have.
thanks Peeranat F.
@cocoza4 No I haven't! Honestly, I built my own model that is slower but, I know, more accurate. I know that because we implemented YOLO9000 for a project and an FCN surpassed its accuracy by a huge margin. Let's stay in touch, buddy!
Hi all,
I'm trying to apply this implementation of SSD to Udacity's self-driving car dataset (see https://github.com/udacity/self-driving-car/tree/master/annotations). It turned out that the original weights (weights_SSD300.hdf5) performed better than the ones I got after retraining (with some layers' weights frozen) on the new dataset. I'm not sure whether I missed something. Could anyone share their steps and/or point out anything I might have missed?
Here's what I did.
The new model predicted poorly, as shown below. Notice that each car in the image is predicted with one correct, highest-confidence bounding box plus a few variants of that box. I guess these come from the prior bounding boxes defined in prior_boxes_ssd300.pkl.
The model with the original weights, however, predicted better: this time only the correct bounding boxes are predicted, and its prior boxes are not shown. In both cases, I filtered out predictions with confidence < 0.7.
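For reference, the confidence filtering I mean is just a threshold on the class score of each post-processed detection. A minimal numpy sketch with made-up detections (the array layout here is hypothetical, not the exact output format of this repo):

```python
import numpy as np

# Hypothetical post-processed detections: each row is
# [class_id, confidence, xmin, ymin, xmax, ymax] in normalized coords.
detections = np.array([
    [1, 0.95, 0.10, 0.40, 0.25, 0.60],  # strong car detection
    [1, 0.72, 0.11, 0.41, 0.26, 0.61],  # near-duplicate variant
    [1, 0.35, 0.50, 0.45, 0.70, 0.65],  # low-confidence box
])

conf_threshold = 0.7
keep = detections[detections[:, 1] >= conf_threshold]
print(len(keep))  # 2 boxes survive the 0.7 cut-off
```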
The validation error started at 2.56 and gradually decreased to 1.27 by the 30th epoch. I noticed that after the 20th epoch the model started to converge, as the error didn't go down much.
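Since the error barely moves after epoch 20, one option is to stop training once improvements fall below a threshold. A minimal, framework-free sketch of that patience logic (the loss curve below is made up for illustration; only the first and last values match my numbers):

```python
def early_stop_epoch(val_losses, patience=5, min_delta=0.01):
    """Return the 0-based epoch at which training would stop: the first
    epoch after `patience` consecutive epochs whose improvement over the
    best loss so far is smaller than `min_delta`."""
    best = float("inf")
    wait = 0
    for epoch, loss in enumerate(val_losses):
        if best - loss > min_delta:
            best = loss
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return len(val_losses) - 1

# Made-up validation losses flattening out, as described above.
val_losses = [2.56, 1.8, 1.4, 1.28, 1.275, 1.273, 1.272, 1.271, 1.27]
print(early_stop_epoch(val_losses, patience=3))  # stops at epoch 6
```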
I also have a few more questions:
In SSD_training.ipynb, it seems that not all pre-trained layers are frozen:

```python
freeze = ['input_1', 'conv1_1', 'conv1_2', 'pool1',
          'conv2_1', 'conv2_2', 'pool2',
          'conv3_1', 'conv3_2', 'conv3_3', 'pool3']#,
#          'conv4_1', 'conv4_2', 'conv4_3', 'pool4']
```

The lines preceded with `#` are ignored. Why aren't conv4_* and conv5_*, for example, frozen as well?
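As far as I understand, freezing in Keras is just a loop over `model.layers` that sets `trainable = False` for the names in the freeze list. A stand-alone sketch of that pattern, using a stand-in layer class so it runs without Keras (the layer names below are a shortened, hypothetical subset):

```python
class Layer:
    """Stand-in for a Keras layer: just a name and a trainable flag."""
    def __init__(self, name):
        self.name = name
        self.trainable = True

freeze = ['input_1', 'conv1_1', 'conv1_2', 'pool1',
          'conv2_1', 'conv2_2', 'pool2',
          'conv3_1', 'conv3_2', 'conv3_3', 'pool3']

# A shortened, hypothetical layer list for illustration.
layers = [Layer(n) for n in
          ['input_1', 'conv1_1', 'conv1_2', 'pool1', 'conv4_1', 'conv4_2']]

# Same pattern the notebook uses: freeze only the listed layers.
for L in layers:
    if L.name in freeze:
        L.trainable = False

print([L.name for L in layers if L.trainable])  # ['conv4_1', 'conv4_2']
```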
The preprocessing step uses a utility, keras.applications.imagenet_utils.preprocess_input, to normalize data in Generator.generate(). Is this utility made specifically for the ImageNet dataset?
After reading the documentation, I still don't know what neg_pos_ratio in MultiboxLoss(NUM_CLASSES, neg_pos_ratio=2.0) does. How does it affect training?
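My current understanding (please correct me): neg_pos_ratio controls hard negative mining in the multibox loss. For every positive (matched) default box, at most neg_pos_ratio negatives, chosen by highest classification loss, contribute to the loss, so the huge number of easy background boxes doesn't swamp training. A numpy sketch of just that selection step, not this repo's actual implementation:

```python
import numpy as np

def select_hard_negatives(conf_loss, is_positive, neg_pos_ratio=2.0):
    """Keep all positive boxes plus the neg_pos_ratio * num_positives
    negatives with the highest classification loss (hard negatives)."""
    num_pos = int(is_positive.sum())
    num_neg = int(neg_pos_ratio * num_pos)
    # Mask out positives so they cannot be picked as negatives.
    neg_losses = np.where(is_positive, -np.inf, conf_loss)
    hard_neg = np.argsort(neg_losses)[::-1][:num_neg]
    keep = is_positive.copy()
    keep[hard_neg] = True
    return keep

# Toy example: 2 positives, 6 negatives, ratio 2.0 -> keep 2 + 4 = 6 boxes.
conf_loss = np.array([0.1, 2.0, 0.5, 3.0, 0.2, 1.5, 0.05, 0.8])
is_positive = np.array([True, False, False, False, True, False, False, False])
keep = select_hard_negatives(conf_loss, is_positive, neg_pos_ratio=2.0)
print(int(keep.sum()))  # 6 boxes contribute to the classification loss
```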
Given that Udacity's dataset is huge (almost 5 GB), does training the whole model from scratch (no layers with frozen weights) make more sense?
thanks Peeranat F.