rykov8 / ssd_keras

Port of Single Shot MultiBox Detector to Keras
MIT License

Retraining SSD with Udacity's self driving car dataset #94

Open cocoza4 opened 7 years ago

cocoza4 commented 7 years ago

Hi all,

I'm trying to apply this implementation of SSD to Udacity's self-driving car dataset (see https://github.com/udacity/self-driving-car/tree/master/annotations). It turned out that the original weights (weights_SSD300.hdf5) performed better than the ones I got after retraining on the new dataset (with some layers' weights frozen). I'm not sure whether I missed something. Could anyone share their steps and/or point out something I might have missed?

Here's what I did.

  1. Create a ground truth in the same format as gt_pascal.pkl from the new dataset (bounding box coordinates normalized and labels one-hot encoded); see the sketch after this list.
  2. Modify NUM_CLASSES and load the new ground truth accordingly in SSD_training.ipynb.
  3. Train the model.
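
For reference, here is a minimal sketch of how I build the ground-truth pickle in step 1. It assumes gt_pascal.pkl is a dict mapping image filename to an array of rows [xmin, ymin, xmax, ymax, one-hot class] with coordinates normalized to [0, 1], and that the Udacity CSV has already been parsed into (filename, xmin, ymin, xmax, ymax, class_idx) tuples in pixels; `annotations`, the class count and the frame size below are placeholders.

```python
import pickle
import numpy as np

# Placeholders: adjust to the actual Udacity export.
NUM_CLASSES = 3 + 1          # e.g. car, truck, pedestrian + background
IMG_W, IMG_H = 1920, 1200    # frame size of the exported images

gt = {}
for filename, xmin, ymin, xmax, ymax, class_idx in annotations:
    # coordinates normalized to [0, 1]
    box = [xmin / IMG_W, ymin / IMG_H, xmax / IMG_W, ymax / IMG_H]
    one_hot = np.zeros(NUM_CLASSES - 1)   # background is not one-hot encoded
    one_hot[class_idx] = 1.0
    gt.setdefault(filename, []).append(np.hstack([box, one_hot]))

gt = {k: np.asarray(v) for k, v in gt.items()}
with open('gt_udacity.pkl', 'wb') as f:
    pickle.dump(gt, f)
```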

The new model predicted poorly, as shown in the attached image: each car is predicted with one correct, highest-confidence bounding box plus a few variants of that box. I guess these variants come from the prior boxes defined in prior_boxes_ssd300.pkl.

The model with the original weights, however, predicted better (see the second attached image): this time only the correct bounding boxes are predicted and the prior-box variants do not show up. In both cases I filtered out predictions with confidence < 0.7, roughly as in the snippet below.
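
This is roughly how I apply the confidence filter, assuming BBoxUtility.detection_out() returns, per image, rows of [label, confidence, xmin, ymin, xmax, ymax] in relative coordinates (that is how I read ssd_utils.py); `model`, `inputs` and `bbox_util` come from the notebook.

```python
import numpy as np

preds = model.predict(inputs, batch_size=1, verbose=1)
results = bbox_util.detection_out(preds)

for det in results:
    det = np.asarray(det)
    if len(det) == 0:
        continue
    det = det[det[:, 1] >= 0.7]    # drop predictions with confidence < 0.7
    labels = det[:, 0].astype(int)
    confs = det[:, 1]
    boxes = det[:, 2:6]            # [xmin, ymin, xmax, ymax], relative coordinates
```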

The validation error started at 2.56 and gradually decreased to 1.27 by the 30th epoch. I noticed that after the 20th epoch the model started to converge, as the error didn't go down much further.

I also have a few more questions:

  1. In SSD_training.ipynb it seems that not all pre-trained layers are frozen: freeze = ['input_1', 'conv1_1', 'conv1_2', 'pool1', 'conv2_1', 'conv2_2', 'pool2', 'conv3_1', 'conv3_2', 'conv3_3', 'pool3'], while the commented-out entries 'conv4_1', 'conv4_2', 'conv4_3', 'pool4' are ignored. Why aren't conv4_* and conv5_*, for example, frozen as well? (See the first sketch after this list for how I understand the freeze list is applied.)

  2. The preprocessing step in Generator.generate() uses keras.applications.imagenet_utils.preprocess_input to normalize the data. Is this utility made specifically for the ImageNet dataset? (My rough understanding is sketched after this list.)

  3. After reading the documentation, I still don't understand what neg_pos_ratio does in MultiboxLoss(NUM_CLASSES, neg_pos_ratio=2.0). How does it affect training?

  4. Given that the Udacity dataset is huge (almost 5 GB), does it make more sense to train the whole model from scratch (no layers with frozen weights)?
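
Regarding question 1, this is my reading of how the freeze list is applied in the notebook (a sketch; the actual cell may differ slightly): only the listed layers have their weights fixed, so conv4_* and everything above it still gets updated.

```python
# Layers whose pre-trained weights stay fixed during retraining.
freeze = ['input_1', 'conv1_1', 'conv1_2', 'pool1',
          'conv2_1', 'conv2_2', 'pool2',
          'conv3_1', 'conv3_2', 'conv3_3', 'pool3']

for layer in model.layers:
    if layer.name in freeze:
        layer.trainable = False   # exclude these layers from gradient updates
```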
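Regarding question 2, my rough understanding of that utility's default behaviour, for context (the function name preprocess_like_imagenet below is mine, and this is only an approximation): it applies ImageNet-style normalization rather than anything tied to a particular dataset's labels.

```python
import numpy as np

def preprocess_like_imagenet(x):
    # Approximation of imagenet_utils.preprocess_input in its default mode:
    # convert RGB -> BGR, then subtract the ImageNet channel means.
    x = x[..., ::-1].astype('float64')            # RGB -> BGR
    x -= np.array([103.939, 116.779, 123.68])     # ImageNet means (BGR order)
    return x
```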

Thanks, Peeranat F.

AloshkaD commented 7 years ago

@cocoza4 hi Peeranat, have you finally figured it out?

cocoza4 commented 7 years ago

Hi @AloshkaD,

Sorry it took me a while to reply. I haven't figured it out. Unfortunately, I've been allocated to a different project now :( so I haven't worked on this for a while. Have you figured it out? I would love to hear from you if you have.

Thanks, Peeranat F.

AloshkaD commented 7 years ago

@cocoza4 No, I haven't! Honestly, I have built my own model that is slower but, I know, more accurate. I know that because we implemented YOLO9000 for a project and an FCN surpassed its accuracy by a huge margin. Let's stay in touch, buddy!