pierluigiferrari / ssd_keras

A Keras port of Single Shot MultiBox Detector
Apache License 2.0

Questions related to training on a custom dataset #49

Closed adamuas closed 6 years ago

adamuas commented 6 years ago

Hi,

I managed to train a model on a custom dataset, but I have run into an issue where the model sometimes makes no prediction at all for a logo, even when there is one in the image. I decided to measure this and called it "hit rate", a hit being any prediction, regardless of whether it's correct. On my test set the hit rate is quite low: the model makes a prediction on only 13.48% of the images.
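For what it's worth, the "hit rate" described above can be sketched like this (the function and variable names are hypothetical, not from this repo): the fraction of images on which the detector returns at least one box, correct or not.

```python
def hit_rate(predictions_per_image):
    """Fraction of images with at least one predicted box.

    predictions_per_image: a list with one list of predicted boxes per image
    (empty list = the model predicted nothing for that image).
    """
    hits = sum(1 for preds in predictions_per_image if len(preds) > 0)
    return hits / len(predictions_per_image)

# e.g. 8 test images, detections on only 2 of them:
rate = hit_rate([['box'], [], [], ['box'], [], [], [], []])
print(rate)  # 0.25
```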

At the moment I have set it aside to work on other things while I think about what the potential problems could be. I wanted to ask whether you have experienced something similar.

My suspicions (ordered from most to least probable):

Thanks in advance.

pierluigiferrari commented 6 years ago

I have very limited information about your dataset, how much you trained the model, or even which model you're trying to train to begin with, so it could be many things. One possible reason, if it's not making a lot of confident predictions, is simply that it hasn't been trained enough. I see this all the time when I train a model from scratch: after the first couple of hundred or thousand training steps, the model predicts almost nothing with high confidence (except background), so after confidence thresholding you're left with no predictions at all. Then it starts getting better and better, first occasionally making a correct detection here and there on easy objects, then slowly detecting harder objects.
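The confidence-thresholding step mentioned above can be sketched as follows. The array layout here is illustrative, not this repo's exact decoder output: each row is `[class_id, confidence, xmin, ymin, xmax, ymax]`, and any row whose confidence is below the threshold is discarded, which is why an under-trained model can yield zero final detections.

```python
import numpy as np

def threshold_detections(decoded, conf_thresh=0.5):
    """Keep only decoded boxes whose confidence (column 1) meets the threshold."""
    decoded = np.asarray(decoded, dtype=float)
    return decoded[decoded[:, 1] >= conf_thresh]

dets = [[1, 0.92, 10, 10, 50, 50],
        [3, 0.07, 20, 30, 40, 60],   # early in training, most rows look like this
        [2, 0.15,  5,  5, 25, 25]]

print(threshold_detections(dets).shape)  # (1, 6): only the confident box survives
```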

As for your own conjectures:

adamuas commented 6 years ago

Sorry for the late response.

I am training the SSD300 model with 13 classes, with roughly 150 images per class for training (the rest is my test set, i.e. at least 50 images per class). I set up early stopping with a patience of 100 and a min_delta of 0.001 to avoid it stopping too early. Because I had limited training data, I used the training data plus noise as my validation data (the noise was introduced by the image augmentations).

VGG16BASE_FREEZE = ['input_1', 'conv1_1', 'conv1_2', 'pool1',
                    'conv2_1', 'conv2_2', 'pool2',
                    'conv3_1', 'conv3_2', 'conv3_3', 'pool3',
                    'conv4_1', 'conv4_2', 'conv4_3', 'pool4',
                    'conv5_1', 'conv5_2', 'conv5_3', 'pool5']
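(For context, a freeze list like this is applied by flipping each layer's `trainable` flag and then recompiling; the sketch below assumes a Keras `model` object, which is not shown here.)

```python
def freeze_base(model, freeze_names=None):
    """Set trainable=False on every layer whose name is in freeze_names.

    `model` is assumed to be a Keras model whose layer names match the
    VGG16BASE_FREEZE list above; all other layers stay trainable.
    """
    if freeze_names is None:
        freeze_names = []
    for layer in model.layers:
        layer.trainable = layer.name not in freeze_names
    return model

# After calling freeze_base(model, VGG16BASE_FREEZE), recompile
# (model.compile(...)) so the changed trainable flags take effect.
```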

How many epochs do you reckon I should train for, in your experience?

pierluigiferrari commented 6 years ago

One suggestion would be to load the weights of one of the fully trained SSD300 models rather than starting to train with only the trained VGG16 weights. See my first reply to #50 for how to circumvent the problem that the number of classes in your dataset (13) differs from the number of classes of the trained models (20 for Pascal VOC, 80 for MS COCO, or 200 for ImageNet).

I don't know what your logo images look like, but I assume they are very different from any of the object categories in Pascal VOC, MS COCO, or ImageNet. Nonetheless, it's probably fair to assume that any trained weights are always a better starting point to fine-tune the model on your dataset than randomly initialized weights, even if your objects of interest are very different from the objects the models were trained on. Loading trained model weights would likely improve your results tremendously and save you a lot of training time.

It's hard to say how many training steps (let's use training steps as the metric rather than epochs) you would need until you get half-decent results if you start out with only the VGG16 weights, but my best guess would be in the ballpark of a few tens of thousands.

But once again, I would recommend starting out by fine-tuning one of the fully trained models. Sub-sampling the weight tensors of the classification predictor layers sounds more tedious than it is; at the end of the day it's just a bit of NumPy slicing. Alternatively, just changing the names of the classification predictor layers would be the really easy (and slightly worse) way.
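The NumPy slicing in question can be sketched roughly like this (the shapes and the choice of which trained classes to keep are illustrative assumptions, not this repo's exact layout): a classification predictor's last kernel axis holds `n_boxes * n_classes` columns, so you keep only the columns for the classes you want, repeated once per box.

```python
import numpy as np

# Hypothetical shapes for one SSD300 classification predictor layer
# trained on Pascal VOC: 20 classes + 1 background, 4 boxes per cell.
n_boxes = 4
n_classes_trained = 21   # 20 Pascal VOC classes + background
n_classes_new = 14       # 13 logo classes + background

kernel = np.random.randn(3, 3, 512, n_boxes * n_classes_trained)
bias = np.random.randn(n_boxes * n_classes_trained)

# Choose which trained class indices to keep (index 0 is background;
# the remaining 13 are picked arbitrarily here for illustration).
keep = np.arange(n_classes_new)

# The last axis is laid out as [box0: class0..class20, box1: class0.., ...],
# so offset the kept indices by one class-block per box.
cols = np.concatenate([keep + b * n_classes_trained for b in range(n_boxes)])

sub_kernel = kernel[..., cols]
sub_bias = bias[cols]

print(sub_kernel.shape)  # (3, 3, 512, 56) -> 4 boxes * 14 classes
print(sub_bias.shape)    # (56,)
```

The sub-sampled tensors would then be set on a freshly built 13-class model's predictor layers; the linked weight_sampling_tutorial.ipynb walks through the real procedure.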

adamuas commented 6 years ago

Thanks, appreciate this!

I will give it a try with the ImageNet weights as a starting point.

pierluigiferrari commented 6 years ago

Yeah, the ImageNet weights will probably be a good starting point. I've created a notebook that does the weight sub-sampling for you:

https://github.com/pierluigiferrari/ssd_keras/blob/master/weight_sampling_tutorial.ipynb

adamuas commented 6 years ago

Thanks a lot @pierluigiferrari, appreciate this :+1: :+1: