naisy / train_ssd_mobilenet

Train ssd_mobilenet of the Tensorflow Object Detection API with your own data.
MIT License

Problems training a custom dataset #2

Open saihv opened 6 years ago

saihv commented 6 years ago

Hello,

Thanks for this detailed set of instructions! I tried to follow them for training a custom dataset of mine, but I was running into some problems, so I was hoping you might have suggestions.

My dataset has 12 classes and was originally used with YOLO, so I used this converter to create the train and validation TFRecord files. (Note that the bounding boxes came with the dataset, so I did not hand-label them.) Once that was done, I followed the instructions in this repo to start training. I made sure that my label text was correct, with class IDs starting from 1, etc. As in this repo, I am using ssd-mobilenet-v1, and I changed the config file as needed. I was working with TensorFlow 1.4.1 and v1.5 of the Object Detection API.
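For readers checking a similar conversion, here is a minimal sketch of the arithmetic involved. It assumes the usual YOLO label line layout (`class cx cy w h`, all normalized to the image size, with 0-based class IDs) and produces normalized corner coordinates with 1-based class IDs, which is what the Object Detection API expects. The function name `yolo_to_tf_box` and the returned dictionary keys are hypothetical and are not taken from the converter mentioned above.

```python
def yolo_to_tf_box(line, class_id_offset=1):
    """Convert one YOLO label line into normalized corner coordinates.

    Assumes the line is 'class cx cy w h' with values normalized to [0, 1]
    and a 0-based class index. The offset shifts the class ID so that it
    starts from 1 (ID 0 is reserved for background).
    """
    parts = line.split()
    class_id = int(parts[0])
    cx, cy, w, h = (float(v) for v in parts[1:5])

    # Center/size -> corners, clamped so boxes never leave the image.
    xmin = max(0.0, cx - w / 2.0)
    xmax = min(1.0, cx + w / 2.0)
    ymin = max(0.0, cy - h / 2.0)
    ymax = min(1.0, cy + h / 2.0)

    return {
        "label": class_id + class_id_offset,
        "xmin": xmin,
        "xmax": xmax,
        "ymin": ymin,
        "ymax": ymax,
    }


if __name__ == "__main__":
    # Hypothetical label line: class 0, box centered in the image.
    print(yolo_to_tf_box("0 0.5 0.5 0.2 0.4"))
```

If the values written to the TFRecord differ from this (for example pixel coordinates instead of normalized ones, or class IDs still starting at 0), that mismatch would be one possible cause of poor detections.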

Nothing seems obviously wrong in the training window, but the network doesn't really learn the objects correctly at all, as is evident from these TensorBoard screenshots:

screenshot from 2018-04-28 08-57-20

screenshot from 2018-04-28 08-58-05

It seems to detect only 2 of the 12 classes somewhat decently, and even then it often confuses the classification, even after ~50k steps. The loss also seems to fluctuate rather than decrease. Would you happen to have any suggestions for fixes or improvements? Thank you!

naisy commented 6 years ago

Hi saihv,

I suggest training up to 500k steps. If the accuracy still does not improve, it may be better to prepare new data.

I think more steps will probably be needed. My 4-class dataset was trained for 204k steps.

For accuracy, the data is the most important factor. However, when you cannot modify the data, the problem can be addressed by training for many more steps.
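Since the data is named as the most important factor, a quick per-class annotation count could be a reasonable first check before committing to a long training run. The sketch below assumes YOLO-style label lines (0-based class index as the first token); the function name `count_classes` is hypothetical and not part of this repo.

```python
from collections import Counter


def count_classes(label_lines, num_classes):
    """Count annotations per class from YOLO-style label lines.

    Assumes each non-empty line starts with a 0-based class index.
    Returns a list where position i holds the count for class i.
    """
    counts = Counter()
    for line in label_lines:
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        counts[int(parts[0])] += 1
    return [counts.get(i, 0) for i in range(num_classes)]


if __name__ == "__main__":
    # Hypothetical label lines, for illustration only.
    lines = [
        "0 0.5 0.5 0.2 0.4",
        "2 0.3 0.3 0.1 0.1",
        "0 0.7 0.6 0.2 0.2",
    ]
    print(count_classes(lines, num_classes=3))
```

A heavily skewed count would suggest that the few well-detected classes simply dominate the dataset, although that is only one possible explanation.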