Model overfitting when training with new data on pre-trained model

PRBonn / lidar-bonnetal

Semantic and Instance Segmentation of LiDAR point clouds for autonomous driving

http://semantic-kitti.org

MIT License


Model overfitting when training with new data on pre-trained model #76

Closed kosmastsk closed 3 years ago

kosmastsk commented 4 years ago

I am using darknet53 as the pre-trained model and I want to adapt it to work on our data without losing its ability to segment the KITTI dataset properly. My data is also in the 64x2048 format, and the training parameters have not changed. I only reduced the batch size so that the model fits on my GPU.

Even though the amount of new data that I provide is significantly smaller than the KITTI dataset, the model immediately overfits to the new data. During training, the training accuracy may reach as high as 99%. For the validation set, I am using one sequence of my data and one sequence of KITTI; the validation accuracy is around 25%, and the predictions show that the segmentation works better on the new data than on KITTI.

How is this normal, given that the new data contains only around 1/40 as many LiDAR frames as the initial dataset? Are there any parameters affecting this?

jbehley commented 3 years ago

Sorry for the late reply. Just to understand your setup:

You train on SemanticKITTI and then you switch to your new data? And then the approach overfits to the new data? My general experience with fine-tuning: with a high learning rate, you will quickly lose the already learned representation. It is generally better to reduce the learning rate.

However, you will also see that the method does not handle different sensor configurations well. Training on one configuration does not transfer well to other sensor configurations, even if you are using the same number of beams. This is because the distribution of the LiDAR points changes, and with it the resulting range image. (This is also one reason why people are interested in domain adaptation or transfer for LiDAR data.)
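
The effect of the learning rate on forgetting can be illustrated with a toy example. This is only a sketch under strong assumptions, not the training code of this repository: the "model" is a single weight, the pre-trained optimum `w_old` and the new-data optimum `w_new` are hypothetical values, and the loss on the new data is a simple quadratic.

```python
def fine_tune(w_pretrained, new_optimum, lr, steps):
    """Plain gradient descent on the toy loss 0.5 * (w - new_optimum) ** 2."""
    w = w_pretrained
    for _ in range(steps):
        grad = w - new_optimum  # derivative of the toy loss w.r.t. w
        w -= lr * grad
    return w


w_old = 0.0  # hypothetical optimum for the original (pre-training) data
w_new = 1.0  # hypothetical optimum for the new data

# Distance from the pre-trained weight after the same number of steps.
drift_high = abs(fine_tune(w_old, w_new, lr=0.5, steps=5) - w_old)
drift_low = abs(fine_tune(w_old, w_new, lr=0.01, steps=5) - w_old)

print(f"drift from pre-trained weight, lr=0.5:  {drift_high:.3f}")
print(f"drift from pre-trained weight, lr=0.01: {drift_low:.3f}")
```

After the same five updates, the high learning rate has moved the weight almost all the way to the new optimum, while the low learning rate has kept it close to the pre-trained value. A real network is far more complex, but the same trade-off applies when fine-tuning.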
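
To make the dependence on the sensor configuration concrete, here is a minimal sketch of a generic spherical projection from a 3D point to a range-image pixel. The function name `range_image_pixel` is hypothetical, and the field-of-view values (3° up, 25° down, typical of the 64-beam sensor used for KITTI, and a symmetric ±15° alternative) are assumptions for illustration, not values taken from this thread.

```python
import math


def range_image_pixel(x, y, z, H=64, W=2048, fov_up_deg=3.0, fov_down_deg=-25.0):
    """Map one 3D point (sensor frame) to (row, col) of an H x W range image."""
    fov_up = math.radians(fov_up_deg)
    fov_down = math.radians(fov_down_deg)
    fov = abs(fov_up) + abs(fov_down)  # total vertical field of view

    depth = math.sqrt(x * x + y * y + z * z)
    yaw = -math.atan2(y, x)       # horizontal angle
    pitch = math.asin(z / depth)  # vertical angle

    col = 0.5 * (yaw / math.pi + 1.0) * W
    row = (1.0 - (pitch + abs(fov_down)) / fov) * H

    # Clamp to valid pixel indices.
    col = min(W - 1, max(0, int(math.floor(col))))
    row = min(H - 1, max(0, int(math.floor(row))))
    return row, col


# The same ground point, 10 m ahead and 1.7 m below the sensor,
# projected with two assumed vertical fields of view.
point = (10.0, 0.0, -1.7)
row_a, col_a = range_image_pixel(*point)  # assumed 3 deg up / 25 deg down
row_b, col_b = range_image_pixel(*point, fov_up_deg=15.0, fov_down_deg=-15.0)
print("rows:", row_a, row_b, "cols:", col_a, col_b)
```

The point lands in the same column but in a different row, so the network sees the ground at a different image height even though both images are 64x2048.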

kosmastsk commented 3 years ago

I am using the pre-trained model that you provide and then I train on my own data. And yes, it then overfits.

After some trials, I realized that the main issue was that my own data is very different from the KITTI data because of the tilted sensor placement on the robot. If I un-tilt the data, even the pre-trained model works much better. So I think that when the new data is completely different, it's normal for the model to overfit to it and forget the old data that it knew.

Right now, I am training on the downsampled KITTI dataset and my own (un-tilted) data together, and the validation accuracy looks good so far.
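
A minimal sketch of such an un-tilting step, assuming the tilt is a pure, known pitch about the sensor's y-axis. The function names, the 15° angle, and the sign convention are assumptions for illustration; the actual mounting geometry is not given in this thread.

```python
import math


def rotate_about_y(points, angle_deg):
    """Rotate (x, y, z) points about the y-axis by angle_deg."""
    a = math.radians(angle_deg)
    c, s = math.cos(a), math.sin(a)
    return [(c * x + s * z, y, -s * x + c * z) for x, y, z in points]


def untilt(points, mount_pitch_deg):
    """Undo a known mounting pitch by applying the inverse rotation."""
    return rotate_about_y(points, -mount_pitch_deg)


# Two ground points at the same height in a level sensor frame.
level = [(10.0, 0.0, -1.7), (5.0, 2.0, -1.7)]

# Simulate a tilted scan (assumed pitch of 15 degrees), then undo the tilt.
tilted = rotate_about_y(level, 15.0)
restored = untilt(tilted, 15.0)

print("tilted z values:  ", [round(p[2], 3) for p in tilted])
print("restored z values:", [round(p[2], 3) for p in restored])
```

In the tilted scan the flat ground appears sloped (the z values differ), while after un-tilting both points are back at the same height. For full scans the same rotation would normally be applied as one matrix product over the whole point array.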
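
One way to build such a combined training set is sketched below. It assumes that "downsampled" means keeping only a subset of the KITTI frames (it could also refer to something else), and the file names, frame counts, and the helper `build_mixed_training_list` are placeholders rather than anything from this repository.

```python
import random


def build_mixed_training_list(kitti_frames, own_frames, kitti_keep_every=10, seed=0):
    """Keep every n-th KITTI frame, add all own frames, and shuffle the result."""
    kitti_subset = kitti_frames[::kitti_keep_every]
    mixed = [("kitti", f) for f in kitti_subset] + [("own", f) for f in own_frames]
    random.Random(seed).shuffle(mixed)  # seeded for reproducibility
    return mixed


# Placeholder frame lists with roughly the 40:1 ratio mentioned in the question.
kitti_frames = [f"kitti_{i:06d}.bin" for i in range(4000)]
own_frames = [f"own_{i:06d}.bin" for i in range(100)]

mixed = build_mixed_training_list(kitti_frames, own_frames)
n_kitti = sum(1 for source, _ in mixed if source == "kitti")
n_own = len(mixed) - n_kitti
print(f"{n_kitti} KITTI frames, {n_own} own frames")
```

Subsampling the larger dataset brings the two domains closer to balance, so each training epoch still revisits KITTI while giving the new data a meaningful share of the updates.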

jbehley commented 3 years ago

Thanks for your insight on the problem. Does this resolve your issue?

kosmastsk commented 3 years ago

Yes, I'm closing this issue! Thank you @jbehley