Performance of SqueezeSeg trained from scratch much lower than performance of the pre-trained model

PRBonn / lidar-bonnetal

Semantic and Instance Segmentation of LiDAR point clouds for autonomous driving

http://semantic-kitti.org

MIT License


Performance of SqueezeSeg trained from scratch much lower than performance of the pre-trained model #51

Closed jedolb closed 3 years ago

jedolb commented 4 years ago

Hello!

I'm trying to train the SqueezeSeg model from scratch using your code and configuration files. I only had to change the batch size to 16 because 32 was too big for my GPU.

But at the end of training, the performance of my network (iou=0.201) is much lower than that of your pre-trained SqueezeSeg (iou=0.305).

Here are my curves at the end of the 150 epochs:

[Figure: training IoU curve (train_iou_our_squeezeseg)]

[Figure: validation IoU curve (valid_iou_our_squeezeseg)]

Were the pre-trained models trained with the same configuration files as the ones in the git repo? Should I adjust some parameters to reach the pre-trained performance?
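For readers comparing the two numbers quoted in this comment, here is a minimal sketch of how a mean IoU is typically obtained from a confusion matrix. It assumes the reported `iou` is the unweighted mean of per-class intersection-over-union, which is the usual definition for semantic segmentation; the function name `mean_iou` and the toy matrix are illustrative and not taken from the repository.

```python
def mean_iou(conf):
    """Mean per-class IoU from a square confusion matrix.

    conf[i][j] is the number of points whose ground-truth class is i
    and whose predicted class is j (illustrative convention).
    """
    n = len(conf)
    ious = []
    for c in range(n):
        tp = conf[c][c]                               # correctly labeled points of class c
        fp = sum(conf[r][c] for r in range(n)) - tp   # predicted c, but another class
        fn = sum(conf[c]) - tp                        # class c, but predicted otherwise
        denom = tp + fp + fn
        if denom == 0:
            continue  # class absent from both prediction and ground truth
        ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else 0.0


if __name__ == "__main__":
    # Toy two-class example with made-up counts.
    toy = [[3, 1],
           [1, 3]]
    print(f"mean IoU = {mean_iou(toy):.3f}")
```

Because every class counts equally in this average, a few rare classes with an IoU near zero can pull the mean down noticeably even when the frequent classes are segmented well.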

tano297 commented 4 years ago

Hi,

The pretrained model was trained with the config file in the pretrained model's log directory, which you can download from this repo. It is called arch_cfg.yaml. I can't guarantee that the result will be exactly the same, because training usually stops and restarts multiple times, so I recommend using at least 3 or 4 times the number of epochs.
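One way to check whether a local configuration matches the arch_cfg.yaml shipped with the pretrained model is to diff the two parsed files key by key. The sketch below is only an illustration: `diff_configs` and `load_cfg` are hypothetical helpers, the keys in the toy dictionaries are made up, and PyYAML is assumed to be installed if `load_cfg` is used.

```python
MISSING = "<missing>"  # placeholder for a key present in only one config


def diff_configs(a, b, prefix=""):
    """Return (key_path, value_a, value_b) for every leaf that differs."""
    diffs = []
    for key in sorted(set(a) | set(b), key=str):
        path = f"{prefix}.{key}" if prefix else str(key)
        va = a.get(key, MISSING)
        vb = b.get(key, MISSING)
        if isinstance(va, dict) and isinstance(vb, dict):
            # Recurse into nested sections of the config.
            diffs.extend(diff_configs(va, vb, path))
        elif va != vb:
            diffs.append((path, va, vb))
    return diffs


def load_cfg(path):
    """Parse a YAML config file into a dict (requires PyYAML)."""
    import yaml  # imported lazily so the diff itself has no dependency
    with open(path) as f:
        return yaml.safe_load(f)


if __name__ == "__main__":
    # Toy dictionaries standing in for two parsed config files;
    # the keys and values here are invented for the example.
    mine = {"train": {"batch_size": 16, "max_epochs": 150}}
    theirs = {"train": {"batch_size": 32, "max_epochs": 150}}
    for path, va, vb in diff_configs(mine, theirs):
        print(f"{path}: {va} != {vb}")
```

With real files, the two dictionaries would come from `load_cfg` applied to the local config and to the arch_cfg.yaml in the downloaded log directory.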

jedolb commented 4 years ago

Thank you for your answer

So I'm going to continue training for many more epochs, and I will report my new results here to see how close I get to the pre-trained network's results.

jedolb commented 4 years ago

Hello,

As you suggested, I continued training my network. Unfortunately, the performance didn't really improve. Maybe I did something wrong, or maybe you have some advice for improving the performance?

Here are my curves after restarting the training a second time:

[Figures: 2_train_iou, 2_valid_iou]

A problem stopped the training, so I restarted it a third time:

[Figures: 3_train_iou, 3_valid_iou]

lunw1024 commented 4 years ago

@jedolb How did you schedule your learning rate? I used the exact arch_cfg and data_cfg provided with the pretrained SqueezeSeg model, but both my train and valid performance are dropping.

jedolb commented 4 years ago

Hi @IDl0T,

Like you, I used the arch_cfg and data_cfg files provided by the pretrained SqueezeSeg model. I just had to change the batch size to 8 because I didn't have enough GPU memory for a bigger batch size. I didn't reach the performance of the pretrained SqueezeSeg model, just like you. Maybe there is something to change in the config files.

But then I tried training Darknet21 from scratch, and I reached the performance of the pretrained Darknet21 model. You can use that one too if you are interested.

jbehley commented 3 years ago

I'm closing this issue since there doesn't seem to be much activity here, or the problem has been resolved.