Closed: jedolb closed this issue 3 years ago
Hi,
The pretrained model was trained with the config file in the pretrained model's log directory, which you can download from this repo; it is called arch_cfg.yaml. I can't guarantee the result will be exactly the same, because training usually stops and restarts multiple times, so I recommend using at least 3 or 4 times the number of epochs.
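As a sketch of what that epoch change would look like, assuming an arch_cfg.yaml layout like the ones shipped with this repo (the field names and values here are illustrative, not the pretrained model's exact config):

```yaml
# arch_cfg.yaml (fragment) -- field names assumed, values illustrative
train:
  max_epochs: 500   # e.g. raised from 150 to roughly 3-4x the original budget
  batch_size: 32    # leave the rest of the training block unchanged
```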
Thank you for your answer
So I'm going to continue training with many more epochs, and I will report my new results here to see how close I get to the pretrained network's results.
Hello,
As you said, I continued training my network. Unfortunately, the performance didn't really improve. Maybe I did something wrong, or maybe you have some advice for improving the performance?
Here are my curves after restarting the training for a second time:
A problem stopped the training, so I restarted it for a third time:
@jedolb How did you schedule your learning rate? I used the exact arch_cfg and data_cfg provided with the pretrained SqueezeSeg model, but both my train and valid performance drop.
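For reference, here is a minimal sketch of the warmup-then-decay style of schedule that configs like these typically describe. The warmup length, base rate, and decay factor below are assumptions for illustration, not the values shipped with the pretrained SqueezeSeg config:

```python
def lr_at_epoch(epoch, base_lr=0.01, warmup_epochs=1, decay=0.99):
    """Linear warmup to base_lr, then per-epoch exponential decay.

    base_lr, warmup_epochs and decay are illustrative values, not
    necessarily the ones in the repo's arch_cfg.yaml.
    """
    if epoch < warmup_epochs:
        # ramp linearly up to base_lr over the warmup epochs
        return base_lr * (epoch + 1) / warmup_epochs
    # after warmup, multiply by the decay factor once per epoch
    return base_lr * decay ** (epoch - warmup_epochs)

# sanity check: the rate never increases after warmup
rates = [lr_at_epoch(e) for e in range(150)]
```

If the valid curve drops while train keeps improving, a too-aggressive decay (or none at all) is one of the usual suspects to rule out before blaming the architecture.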
Hi @IDl0T ,
Like you, I used the arch_cfg and data_cfg files provided with the pretrained SqueezeSeg model. I just had to change the batch size to 8 because I didn't have enough GPU memory for a bigger one. Like you, I didn't reach the performance of the pretrained SqueezeSeg model; maybe something needs to be changed in the config files.
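In case it helps anyone else who has to shrink the batch size: one common heuristic (a general rule of thumb, not something this repo necessarily prescribes) is to scale the learning rate linearly with the batch size, so a 4x smaller batch gets a 4x smaller rate. As a sketch in the arch_cfg.yaml style, with assumed field names:

```yaml
# arch_cfg.yaml (fragment) -- linear-scaling heuristic, values illustrative
train:
  batch_size: 8     # reduced from 32 to fit GPU memory
  lr: 0.0025        # base lr 0.01 scaled by 8/32 to match the smaller batch
```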
But then I trained Darknet21 from scratch and reached the performance of the pretrained Darknet21 model, so you could use that one instead if you are interested.
I'm closing this issue since there doesn't seem to be much activity here, or the problem has been resolved.
Hello !
I'm trying to train the SqueezeSeg model from scratch using your code and configuration files. I just had to change the batch size to 16 because 32 was too big for my GPU.
But at the end of training, the performance of my network (IoU = 0.201) is much lower than that of your pretrained SqueezeSeg (IoU = 0.305).
Here are my figures at the end of the 150 epochs:
Were the pretrained models trained with the same configuration files as those in the git repo? Should I adjust some parameters to reach the pretrained performance?