TiagoCortinhal / SalsaNext

Uncertainty-aware Semantic Segmentation of LiDAR Point Clouds for Autonomous Driving

Size mismatch between pretrained model and current model? #68

Closed finnSartoris closed 2 years ago

finnSartoris commented 2 years ago

When I try to use the pretrained model I get the error message below. I have not changed anything in the model, but apparently the two do not fit together. Has the model been changed? The only thing I changed in the pretrained model is the names of the modules, as described in https://github.com/Halmstad-University/SalsaNext/issues/65. Thanks a lot! [screenshot of the size-mismatch error]

TiagoCortinhal commented 2 years ago

Hey!

Just a quick question: do you have more classes in the setup you are testing? If so, that is the issue (you need to change the logits layer from the pretrained model to one that fits your number of classes). If not, I will check it tomorrow, as I am not in front of a computer where I can test this myself.

finnSartoris commented 2 years ago

Thanks for the quick answer! Yes, that is the case. But how can I change the logits layer of the pretrained model?

TiagoCortinhal commented 2 years ago

I would load the model as is, then access its last layer and replace it with one that has the correct number of classes (46 in your case): something like self.model.logits = nn.Conv2d(...).

After the first model save you can revert the code to what it was before the changes above, since the newly saved model already has the 46 classes, and everything will work nicely.
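In code, that would look roughly like the sketch below. The import path, the checkpoint key, the pretrained class count of 20 (SemanticKITTI), and the 32 input channels of the logits layer are assumptions here; verify them against your copy of the repo.

```python
import torch
import torch.nn as nn
from SalsaNext import SalsaNext  # assumed import path; adjust to your layout

NUM_CLASSES_NEW = 46

# Build the model with the class count the pretrained weights were trained on,
# then load the released checkpoint (key name assumed to be "state_dict").
model = SalsaNext(nclasses=20)
checkpoint = torch.load("SalsaNext_pretrained", map_location="cpu")
model.load_state_dict(checkpoint["state_dict"])

# Swap the 20-class head for a fresh, randomly initialised 46-class one.
# The 1x1 kernel and 32 input channels mirror the original logits layer.
model.logits = nn.Conv2d(32, NUM_CLASSES_NEW, kernel_size=(1, 1))

# Save once; from here on the checkpoint carries the 46-class head, so the
# model can simply be constructed with nclasses=46 and loaded as usual.
torch.save({"state_dict": model.state_dict()}, "salsanext_46cls.pt")
```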

finnSartoris commented 2 years ago

But how would you then determine the weights of the new last layer with the 46 classes? Unlike the remaining weights of the pretrained model, these would be newly initialised and therefore untrained. Should I first train only the last layer and freeze the rest, or how would you proceed?

TiagoCortinhal commented 2 years ago

How you choose to fine-tune is up to you. You might want to freeze the other layers, or use a very reduced LR and train it all together. Without testing it I cannot say which approach is better.
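For example, both options could be set up roughly as follows (a sketch only; the `logits` layer name follows the assumptions above and the learning rates are illustrative, not tuned values):

```python
import torch

# Option A: freeze everything except the new head and train only that.
for name, param in model.named_parameters():
    param.requires_grad = name.startswith("logits")
optimizer_a = torch.optim.SGD(
    (p for p in model.parameters() if p.requires_grad), lr=1e-2
)

# Option B: train everything with a much smaller learning rate, optionally
# giving the freshly initialised head a larger one via parameter groups.
head_params = list(model.logits.parameters())
head_ids = {id(p) for p in head_params}
backbone_params = [p for p in model.parameters() if id(p) not in head_ids]
optimizer_b = torch.optim.SGD(
    [
        {"params": backbone_params, "lr": 1e-4},  # pretrained weights
        {"params": head_params, "lr": 1e-2},      # new 46-class head
    ]
)
```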