Closed: a1anzvvy closed this issue 3 years ago
Hi, thanks for the question, as it may help other people with the reimplementation. Basically, you first train ERFNet with a final conv layer that has 2 output filters. Then, when you fine-tune, you simply don't load the last layer, since its size doesn't match (2 vs 5 output channels). In recent versions of PyTorch you can do this easily by passing `strict=False` to `load_state_dict`.
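A minimal sketch of this partial loading, assuming an `ERFNet` class that takes a `num_classes` argument and a hypothetical checkpoint filename (not the repo's exact code). The mismatched final-layer weights are filtered out explicitly before loading, since `load_state_dict` still reports a size mismatch for same-named keys even with `strict=False`:

```python
import torch
from erfnet import ERFNet  # assumed import path for the ERFNet definition

# Fine-tuning model: the last conv now has 5 output filters instead of 2.
model = ERFNet(num_classes=5)

# Hypothetical checkpoint produced by the 2-class pretraining phase.
checkpoint = torch.load('erfnet_pretrained_2class.pth', map_location='cpu')
pretrained = checkpoint.get('state_dict', checkpoint)

# Keep only parameters whose name and shape still match the new model;
# this drops the last layer (2 vs 5 output channels).
model_state = model.state_dict()
filtered = {k: v for k, v in pretrained.items()
            if k in model_state and v.shape == model_state[k].shape}

# strict=False tolerates the missing last-layer keys, which stay
# randomly initialized and are trained from scratch during fine-tuning.
model.load_state_dict(filtered, strict=False)
```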
I see. So do we specify `num_classes=5` in both phases, or `num_classes=2` in the first training of ERFNet?
I specified `num_classes=2` in the first training. I suggest doing this so you can reduce the bias towards the previous task, since the last layer is then trained from scratch during fine-tuning.
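For concreteness, a sketch of the phase-1 setup under the same assumptions (ERFNet built with `num_classes=2` and trained with standard cross-entropy on background/lane labels); the batch and image sizes are placeholders, not the repo's actual training code:

```python
import torch
import torch.nn as nn
from erfnet import ERFNet  # assumed import, as above

# Phase 1: binary segmentation treated as a 2-class problem,
# so the final conv has 2 output filters and no sigmoid head is needed.
model = ERFNet(num_classes=2)
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=5e-4)

images = torch.randn(4, 3, 512, 1024)           # dummy input batch
targets = torch.randint(0, 2, (4, 512, 1024))   # 0 = background, 1 = lane

logits = model(images)                          # shape (4, 2, 512, 1024)
loss = criterion(logits, targets)
loss.backward()
optimizer.step()
```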
Hey Fabvio,
Appreciate your great work here! I'm trying to reproduce the results and ran into some issues with the pretrained ERFNet; correct me if I'm wrong. So we first pretrain a pixel-wise segmentation network with 0 for background and 1 for lanes, and then replace the BCE loss with the loss function you provide in this repo. But if we specify `num_classes=5`, we get a 5-channel output at the end of the decoder. Should I add a 1x1 conv to get a one-channel output, followed by a sigmoid, in the first training stage, and then throw the 1x1 conv and sigmoid away when we fine-tune the network?
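For what it's worth, here is one way the binary phase-1 target could be derived from per-lane annotations; this is only a sketch, and the convention of 0 = background, 1–4 = individual lanes is an assumption, not necessarily how the labels in this repo are encoded:

```python
import torch

def to_binary_target(lane_labels: torch.Tensor) -> torch.Tensor:
    """Collapse per-lane labels (assumed 0 = background, 1..4 = lanes)
    into the 0/1 target used for the 2-class pretraining phase."""
    return (lane_labels > 0).long()

# Example: a dummy 4-lane label map becomes a background/lane mask.
labels = torch.randint(0, 5, (512, 1024))
binary = to_binary_target(labels)   # values in {0, 1}
```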