fabvio / Cascade-LD

Lane detection and classification in an end-to-end Deep Learning fashion

How to pretrain ERFNet #10

Closed a1anzvvy closed 3 years ago

a1anzvvy commented 3 years ago

Hey Fabvio,

Appreciate your great work here! I'm trying to reproduce the results and ran into an issue with the pretrained ERFNet. Correct me if I'm wrong: we first have to pretrain a pixel-wise segmentation network with 0 for background and 1 for lanes, and then replace the BCE loss with the loss function you provided in this repo. But if we specify num_classes=5, we get a 5-channel output at the end of the Decoder. Should I add a 1 by 1 filter to get a one-channel output, followed by a sigmoid function, in the first training stage, and then throw the 1 by 1 filter and the sigmoid away when we fine-tune the network?

fabvio commented 3 years ago

Hi, thanks for the question, as it may help other people with the reimplementation. Basically, you first train ERFNet with a final conv layer that has 2 output filters. Then, when you fine-tune, you just don't load the last layer, since its size does not match (2 vs. 5). In recent versions of PyTorch you can do this by dropping the last layer's weights from the checkpoint and passing strict=False to load_state_dict. Note that strict=False on its own only tolerates missing or unexpected keys, not tensors whose shapes differ.
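For readers following along, here is a minimal sketch of the "don't load the last layer" step. It is not the repository's code: `TinySegNet` is a hypothetical stand-in for ERFNet, the layer name `output_conv` and the helper `load_matching_weights` are assumptions made for illustration. The only PyTorch behavior it relies on is that `load_state_dict(..., strict=False)` accepts missing keys, so shape-mismatched tensors are filtered out first.

```python
import torch
import torch.nn as nn


class TinySegNet(nn.Module):
    """Hypothetical stand-in for ERFNet: a shared body plus a final output layer."""

    def __init__(self, num_classes):
        super().__init__()
        self.body = nn.Conv2d(3, 16, kernel_size=3, padding=1)
        # Only this layer depends on num_classes.
        self.output_conv = nn.ConvTranspose2d(16, num_classes, kernel_size=2, stride=2)

    def forward(self, x):
        return self.output_conv(torch.relu(self.body(x)))


def load_matching_weights(model, pretrained_state):
    """Copy only tensors whose name and shape match the target model.

    Returns the sorted names of parameters left at their fresh initialization.
    """
    target_state = model.state_dict()
    kept = {
        name: tensor
        for name, tensor in pretrained_state.items()
        if name in target_state and tensor.shape == target_state[name].shape
    }
    # strict=False allows the keys we dropped to be missing from `kept`.
    result = model.load_state_dict(kept, strict=False)
    return sorted(result.missing_keys)


# Stage 1: binary lane/background model (2 output channels).
pretrained = TinySegNet(num_classes=2)
# ... train `pretrained`, then usually save and reload its state_dict ...

# Stage 2: fine-tuning model (5 output channels).
finetune = TinySegNet(num_classes=5)
skipped = load_matching_weights(finetune, pretrained.state_dict())
print(skipped)  # names of the layers that will be trained from scratch
```

Filtering by shape rather than by a hard-coded layer name keeps the helper usable if the final layer is named differently in your copy of the network.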

a1anzvvy commented 3 years ago

I see. So do we specify num_classes = 5 in both phases, or num_classes = 2 in the first training of ERFNet?

fabvio commented 3 years ago

I specified num_classes = 2 in the first training. I suggest doing this so that the last layer is trained from scratch during fine-tuning, which reduces the bias toward the previous (binary) task.
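As a rough illustration of why a two-channel output needs no extra 1 by 1 filter or sigmoid in the first stage, the sketch below computes a per-pixel loss directly on 2-channel logits. The use of `nn.CrossEntropyLoss` is an assumption (the thread does not state which loss was used for pretraining), and the random tensors are placeholders for real network outputs and lane masks.

```python
import torch
import torch.nn as nn

# Placeholder for the 2-channel output of the pretraining network.
logits = torch.randn(4, 2, 32, 64, requires_grad=True)  # (batch, classes, H, W)

# Placeholder binary lane mask: 0 = background, 1 = lane.
target = torch.randint(0, 2, (4, 32, 64))  # (batch, H, W), integer class ids

# Assumed loss: softmax cross-entropy over the two classes, averaged over pixels.
criterion = nn.CrossEntropyLoss()
loss = criterion(logits, target)
loss.backward()

# The per-pixel prediction is the argmax over the channel dimension.
pred = logits.argmax(dim=1)
```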