TimoSaemann / ENet

ENet: A Deep Neural Network Architecture for Real-Time Semantic Segmentation

Finetuning from cityscapes caffemodel #15

Open · nk-dev0 opened this issue 7 years ago

nk-dev0 commented 7 years ago

Thanks for this useful code. I'm trying to finetune the encoder-decoder model using cityscapes_weights.caffemodel. My own data is 512x512 RGB with 2 classes, so I suspect this size discrepancy is causing the following error (after network initialization from the prototxt):

I0707 13:02:04.584967 59686 net.cpp:283] Network initialization done.
I0707 13:02:04.586356 59686 solver.cpp:60] Solver scaffolding done.
I0707 13:02:04.612700 59686 caffe.cpp:155] Finetuning from /media/borges/pl/enet/weights/cityscapes_weights.caffemodel
F0707 13:02:04.719262 59686 net.cpp:767] Check failed: target_blobs.size() == source_layer.blobs_size() (1 vs. 2) Incompatible number of blobs for layer conv1_0_0

Do you have any insight into what might be causing this? I'm using all of the default settings apart from commenting out the resize params in the input blob, changing the number of classes in the Deconvolution layer from 19 to 2, and adding class-frequency weighting in the softmax layer.
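A quick way to locate this kind of mismatch is to compare the parameter blob counts on both sides: load the finetuning prototxt without weights in pycaffe, and parse the caffemodel directly with Caffe's protobuf. A minimal sketch, assuming pycaffe is installed; the file paths are placeholders:

```python
import caffe
from caffe.proto import caffe_pb2

# Target side: the network defined by the finetuning prototxt (no weights loaded).
net = caffe.Net('enet_train_encoder_decoder.prototxt', caffe.TEST)  # placeholder path
for name, blobs in net.params.items():
    print('target', name, len(blobs), [b.data.shape for b in blobs])

# Source side: parse the pretrained caffemodel directly.
model = caffe_pb2.NetParameter()
with open('cityscapes_weights.caffemodel', 'rb') as f:  # placeholder path
    model.ParseFromString(f.read())
for layer in model.layer:  # very old V1 models use model.layers instead
    if layer.blobs:
        print('source', layer.name, len(layer.blobs),
              [tuple(b.shape.dim) for b in layer.blobs])
```

Caffe copies pretrained parameters by layer name, so any layer whose name matches but whose blob count or shape differs (here conv1_0_0, 1 vs. 2) will trigger exactly this check failure.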

TimoSaemann commented 7 years ago

Could you try it without commenting out the resize params? new_height: 512 new_width: 512

TimoSaemann commented 7 years ago

The cityscapes_weights.caffemodel has different blob sizes because the BN layers have been merged into the convolution layers. For finetuning you need the weights from before the BN merge. I have uploaded these weights (cityscapes_weights_before_bn_merge.caffemodel).
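For background on why the blob counts differ: folding a batch-norm layer into the preceding convolution rescales the conv weights and absorbs the BN shift into a bias term, so a conv that originally had a single weight blob ends up with weight plus bias. A minimal numpy sketch of the folding arithmetic (not the repository's actual merge script):

```python
import numpy as np

def fold_bn_into_conv(w, b, gamma, beta, mean, var, eps=1e-5):
    """Fold batch-norm parameters into the preceding convolution.

    w:     conv weights, shape (out_ch, in_ch, kh, kw)
    b:     conv bias, shape (out_ch,); pass zeros if the conv had no bias
    gamma, beta, mean, var: per-channel BN parameters, shape (out_ch,)
    Returns (w', b') such that conv(x, w') + b' == bn(conv(x, w) + b).
    """
    scale = gamma / np.sqrt(var + eps)
    w_folded = w * scale[:, None, None, None]
    b_folded = (b - mean) * scale + beta
    return w_folded, b_folded
```

After this folding, a prototxt written for the unmerged network (conv without bias, followed by BN) no longer matches the merged caffemodel blob for blob, which is what the "1 vs 2" error reports.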

nk-dev0 commented 7 years ago

Hi, thanks for uploading the new weights. I can successfully train with them to high accuracy, but after I go through the BN computing and BN absorbing steps as outlined in the tutorial, my prediction images are all 1...

I trained the encoder-decoder network from scratch, followed the tutorial, and got meaningful results, so I think something is wrong with the way I'm implementing the BN merge. Was cityscapes_weights_before_bn_merge.caffemodel trained using just the normal encoder-decoder prototxt?
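One way to check the merge itself is to push the same input through the network before and after BN absorption and compare the outputs, which should match up to numerical precision. A sketch assuming pycaffe, with placeholder prototxt/caffemodel names and assuming the input blob is named 'data':

```python
import numpy as np
import caffe

x = np.random.rand(1, 3, 512, 512).astype(np.float32)  # dummy input

def run(prototxt, caffemodel):
    net = caffe.Net(prototxt, caffemodel, caffe.TEST)
    net.blobs['data'].reshape(*x.shape)   # assumes the input blob is named 'data'
    net.blobs['data'].data[...] = x
    out = net.forward()                   # dict: output blob name -> array
    return next(iter(out.values())).copy()

# Placeholder file names: weights after BN statistics are computed,
# and weights after the BN layers have been absorbed.
before = run('enet_with_bn.prototxt', 'weights_after_bn_compute.caffemodel')
after = run('enet_merged.prototxt', 'weights_after_bn_merge.caffemodel')
print('max abs diff:', np.abs(before - after).max())
```

A large difference here would point at the merge step rather than at training; constant all-1 predictions are consistent with BN statistics that were never properly computed or absorbed.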

fyushan commented 7 years ago

Hi nkral, what accuracy do you get on your 512x512 RGB dataset with 2 classes? Could you please share your weights.caffemodel and train.prototxt? I finetuned enet_decoder from cityscapes_weights_before_bn_merge.caffemodel on Cityscapes data with 19 classes and got about 80% average accuracy (not IoU). I think this accuracy is too low.
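For reference, global pixel accuracy and mean IoU can diverge sharply on class-imbalanced data such as Cityscapes, where large classes like road dominate the pixel count; both metrics fall out of the same confusion matrix. A minimal numpy sketch:

```python
import numpy as np

def confusion_matrix(gt, pred, num_classes):
    # gt, pred: integer label arrays of the same shape
    mask = (gt >= 0) & (gt < num_classes)
    return np.bincount(num_classes * gt[mask] + pred[mask],
                       minlength=num_classes ** 2).reshape(num_classes, num_classes)

def metrics(cm):
    pixel_acc = np.diag(cm).sum() / cm.sum()
    # Per-class IoU; classes absent from both gt and pred yield nan
    # and are skipped by nanmean.
    iou = np.diag(cm) / (cm.sum(1) + cm.sum(0) - np.diag(cm))
    return pixel_acc, np.nanmean(iou)
```

On 19-class Cityscapes, a model can reach high pixel accuracy while its mean IoU stays much lower, so the two numbers are not directly comparable.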