How much time it takes to train encoder for Cityscapes data

TimoSaemann / ENet

ENet: A Deep Neural Network Architecture for Real-Time Semantic Segmentation


How much time it takes to train encoder for Cityscapes data #47

Open bhatshashank opened 6 years ago

bhatshashank commented 6 years ago

Hi all,

I am doing semantic segmentation for the first time. I set up the Cityscapes data for training, but the accuracy is 0 throughout and the loss stays constant at 87 with lr = 5e-06. It has been running for almost 18 hours on a GTX 980 Ti GPU with 6 GB of memory. To be frank, I am not training on the whole Cityscapes dataset, only on the images from the aachen folder. I have followed all the instructions on the GitHub page and have not changed any of the default settings.

Can anyone guide me on training the encoder? I just want to complete the semantic segmentation training flow as early as possible, so that I can train on my own dataset.
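
A note on the constant loss of 87: Caffe's softmax loss clamps each predicted probability to the smallest normal float32 value (`FLT_MIN`) before taking the log, so a loss stuck near 87.34 usually means the probability assigned to the true class has collapsed to that floor. The short Python sketch below only reproduces the arithmetic. Whether the cause here is out-of-range label values, a diverged network, or something else is an assumption that would need checking against the actual label images.

```python
import math

# Smallest positive normal float32 value (FLT_MIN in C), i.e. 2**-126.
FLT_MIN = 2.0 ** -126

# Loss for one pixel whose true-class probability has been clamped to FLT_MIN.
saturated_loss = -math.log(FLT_MIN)

print(round(saturated_loss, 4))
```

If the reported loss matches this value, the accuracy of 0 and the flat loss are two symptoms of the same problem, and training longer is unlikely to help.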

ewen1024 commented 6 years ago

@bhatshashank You should have stopped once you saw the accuracy was still 0 after one hour. Something must be wrong. If you only changed the image folder, make sure the txt file referenced in the encoder prototxt lists the images you want to use. As for me, I am training on my own dataset from scratch.

- Starting training from the encoder-decoder directly (no need to train the encoder first): 2x speed
- Changing the dropout to Caffe dropout: 2x speed
- On my GTX 1080 Ti, with 2600 images at 640x480 and batch size 3, 120000 iterations take 10 hours

Hope it helps.
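
For anyone training on a subset such as the aachen folder, here is a minimal sketch of regenerating the txt file mentioned above. It assumes the list holds one `image_path label_path` pair per line and that files follow the Cityscapes naming scheme (`*_leftImg8bit.png` images, `*_gtFine_labelTrainIds.png` labels). Both are assumptions, and `build_list_file` is a hypothetical helper, so compare the output with the list file your prototxt currently points to.

```python
import os


def build_list_file(image_dir, label_dir, out_path,
                    image_suffix="_leftImg8bit.png",
                    label_suffix="_gtFine_labelTrainIds.png"):
    """Write 'image_path label_path' lines for every image that has a label.

    The suffixes follow Cityscapes naming and are assumptions; adjust them
    for your own data. Returns the number of pairs written.
    """
    lines = []
    for name in sorted(os.listdir(image_dir)):
        if not name.endswith(image_suffix):
            continue
        stem = name[:-len(image_suffix)]
        label_path = os.path.join(label_dir, stem + label_suffix)
        # Skip images without a matching label instead of writing a bad pair.
        if os.path.isfile(label_path):
            lines.append(os.path.join(image_dir, name) + " " + label_path)
    with open(out_path, "w") as f:
        for line in lines:
            f.write(line + "\n")
    return len(lines)
```

Checking that the returned count matches the number of images you expect is a quick way to catch a wrong folder or suffix before starting a long run.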

ShashankVBhat commented 6 years ago

@ewen1024

Thank you. I didn't know that we can train the encoder-decoder directly without training the encoder first.

ghost commented 6 years ago

Hi @ewen1024, I just started using ENet in Caffe. If we can train the encoder-decoder without training the encoder first, does that mean I can train the whole network on Cityscapes, get a .caffemodel, and then use these weights to fine-tune on my own dataset? I mean, fine-tune the whole encoder-decoder together?

ewen1024 commented 6 years ago

@wzhouuu yeah, running the encoder-decoder training directly works for me.

ghost commented 6 years ago

Thanks @ewen1024! I started training and realized that this Caffe version doesn't use a validation set the way the Torch implementation does. It saves weights every 10000 iterations. I'm not convinced that a later snapshot is always better than an earlier one. How did you select the best model if the code has no validation step? Thanks for your help!
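
One possible way to approach this, sketched under assumptions rather than taken from the repository: run each saved snapshot on a held-out set of images, score the predictions with mean intersection-over-union (mIoU), and keep the snapshot with the highest score. The Python below covers only the scoring and selection. Producing predictions from each .caffemodel is left out, and the function names, snapshot names, and the ignore label of 255 are all hypothetical choices.

```python
def mean_iou(pred, truth, num_classes, ignore_label=255):
    """Mean IoU over flat sequences of per-pixel class ids.

    Pixels whose ground truth equals ignore_label are skipped. Classes that
    appear in neither pred nor truth are left out of the average.
    """
    inter = [0] * num_classes
    union = [0] * num_classes
    for p, t in zip(pred, truth):
        if t == ignore_label:
            continue
        if p == t:
            inter[t] += 1
            union[t] += 1
        else:
            # A wrong pixel adds to the union of both classes involved.
            union[p] += 1
            union[t] += 1
    ious = [i / u for i, u in zip(inter, union) if u > 0]
    return sum(ious) / len(ious) if ious else 0.0


def pick_best_snapshot(scores):
    """Return the snapshot name with the highest validation score."""
    return max(scores, key=scores.get)
```

For example, given `scores = {"iter_10000": 0.41, "iter_20000": 0.47, "iter_30000": 0.45}` (made-up numbers), `pick_best_snapshot(scores)` returns `"iter_20000"`, which shows why the latest snapshot is not automatically the one to keep.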

Li-Lai commented 6 years ago

@ewen1024 Hello, I have a couple of questions I hope you can help with.
A. If I segment a single class, should the number of outputs in the deconv6 layer be set to 1 or 2 (counting background)?
B. If I test on my own data, do I need to make color label PNGs like cityscapes1.png?