alexgkendall / SegNet-Tutorial

Files for a tutorial to train SegNet for road scenes using the CamVid dataset
http://mi.eng.cam.ac.uk/projects/segnet/tutorial.html

Different image sizes #6

Closed Vaan5 closed 8 years ago

Vaan5 commented 8 years ago

Hello, I am trying to run the provided tutorial. My graphics card doesn't have much RAM (2 GB), so I can't run the tutorial on images of size 360x480. Because of that I resized them to 120x240, changed the upsampling params in the segnet_basic prototxt files, and trained the net with Caffe. I picked a model from an iteration achieving around 90% accuracy, with roughly the same per-class accuracy (most classes don't reach that, but most are >= 70%).
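For reference, the resizing can be done roughly like this (a sketch only; the directory names are placeholders based on the tutorial's CamVid layout, and the label maps need nearest-neighbour interpolation so the class indices don't get blended into invalid values):

```python
import glob
import os
import cv2

# Resize the CamVid images and annotation maps to 120x240.
# Labels must use nearest-neighbour interpolation so class indices stay valid.
NEW_W, NEW_H = 240, 120

pairs = [('CamVid/train',      'CamVid/train_small',      cv2.INTER_LINEAR),
         ('CamVid/trainannot', 'CamVid/trainannot_small', cv2.INTER_NEAREST)]

for src_dir, dst_dir, interp in pairs:
    if not os.path.isdir(dst_dir):
        os.makedirs(dst_dir)
    for path in glob.glob(os.path.join(src_dir, '*.png')):
        img = cv2.imread(path, cv2.IMREAD_UNCHANGED)
        out = cv2.resize(img, (NEW_W, NEW_H), interpolation=interp)
        cv2.imwrite(os.path.join(dst_dir, os.path.basename(path)), out)
```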

After that I tried to run the two provided scripts (changing the paths accordingly; in the compute_bn script I also changed in_h and in_w to 120 and 240). But when I run the second script, I get totally random output (even though I followed everything in the tutorial).

Does this mean the provided code only works for images of size 360x480, or am I doing something wrong?

When running the test_segmentation_camvid.py script I also ran it on the training set (i.e. changed the data source to the train.txt file), and it gave me the same noisy output.

alexgkendall commented 8 years ago

Hey. No, the network should work for any reasonable input size. I've tried as small as 224x224 and as large as 640x480, all with success. 120x240 should work too, and your model converged, which is promising. I'm not sure at first glance what is wrong here.

Vaan5 commented 8 years ago

Okay, so I finally (after 4 days XD) found the problem. I installed Caffe and the Python wrapper on Windows, but it looks like the wrapper doesn't work well (or, more likely, I did something wrong while building it on my OS). The call to caffe.Net(prototxtFile, trainedModelParams, mode) doesn't load the parameters into the net (in my case), which is why I got such output: basically the net was initialized with random values, according to the fillers.
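A quick way to check for this is to build the same net twice from the same .caffemodel and compare a parameter blob: loading from file is deterministic, so the blobs must match, whereas a broken load leaves each net with its own random fillers. (A minimal sketch; the paths are placeholders.)

```python
import numpy as np
import caffe

prototxt = 'Models/segnet_basic_inference.prototxt'     # placeholder path
weights  = 'Models/segnet_basic_iter_10000.caffemodel'  # placeholder path

caffe.set_mode_gpu()

# Two nets built from the same weights file must have identical parameters.
# If the wrapper silently ignores the file, both fall back to random fillers
# and the comparison fails.
net_a = caffe.Net(prototxt, weights, caffe.TEST)
net_b = caffe.Net(prototxt, weights, caffe.TEST)

first_layer = list(net_a.params.keys())[0]
print('weights loaded from file:',
      np.array_equal(net_a.params[first_layer][0].data,
                     net_b.params[first_layer][0].data))
```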

To avoid that I put together a simple Python script that trains the net through caffe.solver, gets the trained net via solver.net, and then uses net.share_with(other_net) to share the learned parameters with the other nets (the one used for computing the BN parameters as well as the final inference network).
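In outline, the script does roughly this (a sketch only; the prototxt paths and iteration count are placeholders):

```python
import caffe

solver_prototxt    = 'Models/segnet_basic_solver.prototxt'     # placeholder
inference_prototxt = 'Models/segnet_basic_inference.prototxt'  # placeholder

caffe.set_mode_gpu()

# Train through the solver instead of loading a .caffemodel afterwards.
solver = caffe.get_solver(solver_prototxt)
solver.step(120)  # run some training iterations

# Build the inference net (the same works for the BN-statistics net) and let
# it share the solver net's learned parameters, bypassing the broken
# weight-loading path.
infer_net = caffe.Net(inference_prototxt, caffe.TEST)
infer_net.share_with(solver.net)

# infer_net.forward() now runs with the trained parameters.
```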

I haven't had time yet to try it on the whole dataset. I just picked one image from the train set and trained on it for 120 iterations using the solver (reaching around 0.94 accuracy). Then, using the methods mentioned above, I got an output image that matches the ground truth (accuracy, computed with sklearn's accuracy_score, also around 0.94). The downside is that the GPU memory demand is quite big (at least for my GPU). To get around that I will later try pickling the parameters...
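The pickling idea would look something like this (an untested sketch; layer and blob names come from whatever the prototxt defines, and the filenames are placeholders):

```python
import pickle
import caffe

solver = caffe.get_solver('Models/segnet_basic_solver.prototxt')  # as above
solver.step(120)

# Dump the learned parameters to disk so the training net can be released
# before the inference net is built.
params = {name: [blob.data.copy() for blob in blobs]
          for name, blobs in solver.net.params.items()}
with open('segnet_basic_params.pkl', 'wb') as f:
    pickle.dump(params, f)
del solver  # frees the training net's GPU memory

# Later: load the pickle and copy the arrays into a fresh inference net.
infer_net = caffe.Net('Models/segnet_basic_inference.prototxt', caffe.TEST)
with open('segnet_basic_params.pkl', 'rb') as f:
    params = pickle.load(f)
for name, blobs in params.items():
    if name in infer_net.params:
        for i, arr in enumerate(blobs):
            infer_net.params[name][i].data[...] = arr
```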

Thank you for your quick answer. I'm sorry to have bothered you.

alexgkendall commented 8 years ago

Sounds like a mad workaround - but hey, great to know you could get it to work on Windows! Cheers

rshanor commented 8 years ago

Hey @Vaan5, do you have any advice on correctly setting the upsample parameters in the prototxt files when training on smaller images? At first I was getting the error "bottom[0]->height() == bottom[1]->height()", so I modified the upsample_w and upsample_h parameters in the upsample layers. After tweaking these, my network starts to train, but the loss does not converge (it jumps between 30 and NaN with batch size 3...). Thanks for the advice!
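For context, the sizes I'm trying to match can be worked out like this (a rough sketch, assuming SegNet-Basic's 2x2, stride-2 pooling over four encoder/decoder stages; Caffe's pooling rounds up, which is why some upsample sizes can't be left at the default scale of 2):

```python
import math

def pooled_size(size, kernel=2, stride=2, pad=0):
    # Caffe pooling output size (it rounds up, unlike convolution).
    return int(math.ceil((size + 2 * pad - kernel) / float(stride))) + 1

def decoder_output_sizes(in_h, in_w, num_stages=4):
    """Each decoder upsample layer must restore exactly the height/width that
    the matching encoder stage saw before its pooling layer."""
    sizes = [(in_h, in_w)]
    for _ in range(num_stages):
        h, w = sizes[-1]
        sizes.append((pooled_size(h), pooled_size(w)))
    # Walk back from the deepest stage: the first upsample should output the
    # size seen before the last pooling, and so on.
    return list(reversed(sizes[:-1]))

# Example for the 120x240 input discussed above.
for h, w in decoder_output_sizes(120, 240):
    print('upsample_h: %d  upsample_w: %d' % (h, w))
```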

montekristo1946 commented 8 years ago

@Vaan5 - Please tell me how you got SegNet running under Windows?

angy50 commented 8 years ago

@Vaan5, how did you install SegNet on Windows in order to set the caffe-root path in test_segmentation_camvid.py?

montekristo1946 commented 8 years ago

I used the C++ implementation, compiled with VS 2013.