irolaina / FCRN-DepthPrediction

Deeper Depth Prediction with Fully Convolutional Residual Networks (FCRN)
BSD 2-Clause "Simplified" License

train image #4

Closed seokhoonboo closed 7 years ago

seokhoonboo commented 7 years ago

I am trying to reproduce this paper in Caffe, using the NYU Depth v2 raw dataset (about 12K~13K sampled depth images). For training, did you use filtered depth images (cross bilateral filter or colorization) or the raw depth images? Colorization is a good filtering method, but it is too slow.

irolaina commented 7 years ago

Hi, the ground truth depth images used for training were indeed the raw depth images. Invalid pixels (where depth is zero) were then excluded from training.
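
(For illustration: excluding the zero-depth pixels amounts to masking the loss. Below is a minimal NumPy sketch of a masked berHu loss, the reverse Huber loss used in the paper; the function name and edge-case handling are illustrative, not the repo's actual code.)

```python
import numpy as np

def masked_berhu_loss(pred, gt):
    """berHu (reverse Huber) loss over valid pixels only.

    gt == 0 marks invalid raw-depth pixels; they are excluded entirely.
    """
    valid = gt > 0                          # keep only pixels with measured depth
    diff = np.abs(pred - gt)[valid]
    if diff.size == 0:
        return 0.0                          # no valid pixels in this sample
    c = 0.2 * diff.max()                    # berHu threshold: 1/5 of max residual
    if c == 0:
        return 0.0                          # prediction already exact
    l1 = diff[diff <= c]                    # linear branch for small residuals
    l2 = diff[diff > c]                     # quadratic branch for large residuals
    return (l1.sum() + ((l2 ** 2 + c ** 2) / (2 * c)).sum()) / diff.size
```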

seokhoonboo commented 7 years ago

Sorry, I have one more question. Is the 'rel error: 0.129' in your paper calculated only on the testNdx data (hole-filled) in 'nyu_depth_v2_labeled.mat' (654 image pairs), or on all test-scene images in the NYU Depth v2 raw dataset (about 200k image pairs)? In this GitHub source code, you use only the testNdx data (hole-filled) from 'nyu_depth_v2_labeled.mat' (654 image pairs) for testing.

irolaina commented 7 years ago

The error metrics are calculated over the official split of 654 images in the labeled subset of NYU (to compare fairly to prior work). In this case, we use the depth maps that were filled in using the colorization method, not the raw data. The errors should be lower when comparing against raw depth maps.
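
(For reference, a minimal NumPy sketch of the standard NYU error metrics, computed over valid pixels; the function name is illustrative and this is not the repo's evaluation code.)

```python
import numpy as np

def depth_metrics(pred, gt):
    """Mean relative error, RMSE, and delta < 1.25 accuracy on valid pixels."""
    valid = gt > 0
    p, g = pred[valid], gt[valid]
    rel = np.mean(np.abs(p - g) / g)                    # relative error ('rel')
    rmse = np.sqrt(np.mean((p - g) ** 2))               # root mean squared error
    delta1 = np.mean(np.maximum(p / g, g / p) < 1.25)   # threshold accuracy
    return rel, rmse, delta1
```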

jszhujun2010 commented 7 years ago

Hi, @seokhoonboo! Have you successfully reimplemented it in Caffe? I have run into some trouble, such as how to set up the upsampling layer (since Caffe has no direct upsampling layer like the one in the paper, I use a deconvolution layer instead, but the network seems unable to learn anything...).

bazilas commented 7 years ago

@jszhujun2010 It might be easier to implement upsampling yourself in Caffe. Training this network with deconv will be problematic. However, you might want to try not learning the parameters of the deconv and only using it as a bilinear filter.
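
(A common way to do this in Caffe, sketched below with pycaffe-style net surgery: fill the Deconvolution weights with a bilinear kernel once, and set lr_mult: 0 in the prototxt so they stay fixed. The layer name 'upsample' and the sizes are hypothetical.)

```python
import numpy as np

def bilinear_kernel(size, channels):
    """Weights that make a Deconvolution layer act as bilinear upsampling."""
    factor = (size + 1) // 2
    center = factor - 1 if size % 2 == 1 else factor - 0.5
    og = np.ogrid[:size, :size]
    filt = (1 - abs(og[0] - center) / factor) * (1 - abs(og[1] - center) / factor)
    weights = np.zeros((channels, channels, size, size), dtype=np.float32)
    weights[range(channels), range(channels), :, :] = filt  # one filter per channel
    return weights

# With pycaffe (hypothetical layer name), assign once before training:
# net.params['upsample'][0].data[...] = bilinear_kernel(4, 64)  # kernel 4, stride 2 -> 2x
```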

jszhujun2010 commented 7 years ago

Thanks, @bazilas. I'll consider writing my own upsampling layer.

seokhoonboo commented 7 years ago

@jszhujun2010 https://github.com/matthieudelaro/caffeBVLCplus — this repository includes an unpooling layer, and I am trying it out.

jszhujun2010 commented 7 years ago

Thanks, @seokhoonboo. I guess UnpoolingParameter_UnpoolMethod_MAX is what we need.
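
(For intuition: MAX unpooling scatters each pooled value back to the position of its original maximum, which is recorded during the pooling pass. A minimal NumPy sketch of the idea, not that layer's actual implementation:)

```python
import numpy as np

def max_unpool(pooled, argmax_indices, out_shape):
    """Place each pooled value at its recorded argmax position; zeros elsewhere."""
    out = np.zeros(out_shape, dtype=pooled.dtype)
    out.flat[argmax_indices.ravel()] = pooled.ravel()  # flat indices into the output
    return out
```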

jszhujun2010 commented 7 years ago

Hey, guys! I'm trying to train a model on the Make3D dataset; here are the details:

However, there is no sign of convergence after thousands of iterations. Is there anything I'm missing or doing wrong?

jszhujun2010 commented 7 years ago

Hi @iro-cp, I have one more question. Did you fine-tune the whole network or just the part after the ResNet?

irolaina commented 7 years ago

@jszhujun2010 Regarding fine-tuning: all layers are trained for the depth estimation task, including both the ResNet core and the upsampling part.

To answer your previous question: when you do the data augmentation, you should make sure that you do not interpolate depth values that should not be interpolated (for example, when the ground truth depth map contains invalid pixels). In our work, all transformations are performed with nearest-neighbor interpolation. I think the lack of convergence is related to your very low learning rate. If you had to set it this low to avoid an infinite loss, you might want to make sure that everything is fine with your training data, for example.
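
(A minimal sketch of scale augmentation with nearest-neighbor sampling, so that zero-valued invalid pixels are never blended into valid depths; following the paper's augmentation, the depth values are divided by the scale factor. The function name is illustrative.)

```python
import numpy as np

def scale_depth_nn(depth, s):
    """Zoom a depth map by factor s with nearest-neighbor sampling only."""
    h, w = depth.shape
    rows = np.minimum((np.arange(int(h * s)) / s).astype(int), h - 1)
    cols = np.minimum((np.arange(int(w * s)) / s).astype(int), w - 1)
    return depth[np.ix_(rows, cols)] / s   # invalid zeros stay exactly zero
```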

jszhujun2010 commented 7 years ago

@iro-cp Thanks for your reply. It's still hard for me to train the model; it even fails to overfit (using two images with batch size 2, original data without transformation). I have forked the repository and added my Caffe prototxt as well as some preprocessing scripts. You can check it out if you are interested. The repo is here, and the Caffe prototxt is in the caffe folder.

cv-small-snails commented 7 years ago

Hi @jszhujun2010, did you train your model on the NYU Depth v2 dataset successfully?

jszhujun2010 commented 7 years ago

Kind of... I have trained on NYU data without data augmentation, and the model can fit the training data well (though the test results are not so good, due to overfitting).

cv-small-snails commented 7 years ago

Hi @jszhujun2010, I have trained your model from the repository on the NYU Depth v2 dataset, but I found that the loss also can't converge. The loss plot can be seen here. I don't know why. Is there anything wrong?

jszhujun2010 commented 7 years ago

I have modified the network and will update it soon.

cv-small-snails commented 7 years ago

@jszhujun2010 Thanks for your reply. I have trained your modified network on the NYU Depth v2 dataset again, but the loss still looks as shown above and can't converge. Is that normal? Is there anything wrong?

jszhujun2010 commented 7 years ago

Well, it's fine for me. [loss curve image] How many iterations did you set?

cv-small-snails commented 7 years ago

@jszhujun2010 Thanks for your reply. I set 300,000 iterations on the NYU Depth v2 dataset, but the loss still can't converge, and the loss plot still doesn't look like the one you show! Did you change anything in your network? I don't know why.

jszhujun2010 commented 7 years ago

What's your data? I only used the 795 training images from the official dataset (as a result, overfitting is quite obvious). I found that their full training data is sampled from video, but I'm too lazy to download that large file... Another thing: I still cannot reproduce the Make3D experiment, even in training (the training loss does not decrease to an acceptable level). I guess there must be some trick in the data preprocessing phase, but I still cannot figure it out... I think Make3D is much more difficult to train on because its depth range is too large for the network to learn.
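
(One possibly relevant detail, offered as an assumption rather than the authors' recipe: Make3D's laser ground truth is unreliable at long range, and evaluation commonly uses the C1 criterion, which only considers regions with depth below 70m. Masking far depths so a masked loss skips them, as in this hypothetical sketch, may help:)

```python
import numpy as np

def mask_far_depth(depth, max_depth=70.0):
    """Mark unreliable far-range Make3D depths as invalid (zero), so they are
    skipped by a loss that treats zeros as missing, as with NYU raw data."""
    out = depth.copy()
    out[out > max_depth] = 0.0   # beyond the sensor's reliable range
    return out
```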

chrirupp commented 7 years ago

Looks like all problems are resolved. Please open another issue if there are open questions.

Lvhhhh commented 7 years ago

Can you share the training code?

buaaswf commented 6 years ago

Could you share the preprocessing code for the Make3D dataset? I cannot find how to augment the training dataset from 343 to 15k images. Thank you.