fabiotosi92 / monoResMatch-Tensorflow

Tensorflow implementation of monocular Residual Matching (monoResMatch) network.

train monoResmatch #1

Closed manuel-88 closed 5 years ago

manuel-88 commented 5 years ago

Hi, thanks for uploading the code. Should training work without proxy labels, in a self-supervised manner? That is, should I be able to run the training without the --data_path_proxy [path_cityscapes_proxy] argument? BR

fabiotosi92 commented 5 years ago

Hi, in the current version of monoResMatch you need proxy labels in order to train the network. If you don't have them, you should edit the code accordingly.

manuel-88 commented 5 years ago

Are these proxy labels available for download somewhere?

fabiotosi92 commented 5 years ago

Since proxy labels require a large amount of space, they are not available for download. However, you can use this code (https://github.com/ivankreso/stereo-vision/tree/master/reconstruction/base/rSGM) to generate them as 16-bit images in a very short time.

manuel-88 commented 5 years ago

Thank you for the information

manuel-88 commented 5 years ago

I ran the SGM code you suggested. The output of the tool is an 8-bit image. How do I convert it to 16 bit, and what is the advantage?

fabiotosi92 commented 5 years ago

Hi, we use 16-bit images because proxy labels (or, in general, ground-truth depth data) can encode sub-pixel information. You can save disparities using OpenCV as follows:

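#include <opencv2/opencv.hpp>

...

// Wrap the 8-bit image buffer (no copy), then widen it to 16 bit.
Mat disp = Mat(img.getHeight(), img.getWidth(), CV_8UC1, (uchar*)img.getData());
disp.convertTo(disp, CV_16UC1);
// Scale by 256 so that the stored value is disparity * 256.
disp *= 256.0;
imwrite(output_img, disp);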
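To illustrate what the factor of 256 buys, here is a minimal sketch of the per-pixel encoding using only the C++ standard library. The helper names `encodeDisparity` and `decodeDisparity` are hypothetical and do not come from monoResMatch or rSGM. The sketch assumes the KITTI-style convention in which the stored 16-bit value is the disparity times 256 and 0 marks an invalid pixel.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

// Hypothetical helper: store a disparity (in pixels) as a 16-bit value
// scaled by 256, giving a resolution of 1/256 pixel.
// Assumption: 0 is reserved for "invalid", following the KITTI-style convention.
std::uint16_t encodeDisparity(float disparity) {
    if (!(disparity > 0.0f)) {
        return 0;  // non-positive or NaN input is treated as invalid
    }
    const float scaled = std::round(disparity * 256.0f);
    // Disparities of 256 pixels or more do not fit in 16 bits; saturate instead of wrapping.
    return static_cast<std::uint16_t>(std::min(scaled, 65535.0f));
}

// Hypothetical helper: recover the disparity in pixels from the stored value.
float decodeDisparity(std::uint16_t stored) {
    return static_cast<float>(stored) / 256.0f;
}
```

With this scheme a sub-pixel disparity such as 12.5 is stored as 3200 and decodes back to 12.5, whereas an 8-bit image could only hold 12 or 13.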

manuel-88 commented 5 years ago

Thanks for the code. But don't you lose the sub-pixel accuracy when casting to 8 bit? And why do you multiply the values by 256? Then all disparities greater than 256 reach the 16-bit limit.

fabiotosi92 commented 5 years ago

The rSGM code does not provide sub-pixel information, but it could easily be extended to do so (e.g., by implementing parabola interpolation). However, in accordance with the KITTI dataset, we maintain the standard disparity range [0..256], encoding such information as 16-bit PNG images.
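For readers unfamiliar with the parabola interpolation mentioned above, the following is a minimal sketch and not the rSGM implementation. It assumes a per-pixel vector of matching costs indexed by integer disparity, where lower cost is better; the function name `refineDisparityParabola` is hypothetical. It fits a parabola through the costs at d-1, d and d+1 and returns the position of its minimum.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: refine the integer disparity d to sub-pixel precision
// by fitting a parabola through the costs at d-1, d and d+1.
// Assumption: costs[i] is the matching cost at disparity i (lower is better).
float refineDisparityParabola(const std::vector<float>& costs, std::size_t d) {
    const float base = static_cast<float>(d);
    // Both neighbours are needed; at the borders keep the integer value.
    if (d == 0 || d + 1 >= costs.size()) {
        return base;
    }
    const float cPrev = costs[d - 1];
    const float cCurr = costs[d];
    const float cNext = costs[d + 1];
    // Twice the curvature of the fitted parabola; it must be positive
    // for the vertex to be a minimum.
    const float denom = cPrev - 2.0f * cCurr + cNext;
    if (denom <= 0.0f) {
        return base;
    }
    // Vertex offset relative to d; it lies in [-0.5, 0.5] when d is the discrete minimum.
    return base + (cPrev - cNext) / (2.0f * denom);
}
```

A refined value such as 2.25 would then be written to the 16-bit image as 2.25 * 256 = 576, which is where the extra bit depth becomes useful.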

manuel-88 commented 5 years ago

ok, I understand now. Thank you.