CVLAB-Unibo / Real-time-self-adaptive-deep-stereo

Code for "Real-time self-adaptive deep stereo" - CVPR 2019 (ORAL)
Apache License 2.0

Objects close to the camera and illumination saturation #38

Closed: YJonmo closed this issue 4 years ago

YJonmo commented 4 years ago

Thanks again for this work.

My first question: is there any parameter I could adjust to see objects that are very close to the camera? In SGM you can change the maximum disparity to handle close objects.
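
For reference, the equivalent knob in OpenCV's SGBM implementation looks like this (a sketch with hypothetical file names, unrelated to this repo's code):

```python
import cv2

# Minimal SGM reference (OpenCV's SGBM, not this repo's code): the search
# range is numDisparities, which must be a multiple of 16. Raising it lets
# the matcher resolve objects closer to the camera.
left = cv2.imread("left.png", cv2.IMREAD_GRAYSCALE)    # hypothetical inputs
right = cv2.imread("right.png", cv2.IMREAD_GRAYSCALE)

sgm = cv2.StereoSGBM_create(minDisparity=0, numDisparities=128, blockSize=5)
disparity = sgm.compute(left, right).astype("float32") / 16.0  # fixed-point to px
```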

I noticed that objects illuminated close to saturation tend to get a wrong depth estimate. This happens even when the object is not yet saturated but is close to it. Below are two images of the same spherical object under two illumination conditions. In the upper row, where the object is close to saturation, the depth appears flat, whereas in the lower row, with no saturation, the depth appears correct. Is there any way to deal with this besides keeping the illumination fixed?

[Screenshot from 2020-01-09 13-06-14]

AlessioTonioni commented 4 years ago

As for the maximum disparity, unfortunately it is a hyperparameter of the models, so to change it you will need to retrain both of them.

As for the performance on saturated images, it is expected: if an image patch is saturated, it is very hard to find the correct match between the two views, since every pixel has the same intensity value. Unfortunately, starting from a saturated image it is very hard to recover the original unsaturated one.
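
A toy example makes the ambiguity concrete (sizes and values are illustrative):

```python
import numpy as np

# Over a saturated region every candidate disparity produces the same
# (zero) SAD matching cost, so there is no unique minimum and the
# estimated depth is arbitrary.
left_patch = np.full((5, 5), 255.0)    # saturated 5x5 patch in the left image
right_strip = np.full((5, 30), 255.0)  # saturated scanline strip in the right image

costs = [np.abs(left_patch - right_strip[:, d:d + 5]).sum() for d in range(20)]
print(costs)  # all 0.0 -> the matcher cannot pick a disparity
```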

YJonmo commented 4 years ago

Thanks, but could you tell me which hyperparameter it is, so that I could retrain? I am asking because I work with endoscopic images, where objects can be very close to the cameras.

AlessioTonioni commented 4 years ago

For DispNet it is quite easy: you can change the value of this constant https://github.com/CVLAB-Unibo/Real-time-self-adaptive-deep-stereo/blob/924cb3aa414483240f080b869b9dda79dc71b4ff/Nets/DispNet.py#L7, which is the maximum disparity at a quarter resolution (i.e. this DispNet computes a cost volume with a maximum disparity of 160 at full resolution). If you plan to increase it, keep in mind that memory usage will slightly increase as well.
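
A minimal sketch of that relation, assuming the constant currently sits at 40 (consistent with the 160 full-resolution figure above):

```python
# The constant name matches Nets/DispNet.py; the factor of 4 follows from
# the quarter-resolution cost volume described above.
MAX_DISP = 40          # assumed current value, at quarter resolution
FULL_RES_FACTOR = 4    # features are downsampled by 4

print(MAX_DISP * FULL_RES_FACTOR)  # -> 160 px at full resolution

# e.g. MAX_DISP = 64 would give a 256 px range at full resolution,
# with a proportionally larger cost volume in memory.
```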

For MADNet it is trickier, as the network builds multiple cost volumes at different resolutions. In general, this parameter https://github.com/CVLAB-Unibo/Real-time-self-adaptive-deep-stereo/blob/924cb3aa414483240f080b869b9dda79dc71b4ff/Nets/MadNet.py#L46 controls the size of the search window used at the different resolutions to compute feature correlations. By increasing it, the network should handle high disparity values better, but again this comes at the cost of a higher memory footprint.
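
A hedged sketch of the idea, assuming a per-level radius of 2 and powers-of-two downsampling; the actual radius and pyramid levels in MadNet.py may differ:

```python
# With a fixed per-level search radius, the full-resolution disparity
# range covered by the correlation grows with the downsampling factor of
# each pyramid level. Radius value and level numbers here are assumptions
# for illustration, not read from MadNet.py.
SEARCH_RANGE = 2                      # assumed per-level correlation radius

for level in range(2, 7):             # hypothetical pyramid levels
    scale = 2 ** level                # downsampling factor at this level
    print(f"level {level}: covers ~{SEARCH_RANGE * scale} px at full resolution")
```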

YJonmo commented 4 years ago

Thanks for the info. I was guessing that MAX_DISP was the one to change, but I was playing with the MAX_DISP in Stereo_Online_Adaptation.py and Train.py.

I still don't know how changing the MAX_DISP in those two scripts will affect the result.

AlessioTonioni commented 4 years ago

The MAX_DISP in Train.py is used to clip the ground-truth disparity to that value during training. The MAX_DISP in Stereo_Online_Adaptation.py is used only to normalize the predictions before saving them to disk.
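
A minimal sketch of the two roles (variable names are illustrative, not the repository's actual code):

```python
import numpy as np

MAX_DISP = 256  # assumed value for illustration

# Train.py role: clip ground-truth disparities before computing the loss.
gt = np.array([50.0, 300.0, 120.0])
gt_clipped = np.clip(gt, 0, MAX_DISP)                      # -> [ 50. 256. 120.]

# Stereo_Online_Adaptation.py role: normalize predictions only for saving;
# the network's output itself is unchanged.
pred = np.array([50.0, 300.0, 120.0])
pred_to_save = np.clip(pred / MAX_DISP, 0.0, 1.0) * 255.0  # 8-bit visualization
```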