cleinc / bts

From Big to Small: Multi-Scale Local Planar Guidance for Monocular Depth Estimation
GNU General Public License v3.0

Images From Same Scene With Different Field of View #22

Open MACILLAS opened 4 years ago

MACILLAS commented 4 years ago

Hello, I've been doing some testing with your eigen model and I ran into an issue that I hope you could help me with.

I have two images of the same scene with different fields of view, and I've calculated their respective focal lengths in pixels:

90 degrees FOV, focal length = 620.5 pixels (image: GOPRO022_1095_90FOV_CROP)
120 degrees FOV, focal length = 385.25 pixels (image: GOPRO022_1095_120FOV_CROP)
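For reference, here is a minimal sketch of how a focal length in pixels can be derived from a horizontal field of view. It assumes an ideal pinhole camera with no lens distortion, and it assumes an image width of 1241 pixels (taken from the 375x1241 size mentioned later in this thread). The function name `focal_length_px` is made up for illustration. Under these assumptions, 90 degrees gives 620.5 pixels, which matches the value above, while 120 degrees gives roughly 358.25 pixels rather than the 385.25 quoted, so that value may come from a different width or calculation.

```python
import math


def focal_length_px(image_width_px: float, hfov_deg: float) -> float:
    """Pinhole focal length in pixels from the horizontal field of view.

    Assumes an ideal pinhole camera (no fisheye or lens distortion).
    """
    half_fov_rad = math.radians(hfov_deg) / 2.0
    return (image_width_px / 2.0) / math.tan(half_fov_rad)


# Assumption: image width of 1241 px, as mentioned later in the thread.
WIDTH_PX = 1241

f90 = focal_length_px(WIDTH_PX, 90.0)
f120 = focal_length_px(WIDTH_PX, 120.0)

print(f"90 deg FOV:  {f90:.2f} px")
print(f"120 deg FOV: {f120:.2f} px")
```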

I only made the following modification to the configurations...

1095_90FOV_CROP.jpg None 620.5
1095_120FOV_CROP.jpg None 385.25

I sampled the resulting depth maps and found the following depths... (images: 90FOV, 120FOV)

Shouldn't these two depths be the same?

Additionally, I changed the focal lengths to an arbitrary number (e.g. 100) and the depth values remained unchanged.

Is the focal length not used in testing? What am I missing?

Thanks in Advance!

cogaplex-bts commented 4 years ago

Thanks for sharing this interesting issue. We use the focal length in the "patch-wise" pixel coordinate normalization in Equation (1). The resulting depth estimate should therefore vary with the given focal length, but the difference can be minimal.

One interesting point in your test is that your images have different fields of view according to their focal lengths. With a shorter focal length, the scene appears to be further away. For example, your two focal lengths are 620.5 and 385.25, and their ratio is 1.61. Your depth estimates are 8.45 and 13.49, and their inverse ratio is 1.60. So we can conclude that the model predicts values consistent with its training setting (images in the KITTI dataset typically have a focal length of around 720).

Finally, if you want depth estimates for your own images using our KITTI Eigen model, you have two options: manipulate your images so that their focal length is around 720, or scale your depth estimates by the ratio between your own focal length and 720.
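The second option can be sketched as follows. This is a rough illustration, not code from the bts repository: the helper name `rescale_depth` is made up, 720 is only the approximate KITTI focal length quoted above, and the scaling direction (multiply the prediction by own focal length / 720) is inferred from the observation that a shorter focal length produced a larger predicted depth. It also assumes that 8.45 belongs to the 90 degree image and 13.49 to the 120 degree image.

```python
# Approximate focal length (in pixels) of the KITTI training images,
# as quoted in the comment above. Treat this as an assumption.
TRAIN_FOCAL_PX = 720.0


def rescale_depth(pred_depth, focal_px, train_focal_px=TRAIN_FOCAL_PX):
    """Scale a predicted depth by the ratio of focal lengths.

    Works on a single float; the same arithmetic also applies
    element-wise if pred_depth is a NumPy array holding a depth map.
    """
    return pred_depth * (focal_px / train_focal_px)


# Depth samples and focal lengths reported in this thread.
depth_90 = rescale_depth(8.45, 620.5)
depth_120 = rescale_depth(13.49, 385.25)

print(f"90 deg FOV, rescaled:  {depth_90:.2f}")
print(f"120 deg FOV, rescaled: {depth_120:.2f}")
```

With these numbers, the two rescaled values land within about 1% of each other (roughly 7.28 and 7.22), which is consistent with the ratio argument above.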

MACILLAS commented 4 years ago

Thank you very much for reviewing my issue! I agree with your assessment. I will scale my depth estimation for the time being.

amrbenattia commented 4 years ago

@MACILLAS I wonder how you tested on this size, 375x1241? It is different from the Eigen size. I believe that the width and height should be multiples of 32; please correct me if I'm wrong. Thanks in advance.

MACILLAS commented 4 years ago

Hey @amrbenattia, my understanding is that the images are automatically formatted to 352x1216... I did try feeding the network a 2400x2400 image one time and the output depth map was still 352x1216, but the depth prediction was really bad (maybe it has something to do with the aspect ratio).
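To make the size discussion concrete, here is a small sketch that computes a 352x1216 crop window for a given input size. The exact rule bts uses to reach 352x1216 is not confirmed in this thread; the bottom-aligned, horizontally centered crop below is an assumption, and `crop_box` is a made-up helper name. Note that 352 = 11 x 32 and 1216 = 38 x 32, so the target size does satisfy the multiple-of-32 expectation raised above.

```python
def crop_box(height, width, crop_h=352, crop_w=1216):
    """Return (top, left, bottom, right) of a crop_h x crop_w window.

    Assumption: the window is aligned to the bottom edge and centered
    horizontally. This is illustrative, not confirmed bts behavior.
    """
    if height < crop_h or width < crop_w:
        raise ValueError("image is smaller than the requested crop")
    top = height - crop_h
    left = (width - crop_w) // 2
    return top, left, top + crop_h, left + crop_w


# The 375x1241 input size discussed in this thread.
box = crop_box(375, 1241)
print(box)

# Both target dimensions are multiples of 32.
print(352 % 32 == 0, 1216 % 32 == 0)
```

For a 2400x2400 input, the same window would cover only a small part of the frame, so resizing to a comparable aspect ratio first may be worth trying.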

shkr commented 3 years ago

@MACILLAS @cogaplex-bts Any update on which input image size to feed the network?