udaykusupati / Normal-Assisted-Stereo

[CVPR 2020] Normal Assisted Stereo Depth Estimation
https://udaykusupati.github.io/NAS
MIT License
111 stars, 20 forks

Confusing conversion between disparity and depth. #4

Closed: MurrayC7 closed this issue 4 years ago

MurrayC7 commented 4 years ago

Hello, thanks for making this excellent work public! While using the code, I got confused about the conversion between disparity and depth.

  1. For sceneflow, the disparity is first converted to depth in data_loader.py. The depth is then corrected in train_sflow.py to account for the different focal lengths, and this is where I get confused. Could you please help me understand the meaning of (args.mindepth*factor)*(args.nlabel*3)/disp, or of mindepth and nlabel? A similar conversion appears in MVDNet.py. I only know the equation depth = focal_length * baseline / disparity, where the focal_length is 450 or 1050 units from the intrinsics and the baseline is 1.0 Blender units according to the sceneflow website, but mindepth is 5.45 and nlabel is 64.

  2. For KITTI, the disparity only seems to be divided by 256.0 in data_loader.py, and I can't find any other conversion to depth. Could you please explain this difference?

Thank you!

Best Wishes

udaykusupati commented 4 years ago

For sceneflow, the dataset contains instances with two different focal lengths, 450 and 1050. Because we compare our work with GANet, we set the max disparity to 192 but use only 64 equally spaced levels (due to memory constraints). This gives mindepth = f/192. Since different instances have different focal lengths (the majority having 1050), we set f = 1050. At training time, when an instance with f = 450 arrives, mindepth has to be adjusted according to the intrinsics, which is what factor takes care of. Finally, args.nlabel*3 is just 192.
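
To make the arithmetic concrete, here is a minimal sketch of the conversion, not the code from the repository. It assumes that factor is the ratio of the instance focal length to the reference focal length 1050 (mirroring the KITTI expression later in this thread) and that the baseline is 1.0 Blender units, as stated in the question. The names disp_to_depth, F_REF, and scale are hypothetical. Under these assumptions, the expression reduces to the familiar depth = focal_length * baseline / disparity.

```python
# Sketch only: constants come from the thread, function and names are hypothetical.
MAX_DISP = 192    # maximum disparity, chosen to match GANet
NLABEL = 64       # number of depth levels, so NLABEL * 3 == MAX_DISP
F_REF = 1050.0    # reference focal length used to define mindepth
BASELINE = 1.0    # sceneflow baseline in Blender units (from the question)

# mindepth = f / 192 with f = 1050; this is about 5.47, close to the 5.45 quoted above.
MINDEPTH = F_REF * BASELINE / MAX_DISP


def disp_to_depth(disp, focal, mindepth=MINDEPTH, nlabel=NLABEL, scale=1.0):
    """Convert a disparity to depth for an instance with focal length `focal`.

    `factor` rescales mindepth from the reference focal length to the
    instance's focal length (assumed form: (1 / scale) * focal / F_REF).
    """
    factor = (1.0 / scale) * focal / F_REF
    return (mindepth * factor) * (nlabel * 3) / disp


# For both focal lengths the result equals focal * BASELINE / disp.
depth_450 = disp_to_depth(10.0, 450.0)
depth_1050 = disp_to_depth(10.0, 1050.0)
```

With mindepth set to exactly f/192, the reference focal length and the 192 cancel, so factor is what carries the per-instance focal length into the result.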

MurrayC7 commented 4 years ago

Thanks for the clear explanation! For KITTI, I wonder what constant you use. By the way, KITTI instances also have different intrinsics, but the intrinsics don't seem to be used the way they are for sceneflow. Do you mean that the depths are converted before the data is loaded?

udaykusupati commented 4 years ago

Sorry, KITTI does have different intrinsics for different instances; I stand corrected. The train file should be modified in the same way as for sceneflow, with the factor taking care of the appropriate conversion. For both KITTI datasets, I set mindepth = 2.029 and use

factor = (1.0/args.scale)*intrinsics_var[:,0,0]/721.5377

I didn't include KITTI training in the code released so as to not increase confusion, but these changes should help you train KITTI.
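
As a rough illustration of how these KITTI values could fit together, the sketch below reuses the same expression as for sceneflow. It is an assumption-laden example, not the unreleased training code: the helper names are hypothetical, the focal length 707.0912 is only an example value, and the observation that 2.029 is close to 721.5377 * 0.54 / 192 assumes a stereo baseline of about 0.54 m, which this thread does not state.

```python
# Sketch only: constants 2.029 and 721.5377 come from the thread; the rest is assumed.
KITTI_MINDEPTH = 2.029   # mindepth quoted for both KITTI datasets
KITTI_F_REF = 721.5377   # reference focal length in the factor expression
NLABEL = 64              # number of depth levels, NLABEL * 3 == 192


def kitti_factor(fx, scale=1.0):
    """Per-instance factor; fx stands in for intrinsics_var[:, 0, 0]."""
    return (1.0 / scale) * fx / KITTI_F_REF


def kitti_disp_to_depth(disp, fx, scale=1.0):
    """Same conversion as for sceneflow, with the KITTI constants."""
    return (KITTI_MINDEPTH * kitti_factor(fx, scale)) * (NLABEL * 3) / disp


# An instance at the reference focal length leaves mindepth unchanged.
factor_ref = kitti_factor(721.5377)
# An instance with a smaller focal length (example value) gets a smaller factor.
factor_other = kitti_factor(707.0912)
# At the maximum disparity of 192 the depth equals the adjusted mindepth.
depth_at_max_disp = kitti_disp_to_depth(192.0, 721.5377)
```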

MurrayC7 commented 4 years ago

Okay, I see. Thanks a million for the clarification!