Nightmare-n / DepthAnyVideo

Depth Any Video with Scalable Synthetic Data
https://depthanyvideo.github.io
Apache License 2.0

disparity or depth #7

Closed: Xiang-cd closed this issue 1 month ago

Xiang-cd commented 1 month ago

Thank you for your great work! Looking at the inference code, the output latent is ultimately decoded into disparity space, but the paper says: "Specifically, given a video depth xd, we first apply a normalization as in Ke et al. (2024) to ensure that depth values fall primarily within the VAE's input range of [−1, 1]". So which is actually used, depth or disparity?
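
For context, the normalization referenced in the question maps each depth (or disparity) map affinely so that most values land in the VAE's input range. Below is a minimal sketch of that percentile-based normalization; the 2%/98% percentiles and the function name are illustrative assumptions based on Ke et al. (2024), not this repository's exact code.

import numpy as np

def normalize_for_vae(x, low_pct=2.0, high_pct=98.0, eps=1e-6):
    # Robust range via percentiles, so outliers do not dominate the scaling.
    lo, hi = np.percentile(x, [low_pct, high_pct])
    # Affine map: most values end up roughly in [0, 1], then in [-1, 1].
    x01 = (x - lo) / max(hi - lo, eps)
    return x01 * 2.0 - 1.0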

Nightmare-n commented 1 month ago

Thank you for your question! In our implementation, we actually use disparity instead of depth.

nekoshadow1 commented 1 month ago

Thank you for your question! In our implementation, we actually use disparity instead of depth.

I know that the relationship between disparity (d), depth (Z), focal length (f), and baseline distance (B) is given by the formula: Z = fB/d

f: Focal length of the camera
B: Baseline distance between the two cameras
d: Disparity

Therefore, is it correct to convert the disparity map to a depth map by running the following code? Thank you in advance!

import cv2
import numpy as np

# The disparity map is saved as a 3-channel image, so read it as single-channel (grayscale).
d = cv2.imread(PATH_TO_DISPARITY_IMAGE, 0).astype(np.float32)
depth = f * B / np.maximum(d, 1e-6)  # clamp disparity to avoid division by zero

Nightmare-n commented 1 month ago

A reference for converting disparity to depth can be found here.
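
The link above is not reproduced in this thread. As a general illustration only (not necessarily the method the authors link to), relative disparity predicted by a monocular model is commonly converted to depth by first aligning scale and shift against sparse ground-truth depth in disparity space via least squares (MiDaS-style), then inverting. The function name, mask, and ground-truth input below are assumptions for the sketch.

import numpy as np

def disparity_to_depth(rel_disp, gt_depth, mask, eps=1e-6):
    # Express ground truth in disparity space (inverse depth) at valid pixels.
    gt_disp = 1.0 / np.maximum(gt_depth[mask], eps)
    # Solve scale * rel_disp + shift ~= gt_disp by least squares.
    x = rel_disp[mask]
    A = np.stack([x, np.ones_like(x)], axis=1)
    scale, shift = np.linalg.lstsq(A, gt_disp, rcond=None)[0]
    aligned_disp = scale * rel_disp + shift
    # Invert back to depth, clamping to avoid division by zero.
    return 1.0 / np.maximum(aligned_disp, eps)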