pablovela5620 / monoprior

29 stars 0 forks

Some questions about relative depth to pcd? #1

Open shuiyued opened 1 month ago

shuiyued commented 1 month ago

Hello, this library is very nice. I would like to know how you reconstruct the 3D point cloud from relative depth (such as the output of DepthAnything) without distortion.

pablovela5620 commented 1 month ago

I basically have to convert the disparity to depth by making a few assumptions: I guess at a focal length (https://github.com/pablovela5620/monoprior/blob/2c6c8e026a1fa6659cbe1b9ea8ad07ebf08abdfe/monopriors/depth_utils.py#L6) and a baseline, and then use https://github.com/pablovela5620/monoprior/blob/2c6c8e026a1fa6659cbe1b9ea8ad07ebf08abdfe/monopriors/depth_utils.py#L22

Once I have converted disparity to depth, I use this function to convert it into a 3D point cloud: https://github.com/pablovela5620/monoprior/blob/2c6c8e026a1fa6659cbe1b9ea8ad07ebf08abdfe/monopriors/depth_utils.py#L62
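For readers who do not want to follow the links, here is a minimal sketch of the two steps described above. It is not the code from `depth_utils.py`: the function names, the default baseline of 1.0, and the `eps` clamp are assumptions made for illustration. It uses the standard stereo relation depth = focal * baseline / disparity, followed by a pinhole back-projection.

```python
import numpy as np


def disparity_to_depth(disparity, focal_px, baseline=1.0, eps=1e-6):
    """Convert a disparity map to depth with depth = focal * baseline / disparity.

    focal_px and baseline are guesses (hypothetical defaults), so the
    resulting depth has no metric scale.
    """
    disparity = np.asarray(disparity, dtype=np.float64)
    # Clamp tiny disparities to avoid division by zero for far-away pixels.
    return focal_px * baseline / np.maximum(disparity, eps)


def depth_to_points(depth, fx, fy, cx, cy):
    """Back-project an (H, W) depth map into an (H*W, 3) point cloud.

    Assumes a pinhole camera with focal lengths fx, fy and principal
    point (cx, cy), all in pixels.
    """
    h, w = depth.shape
    # u indexes columns (x direction), v indexes rows (y direction).
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return np.stack([x, y, depth], axis=-1).reshape(-1, 3)


# Toy usage on a 2x2 disparity map.
disparity = np.full((2, 2), 2.0)
depth = disparity_to_depth(disparity, focal_px=4.0)
points = depth_to_points(depth, fx=4.0, fy=4.0, cx=0.5, cy=0.5)
```

Because the focal length and baseline are guessed, a wrong focal length stretches or squashes the cloud along the viewing direction, which is the distortion asked about in the question.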

shuiyued commented 1 month ago

Thanks for your reply! I tested some images and found that a field of view (FOV) of 55 degrees works reasonably well in most cases.

pablovela5620 commented 1 month ago

Yeah, it's definitely not foolproof; there will be images where the assumed 55 degree FOV will not work. You can try something like dust3r to estimate the camera intrinsics, and having accurate camera intrinsics will give you a much better point cloud. There's also the fact that the relative depth models will still produce bad depth maps for certain images.
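To make the FOV assumption concrete, here is a small sketch of how a focal length in pixels follows from an assumed field of view under a pinhole model. The helper name `focal_from_fov` is hypothetical, and whether the 55 degrees refers to the horizontal, vertical, or diagonal FOV is an assumption; the sketch treats it as the FOV along the image width.

```python
import math


def focal_from_fov(size_px, fov_deg):
    """Focal length in pixels for a pinhole camera.

    size_px is the image extent (here: width) along which fov_deg is
    measured. Derived from tan(fov / 2) = (size / 2) / focal.
    """
    return 0.5 * size_px / math.tan(math.radians(fov_deg) / 2.0)


width = 640  # example image width in pixels (assumed)
for fov in (45, 55, 65):
    # A wider assumed FOV gives a shorter focal length.
    print(fov, focal_from_fov(width, fov))
```

The focal length changes noticeably across this range, so an image taken with a lens far from the assumed 55 degrees will back-project to a visibly distorted cloud.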

shuiyued commented 4 weeks ago

Yes, but isn't dust3r designed for multiple images? The depthanything and metric3d models are depth models designed for single images. To be honest, I think there is little room to improve the relative depth that depthanythingv2 produces from a single image.

shuiyued commented 4 weeks ago

I misunderstood what you meant. dust3r attempts to estimate the camera intrinsics, which results in a better point cloud. I suspect that estimating camera intrinsics from a single image will be unreliable for a metric camera.

pablovela5620 commented 1 week ago

Right, those models don't take the camera intrinsics into account, whereas something like Unidepth or Dust3r estimates the camera intrinsics, leading to better results.