CUHK-AIM-Group / EndoGaussian

EndoGaussian: Real-time Gaussian Splatting for Dynamic Endoscopic Scene Reconstruction
https://yifliu3.github.io/EndoGaussian/
MIT License

How can we get ground truth depth related to camera poses? #3

Closed guanjunwu closed 10 months ago

guanjunwu commented 10 months ago

As far as I know, global ground-truth depth and poses are obtained either with deep-learning methods such as RCVD or with RGB-D cameras (though I am also not clear on how those yield precise depth and pose). However, we cannot directly infer depth from a dynamic monocular video for given camera poses. Do the authors have any other guidance?

yifliu3 commented 10 months ago

Hi, thanks for your attention. We use two datasets: EndoNeRF and SCARED.

For EndoNeRF, the authors use a da Vinci robot to capture poses and a stereo depth estimation network to predict depths, which serve as ground truth. More details on the dataset can be found in this repo: https://github.com/med-air/EndoNeRF. For SCARED, they also use a da Vinci robot to capture poses, and a projector (RGB-D) to get precise depth maps. Details of this dataset can be found here: https://endovissub2019-scared.grand-challenge.org/About/
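
As a rough illustration of the stereo route (a generic sketch, not code from either dataset): once a stereo network has predicted a disparity map on a rectified image pair, depth follows from the standard relation depth = focal length × baseline / disparity. The function name `disparity_to_depth`, its parameters, and the numbers below are hypothetical and chosen only for the example.

```python
def disparity_to_depth(disparity, focal_px, baseline_mm, min_disparity=1e-6):
    """Convert a rectified-stereo disparity map (pixels) to depth (mm).

    disparity   : 2D list of disparities in pixels
    focal_px    : focal length in pixels (assumed known from calibration)
    baseline_mm : distance between the two camera centers in mm
    Pixels with (near-)zero disparity have no valid depth and are set to 0.0.
    """
    depth = []
    for row in disparity:
        depth.append([
            focal_px * baseline_mm / d if d > min_disparity else 0.0
            for d in row
        ])
    return depth


# Made-up values purely for illustration.
disparity = [[50.0, 25.0], [0.0, 100.0]]
depth = disparity_to_depth(disparity, focal_px=1000.0, baseline_mm=5.0)
print(depth)  # [[100.0, 200.0], [0.0, 50.0]]
```

Because the depth is computed in the frame of the camera that captured the images, it is tied to the pose the camera had at capture time.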

To get ground-truth depth for a given camera pose, I think the most accurate way is to place the camera sensor at that pose when capturing the data.