CUHK-AIM-Group / EndoGaussian

EndoGaussian: Real-time Gaussian Splatting for Dynamic Endoscopic Scene Reconstruction
https://yifliu3.github.io/EndoGaussian/
MIT License

How can we get ground truth depth related to camera poses? #3

Closed guanjunwu closed 7 months ago

guanjunwu commented 7 months ago

As far as I know, global ground-truth depth and poses are obtained either with deep-learning methods such as RCVD or with RGB-D cameras (though I am also unclear on how to get precise depth and poses that way). However, we cannot directly infer depth that corresponds to given camera poses from a dynamic monocular video. Do the authors have any other guidance?

yifliu3 commented 7 months ago

Hi, thanks for your attention. We use two datasets: ENDONERF and SCARED.

For ENDONERF, the authors use a da Vinci robot to capture poses and a stereo depth estimation network to predict depths, which serve as ground truth. More details of the dataset can be found in this repo: https://github.com/med-air/EndoNeRF. For SCARED, the authors also use a da Vinci robot to capture poses, and a projector (RGB-D) to obtain precise depth maps. Details of this dataset can be found here: https://endovissub2019-scared.grand-challenge.org/About/
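For context on the stereo route: a stereo network typically outputs a disparity map, which is converted to metric depth with the standard rectified-stereo relation depth = focal length × baseline / disparity. The sketch below only illustrates that relation; it is not taken from the EndoNeRF code, and the function name, parameter names, and example values are assumptions.

```python
import numpy as np

def disparity_to_depth(disparity, focal_px, baseline, min_disp=1e-6):
    """Convert a rectified stereo disparity map (in pixels) to depth.

    Depth comes out in the same unit as `baseline` (e.g. mm).
    Pixels with disparity <= min_disp are treated as invalid and set to 0.
    All names here are hypothetical, chosen for illustration only.
    """
    disparity = np.asarray(disparity, dtype=np.float64)
    depth = np.zeros_like(disparity)
    valid = disparity > min_disp
    # Standard pinhole stereo relation: Z = f * B / d
    depth[valid] = focal_px * baseline / disparity[valid]
    return depth

# Hypothetical values: 1000 px focal length, 5 mm baseline.
example_depth = disparity_to_depth(np.array([[10.0, 0.0]]), focal_px=1000.0, baseline=5.0)
```

Depth produced this way inherits the errors of the disparity prediction, so it is a pseudo ground truth rather than a direct measurement.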

To get ground-truth depth that corresponds to a given camera pose, I think the most accurate way is to place the camera sensor at that pose when capturing the data.
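When a depth map and its pose are captured together like this, the two can be tied together by back-projecting each pixel into world coordinates. The following is a minimal sketch under stated assumptions, not the datasets' own tooling: it assumes a pinhole intrinsic matrix `K`, z-depth (distance along the optical axis), and a 4x4 camera-to-world pose. The function name and the example values are hypothetical.

```python
import numpy as np

def depth_to_world_points(depth, K, cam_to_world):
    """Back-project a z-depth map into world-space 3D points.

    Assumptions (not from the original thread): pinhole intrinsics K (3x3),
    depth measured along the camera z axis, and a 4x4 camera-to-world pose.
    Returns an (H, W, 3) array of world coordinates.
    """
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    # Homogeneous pixel coordinates, one row per pixel.
    pix = np.stack([u, v, np.ones_like(u)], axis=-1).reshape(-1, 3).astype(np.float64)
    # Rays in the camera frame with unit z component.
    rays = pix @ np.linalg.inv(K).T
    pts_cam = rays * depth.reshape(-1, 1)
    # Apply the rigid camera-to-world transform.
    pts_h = np.concatenate([pts_cam, np.ones((pts_cam.shape[0], 1))], axis=1)
    pts_world = (pts_h @ cam_to_world.T)[:, :3]
    return pts_world.reshape(h, w, 3)

# Hypothetical example: 2x2 depth map, camera shifted 1 unit along world z.
K = np.array([[100.0, 0.0, 1.0],
              [0.0, 100.0, 1.0],
              [0.0, 0.0, 1.0]])
pose = np.eye(4)
pose[2, 3] = 1.0
points = depth_to_world_points(np.full((2, 2), 2.0), K, pose)
```

If a dataset stores world-to-camera poses instead, the matrix would need to be inverted before this step.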