CUHK-AIM-Group / EndoGaussian

EndoGaussian: Real-time Gaussian Splatting for Dynamic Endoscopic Scene Reconstruction
https://yifliu3.github.io/EndoGaussian/
MIT License

depth supervision is not working #2

Closed ysjue closed 7 months ago

ysjue commented 7 months ago

Hi, thanks for making this work available! After setting up the environment following the README and experimenting with your code, I found that no matter how I changed the weight of the depth loss, the algorithm always reaches nearly the same photometric performance, even when the weight is 0. May I ask which branch of depth-diff-gaussian-rasterization you used in your code? Also, would you mind letting me know whether you customized anything in the original submodule? I tried your code with [depth-diff-gaussian-rasterization](https://github.com/ingra14m/depth-diff-gaussian-rasterization/tree/depth) to enable depth supervision, but it didn't work as expected. I look forward to your advice. Thanks!

ingra14m commented 7 months ago

Hi, I am the author of depth-diff-gaussian-rasterization. Currently, this repository uses my main branch, which means it only has the depth forward pass.

yifliu3 commented 7 months ago

Hi @ingra14m, thanks for your attention and for checking which branch this repo uses. I just found the problem thanks to @ysjue's reminder. In my current implementation, depth supervision is in fact not applied, and it seems that rendering constraints alone can already produce a rather promising result. We will switch to the depth branch at https://github.com/ingra14m/depth-diff-gaussian-rasterization/tree/depth later and check whether the results can be further improved.

ingra14m commented 7 months ago

Hi, from my experimental results, if 3D-GS can achieve good results through the render loss alone, then the role of the depth loss is negligible, and it may even lead to negative optimization. This is because the geometry of 3D-GS is not aligned with the real world. This is also one of the reasons why I released the depth backward pass.

yifliu3 commented 7 months ago

Thanks for your reminder @ingra14m. I replaced the original main branch with the depth branch, which supports the depth backward pass, and as we expected, the negative optimization problem appeared.

Considering that the relative depth of the rendered and real depth maps should be consistent, I normalized both kinds of depth maps by dividing each by its maximum value, and observed normal convergence.
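The normalization described above can be sketched roughly as follows. This is a minimal illustration and not the code in this repo: the function name `normalized_depth_l1`, the L1 form of the loss, and the `eps` guard are all assumptions, and depth maps are represented as flat lists of floats where a real pipeline would use tensors.

```python
def normalized_depth_l1(rendered, target, eps=1e-8):
    """Mean L1 difference between two depth maps after max-normalization.

    `rendered` and `target` are flat sequences of non-negative depth values
    with the same number of pixels (hypothetical inputs for illustration).
    """
    if len(rendered) != len(target):
        raise ValueError("depth maps must have the same number of pixels")
    if not rendered:
        raise ValueError("depth maps must not be empty")

    # Divide each map by its own maximum so only relative depth is compared.
    # `eps` (an assumption) avoids division by zero for an all-zero map.
    r_max = max(rendered) + eps
    t_max = max(target) + eps

    total = 0.0
    for r, t in zip(rendered, target):
        total += abs(r / r_max - t / t_max)
    return total / len(rendered)
```

With this form, two maps that differ only by a global scale factor give a loss close to zero, which is the property the normalization is meant to provide.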

After that, I ran experiments on the pulling clip of the ENDONERF dataset and found that training with and without the depth loss gives very similar results, 35.186 PSNR and 35.089 PSNR respectively, suggesting that depth supervision is in fact not that important. I suppose this is because depth information is already used during the Gaussian initialization stage, so extra depth supervision may bring only marginal or no improvement.
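For context on how small that gap is: PSNR is defined as 10·log10(MAX² / MSE), so the two figures differ by roughly 0.1 dB. A minimal sketch of the metric, assuming images are flat lists of floats in the range [0, `max_value`] (the function name and input layout are illustrative, not taken from this repo):

```python
import math


def psnr(rendered, reference, max_value=1.0):
    """Peak signal-to-noise ratio in dB between two equally sized images."""
    if len(rendered) != len(reference) or not rendered:
        raise ValueError("images must be non-empty and the same size")

    # Mean squared error over all pixels.
    mse = sum((a - b) ** 2 for a, b in zip(rendered, reference)) / len(rendered)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)
```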

I have updated the code in this repo. Thanks for your great question @ysjue.

ysjue commented 7 months ago

Thanks for your quick reply and help! @ingra14m @yifliu3 I observed that the geometric fidelity (depth) of GS is not as good as its photometric performance (i.e., PSNR, SSIM) for static views like EndoNeRF. If nothing went wrong, my implementation led to a depth RMSE of around 6mm, which might be a potential limitation of GS, especially in clinical applications.

yifliu3 commented 7 months ago

Thanks for your work on the depth evaluation, which could inspire future work. I also implemented RMSE depth evaluation code and noticed several factors that hinder depth optimization:

  1. Some depth values, like 65535, are beyond the normal range, so we use a clip function to regularize the gt depth.
  2. Some depth maps are missing for the ENDONERF dataset, so we omit those blank depth maps during optimization. The missing depth maps might be caused by an incorrect download of the data. I will check it later.
  3. Hyperparameter tuning, especially of the depth loss weight.
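The first two points, together with the RMSE evaluation, can be sketched as below. This is a hedged illustration rather than the code in this repo: the helper names, the `max_depth` threshold, and the rule that a "blank" map is one containing only zeros are all assumptions, and depth maps are flat lists of floats for simplicity.

```python
import math


def clip_depth(depth, max_depth):
    """Clamp gt depth into [0, max_depth] to suppress outliers such as 65535."""
    return [min(max(d, 0.0), max_depth) for d in depth]


def is_blank(depth):
    """Treat a map with no non-zero value as missing (an assumption)."""
    return all(d == 0 for d in depth)


def depth_rmse(rendered_maps, gt_maps, max_depth):
    """RMSE over all pixels of all frames that have a usable gt depth map."""
    sq_sum = 0.0
    count = 0
    for rendered, gt in zip(rendered_maps, gt_maps):
        if is_blank(gt):
            continue  # skip frames whose gt depth is missing
        gt = clip_depth(gt, max_depth)
        for r, g in zip(rendered, gt):
            sq_sum += (r - g) ** 2
            count += 1
    if count == 0:
        raise ValueError("no valid depth maps to evaluate")
    return math.sqrt(sq_sum / count)
```

Clipping keeps outlier pixels in the evaluation at a capped value; masking them out instead is an alternative design that would change the reported RMSE.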

After the changes above, we get a depth RMSE of around 2.78mm for the pulling clip of ENDONERF, as shown below:

image

Yet this result may still not be satisfactory for some applications, so we will explore how to improve the surface modeling in future work, perhaps by imposing more surface constraints, such as the SDF field in EndoSurf.

I have updated the depth evaluation code in the repo. Thanks for your great observations.