aipixel / GPS-Gaussian

[CVPR 2024 Highlight] The official repo for “GPS-Gaussian: Generalizable Pixel-wise 3D Gaussian Splatting for Real-time Human Novel View Synthesis”
https://shunyuanzheng.github.io/GPS-Gaussian
MIT License

About the stability of the generated novel view video pictures #50

Open shangshang98 opened 1 month ago

shangshang98 commented 1 month ago

Thanks for your excellent work. I have a problem: when I generate a real-time novel view video, the edges of the person and their clothes shake, even when the person is standing still. I guess this is because the depth estimation is unstable. How can I reduce this edge shaking? Thanks for your help!

ShunyuanZheng commented 1 month ago

Hi, can you upload some novel view video results? Unstable depth estimation does not noticeably affect the edges, especially in the full-body setup. I think the quality of the matting is a more important factor.
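One way to check whether matting is the culprit is to measure how much the foreground masks change between consecutive frames of a static subject. This is a generic diagnostic sketch, not part of the GPS-Gaussian codebase; the function names and the IoU-based jitter metric are my own illustration:

```python
import numpy as np

def mask_iou(a: np.ndarray, b: np.ndarray) -> float:
    """Intersection-over-union of two binary masks."""
    inter = np.logical_and(a, b).sum()
    union = np.logical_or(a, b).sum()
    return float(inter) / float(union) if union else 1.0

def temporal_mask_jitter(masks):
    """Mean IoU between consecutive frames.

    For a still subject, values close to 1.0 indicate stable matting;
    noticeably lower values point to flickering mask edges.
    """
    ious = [mask_iou(m0, m1) for m0, m1 in zip(masks, masks[1:])]
    return float(np.mean(ious))

# Toy check: an identical mask over 3 frames is perfectly stable.
static = np.zeros((8, 8), dtype=bool)
static[2:6, 2:6] = True
print(temporal_mask_jitter([static, static, static]))  # → 1.0
```

If this score is high while the rendered edges still shake, the instability likely originates downstream of matting (e.g. in the depth or Gaussian prediction) rather than in the masks themselves.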

shangshang98 commented 1 month ago

Based on your answer I inspected the mask videos and found that the mask edges were not noticeably shaking, yet the edges of the person and clothes still shook in the rendered views. When I increase the spacing between the two cameras, the shaking gets worse. How can this phenomenon be explained? Is it because the depth estimate is unstable at the edges of the human? And how can I solve this problem? Thanks for your help!

ShunyuanZheng commented 1 month ago

Did you train another model when you increased the baseline of the source cameras? The training data should match the real-world test data. I think the depth estimation is robust under the 16-camera setting (22.5° between adjacent cameras). However, under an 8-camera setting, either the depth estimation or the rendered novel views may be unstable. I am still confused, since the instability I mentioned would appear not only at the edges but across the whole body. Also, it will be easier to determine the cause if you can provide some video results.
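If retraining for the wider baseline is not an option, one common stopgap for a static subject is temporal smoothing of the per-frame depth maps before rendering. The sketch below is a generic exponential-moving-average filter, not something from the GPS-Gaussian repo; the function name and `alpha` parameter are my own illustration, and this trades edge flicker for lag on fast motion:

```python
import numpy as np

def ema_depth_smoothing(depth_frames, alpha=0.6):
    """Exponential moving average over a sequence of depth maps.

    alpha weights the current frame; a lower alpha means stronger
    temporal smoothing (less flicker, more lag on moving subjects).
    """
    smoothed = []
    running = None
    for d in depth_frames:
        d = d.astype(np.float64)
        running = d if running is None else alpha * d + (1 - alpha) * running
        smoothed.append(running.copy())
    return smoothed

# Sanity check: a constant depth sequence passes through unchanged.
frames = [np.full((4, 4), 2.0) for _ in range(3)]
out = ema_depth_smoothing(frames)
print(np.allclose(out[-1], 2.0))  # → True
```

Filtering like this only masks the symptom; if the instability comes from a train/test baseline mismatch, retraining on data with the matching camera spacing is the more principled fix.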