About the monocular video data set with the camera not moving

ingra14m / Deformable-3D-Gaussians

[CVPR 2024] Official implementation of "Deformable 3D Gaussians for High-Fidelity Monocular Dynamic Scene Reconstruction"

https://ingra14m.github.io/Deformable-Gaussians/

MIT License


About the monocular video data set with the camera not moving #36

Open haloann666 opened 7 months ago

haloann666 commented 7 months ago

Hello! IPER is a monocular video dataset in which the camera faces the person and does not move. How should I process this dataset so that it can be run with this project? Thank you very much!

ingra14m commented 7 months ago

Hi, thanks for your interest.

I think neither our approach nor vanilla 3D Gaussian Splatting can handle this type of dataset. By "monocular" we mean that only one viewpoint is available at any given time, but the capture as a whole still needs to be multi-view. A camera that stays fixed for the entire capture implies single-view learning, and 3D-GS cannot handle few-shot learning tasks.
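As a rough illustration of that distinction (a sketch, not code from this repository), one way to tell whether a capture is effectively single-view is to measure how far the camera center moves across frames. The helper names, the nested-list 4x4 camera-to-world pose format, and the `min_baseline` threshold below are all assumptions made for the example. A sensible threshold depends on the scale of the scene.

```python
import math
from itertools import combinations


def camera_centers(c2w_matrices):
    """Return the camera center (translation column) of each 4x4 camera-to-world matrix."""
    return [(m[0][3], m[1][3], m[2][3]) for m in c2w_matrices]


def max_baseline(centers):
    """Largest Euclidean distance between any two camera centers."""
    if len(centers) < 2:
        return 0.0
    return max(math.dist(a, b) for a, b in combinations(centers, 2))


def is_effectively_single_view(c2w_matrices, min_baseline=1e-3):
    """True when the camera barely translates, i.e. the frames share one viewpoint.

    min_baseline is a hypothetical threshold in scene units.
    """
    return max_baseline(camera_centers(c2w_matrices)) < min_baseline


def pose(tx, ty, tz):
    """Hypothetical helper: identity rotation with the given translation."""
    return [[1, 0, 0, tx], [0, 1, 0, ty], [0, 0, 1, tz], [0, 0, 0, 1]]


# A fixed camera (as in IPER) versus a camera that slides along the x axis.
fixed_camera = [pose(0.0, 0.0, 0.0) for _ in range(5)]
moving_camera = [pose(0.1 * i, 0.0, 0.0) for i in range(5)]

print(is_effectively_single_view(fixed_camera))
print(is_effectively_single_view(moving_camera))
```

Only translation is checked here because a camera that rotates in place still observes the scene from a single center and provides no parallax.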

DingChunQ commented 2 months ago

Hello, is there any work that can process such videos?

Or, to relax the requirements a little: can we obtain a dynamic 3DGS from a monocular video with a fixed viewpoint that represents the 3D structure near the camera's viewpoint, without worrying about regions the camera does not cover well?

ingra14m commented 2 months ago

Hi @wangyancongAC, I believe the fixed-viewpoint monocular video you mention is a much more difficult setting. It is a strictly monocular dataset, on which the model tends to overfit to the given viewpoint and fail to model the dynamic scene properly.

For this case, I think you can draw some ideas and insights from DyCheck and Shape-of-Motion.

DingChunQ commented 2 months ago

Thank you very much!