SJoJoK / 3DGStream

[CVPR 2024 Highlight] Official repository for the paper "3DGStream: On-the-fly Training of 3D Gaussians for Efficient Streaming of Photo-Realistic Free-Viewpoint Videos".
https://sjojok.github.io/3dgstream
MIT License
331 stars 22 forks

How to get and export a 3d dynamic scene from multiple videos? #2

Closed by adrida 7 months ago

adrida commented 7 months ago

Hi,

Thank you for the great work. I was wondering whether your approach can take several videos of the same scene and generate a 3D version of that dynamic scene in real time. The idea would be to pass in a few videos of the same scene and get (almost in real time) a Blender/Unity/Unreal dynamic asset that could also be replayed if desired.

If not, do you know what terms I should search for when digging into the literature?

Thanks in advance for your help!

SJoJoK commented 7 months ago

Hello, thanks for your interest in our work! Our method can "pass as input a few videos from the same scene and get a dynamic asset", but with limitations:

  1. The videos must be recorded simultaneously.
  2. The reconstruction speed is not real-time; it requires seconds of per-frame training/optimization/reconstruction rather than the milliseconds required for real-time applications.
  3. The output cannot be used directly in DCC tools or game engines. However, engineering effort could make this possible, since there are already some plugins for 3DGS in Blender/Unity/Unreal.
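To make the gap in point 2 concrete, here is a rough back-of-the-envelope sketch. The numbers are assumptions chosen for illustration only (a 30 FPS capture and a hypothetical 10 seconds of training per frame); they are not measurements from this repository, and the function names are made up for this example.

```python
def realtime_slowdown(train_seconds_per_frame: float, capture_fps: float) -> float:
    """How many times slower than real time the reconstruction runs.

    Real time means one frame must be finished within 1 / capture_fps seconds,
    so the slowdown is the training time divided by that per-frame budget.
    """
    return train_seconds_per_frame * capture_fps


def total_processing_seconds(
    clip_seconds: float, train_seconds_per_frame: float, capture_fps: float
) -> float:
    """Total time needed to reconstruct every frame of a clip, one after another."""
    frame_count = clip_seconds * capture_fps
    return frame_count * train_seconds_per_frame


if __name__ == "__main__":
    # Assumed values for illustration only, not figures from the paper or repo.
    fps = 30.0
    seconds_per_frame = 10.0

    print(f"Per-frame budget for real time: {1000.0 / fps:.1f} ms")
    print(f"Slowdown vs. real time: {realtime_slowdown(seconds_per_frame, fps):.0f}x")
    print(
        "Time to process a 10 s clip: "
        f"{total_processing_seconds(10.0, seconds_per_frame, fps):.0f} s"
    )
```

Under these assumed numbers, the per-frame budget is about 33 ms, so per-frame training measured in seconds is hundreds of times too slow for real-time reconstruction.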

As for your question: to the best of my knowledge, there are no real-time reconstruction methods (with pure RGB inputs in a simple capture setting) even for complex, real-world static scenes/objects. So I'm afraid that no method or technique, at least in academia, can meet your needs (i.e., real-time reconstruction of a real-world dynamic scene from video input), since "real-time" is a very tight requirement. But if you want to reconstruct a simple 4D/dynamic scene (e.g., a few humans) in a complex capture setting, there are a lot of techniques (e.g., MoCap).

You can look for keywords like "dynamic reconstruction", "dynamic scene reconstruction" and "4D reconstruction" when digging into the literature. BTW, I just found some literature that may be related to your requirements:

  1. Dou, Mingsong, et al. "Fusion4D: Real-time performance capture of challenging scenes." ACM Transactions on Graphics (TOG) 35.4 (2016): 1-13.
  2. Ingale, Anupama K. "Real-time 3D reconstruction techniques applied in dynamic scenes: A systematic literature review." Computer Science Review 39 (2021): 100338.

Hope this helps.

adrida commented 7 months ago

Thank you very much for your thorough answer and all the references; I will have a closer look.

Based on your expertise, do you think the leap to real-time reconstruction will require a significant breakthrough, and could it realistically be achieved with enough funding and resources put into R&D?

SJoJoK commented 7 months ago

This is a trade-off between quality, speed and computing power. For example, you may be able to get a very low-quality static/dynamic asset in real time with an ultra-high-end GPU. So yes, I believe the problem can be addressed with enough resources put into R&D.

adrida commented 7 months ago

I see, thanks a lot for your help!

SJoJoK commented 7 months ago

Glad to help :)