apple / ml-neuman

Official repository of NeuMan: Neural Human Radiance Field from a Single Video (ECCV 2022)
Other
1.26k stars 141 forks source link

Preprocessing my video can't get the Sparse scene reconstrution output. #32

Open uulo opened 2 years ago

uulo commented 2 years ago
When I use the /preprocess/run.sh to preprocess my video, I can't use the colmap to get sparse scene reconstrution. I get the error  "Finding good initial image pair: No good initial image pair found" 
My videos are fixed camera or multiperson in a video. If the proprocess is only used for the camera moving with people or not?

a

jiangwei221 commented 2 years ago

If there is only 1 fixed camera, then COLMAP will fail. If there are multiple cameras, the COLMAP should be able to reconstruct the camera poses.

domattioli commented 1 year ago

How is the preprocessing altered when you have, for example, multiple cameras of the same scene? Do you input the video data from those cameras as one, sequential, continuous video?

jiangwei221 commented 1 year ago

The first step is to reconstruct the scene, you can use one image per camera to reconstruct the background scene. Once you obtained the camera poses and sparse scene point cloud, then you can segment the group plane, estimate SMPL, estimate the scale, etc. (Assuming all the cameras are synced)One thing to note is that you can use multiple views to optimize the SMPL estimates, that's out of the scope of this repo, you can have a look at: https://github.com/zju3dv/EasyMocap

domattioli commented 1 year ago

Excuse my naivete but does this mean that an input video of a scene filmed by two synced cameras must be organized as: Frame 1: camera/perspective 1; Frame 1: camera/perspective 2; ... ; Frame N: camera/perspective 1; Frame N: camera/perspective 2?

Or should it be Frames 1-N, camera/perspective 1; then Frames 1-N, camera/perspective 2? Or should the synced footage of both cameras be juxtaposed for each frame?

jiangwei221 commented 1 year ago

I think you can check the ZJU-Mocap dataset, this dataset uses multiple cameras.

domattioli commented 1 year ago

So are you saying that ml-neuman currently cannot handle input data for multiple cameras, and that one would need to integrate EasyMocap within ml-neuman to do that?