gengshan-y / viser

ViSER: Video-Specific Surface Embeddings for Articulated 3D Shape Reconstruction. NeurIPS 2021.
https://viser-shape.github.io/
Apache License 2.0

Videos without static camera #6

Closed Iven-Wu closed 2 years ago

Iven-Wu commented 2 years ago

Hi, I'm trying to use Viser or LASR on videos in which the camera may rotate around the object, but the results look as if the camera were assumed to be fixed. Have you ever tried videos where the camera has large motions?

gengshan-y commented 2 years ago

LASR/Viser treat camera motion as object root body motion, so there should be no difference.

LASR has a toy example (spot), where the camera rotates around the object and the rotation angle between consecutive frames is 120 deg.
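To illustrate why an orbiting camera and a rotating object are interchangeable, here is a minimal sketch in plain Python. It is not code from LASR/Viser; the helper names (`rot_y`, `camera_as_root_motion`, and so on) and the orbit radius are hypothetical, chosen only for illustration. It checks that a point seen by a camera with rotation `R_c` and center `C` has the same camera-frame coordinates as the same point moved by a root transform `R_o = R_c^T`, `T_o = -R_c^T C` in front of a fixed camera.

```python
import math

def rot_y(deg):
    """3x3 rotation about the Y axis by `deg` degrees."""
    t = math.radians(deg)
    c, s = math.cos(t), math.sin(t)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]

def transpose(m):
    return [list(row) for row in zip(*m)]

def matvec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def view_moving_camera(x_world, cam_rot, cam_center):
    """Static object, moving camera: X_cam = R_c^T (X_world - C)."""
    diff = [a - b for a, b in zip(x_world, cam_center)]
    return matvec(transpose(cam_rot), diff)

def view_moving_object(x_obj, root_rot, root_trans):
    """Fixed camera, moving object: X_cam = R_o X_obj + T_o."""
    return [a + b for a, b in zip(matvec(root_rot, x_obj), root_trans)]

def camera_as_root_motion(cam_rot, cam_center):
    """Re-express a camera pose as an equivalent object root body pose."""
    root_rot = transpose(cam_rot)
    root_trans = [-a for a in matvec(root_rot, cam_center)]
    return root_rot, root_trans

# Hypothetical setup: camera orbits the origin at radius 3, rotated by 120 deg.
R_c = rot_y(120.0)
C = matvec(R_c, [0.0, 0.0, -3.0])
R_o, T_o = camera_as_root_motion(R_c, C)

point = [0.3, -0.2, 0.5]
seen_a = view_moving_camera(point, R_c, C)
seen_b = view_moving_object(point, R_o, T_o)
```

Both `seen_a` and `seen_b` describe the same camera-frame point, so an optimizer that only estimates a per-frame root pose can absorb the camera motion.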

For the data you are using, it is possible that either there are too many frames, or the flow estimation is not good enough. I can provide more suggestions if you give more info.

Iven-Wu commented 2 years ago

There are about 180 frames in each video. The objects in the videos are moving while the camera rotates around them irregularly. Moreover, in some cases the animals are only partially observed, as in the elephant demo on the homepage. For flow estimation, I use the robust_vcn model from the preprocessing steps in LASR.

gengshan-y commented 2 years ago

Directly optimizing >30 frames is difficult if there are no initial root/camera poses. To optimize >30 frames, we did it in two stages: the first stage optimizes a subset of <30 frames and learns a pose CNN/embedding, and the second stage uses the pose CNN/embedding to initialize the rest of the frames.
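As a rough sketch of how a long video could be split for such a two-stage schedule, the snippet below picks an evenly spaced subset for the first stage and leaves the remaining frames for the second. The even spacing, the function name `split_stages`, and the default budget of 29 frames are assumptions made for illustration; the thread does not say how the subset is actually chosen.

```python
import math

def split_stages(num_frames, budget=29):
    """Split frame indices into a stage-one subset and a stage-two remainder.

    Assumption: the stage-one subset is evenly spaced and holds at most
    `budget` frames (29 by default, i.e. fewer than 30).
    """
    if num_frames <= 0 or budget <= 0:
        raise ValueError("num_frames and budget must be positive")
    # A stride of ceil(n / budget) guarantees at most `budget` selected frames.
    stride = math.ceil(num_frames / budget)
    first = list(range(0, num_frames, stride))
    chosen = set(first)
    rest = [i for i in range(num_frames) if i not in chosen]
    return first, rest

# Example with the 180-frame videos mentioned in this thread.
stage_one, stage_two = split_stages(180)
```

Stage one would be optimized from scratch on `stage_one`; the frames in `stage_two` would then be initialized from the learned pose CNN/embedding, as described above.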

For animals and humans, you may want to check out our latest work, banmo, which is more robust.

> According to the code, the z-axis seems not to be the up-axis.

We use the OpenCV convention: X right, Y down, Z pointing out of the camera (positive depth). See https://docs.opencv.org/3.4/d9/d0c/group__calib3d.html
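For cameras exported from a tool that uses the OpenGL-style frame (X right, Y up, camera looking down negative Z), a minimal sketch of the conversion into the OpenCV frame is to negate Y and Z, followed by the usual pinhole projection. The function names and the intrinsics below are hypothetical and only serve as an example; this is not code from LASR/Viser.

```python
def opengl_to_opencv(point):
    """Convert a camera-frame point from X-right/Y-up/Z-backward
    to the OpenCV frame X-right/Y-down/Z-forward."""
    x, y, z = point
    return [x, -y, -z]

def project_pinhole(point_cv, fx, fy, cx, cy):
    """Project an OpenCV-frame point to pixel coordinates (u, v)."""
    x, y, z = point_cv
    if z <= 0:
        raise ValueError("point must be in front of the camera (z > 0)")
    return (fx * x / z + cx, fy * y / z + cy)

# A point up and to the right, 2 units in front of an OpenGL-style camera.
p_gl = [0.5, 1.0, -2.0]
p_cv = opengl_to_opencv(p_gl)

# Hypothetical intrinsics for a 100x100 image.
u, v = project_pinhole(p_cv, fx=100.0, fy=100.0, cx=50.0, cy=50.0)
```

Because Y points down in this convention, a point above the optical axis lands in the upper half of the image (smaller `v`). A projected mask that appears flipped vertically or falls behind the camera is a common sign that the axes were not converted.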

Iven-Wu commented 2 years ago

Thanks! That helps a lot.

Iven-Wu commented 2 years ago

I've tried multiple videos with only 30 frames, but the optimization still fails in the first epoch. I also rotated the ground-truth camera into the OpenCV coordinate frame, but it does not project an accurate mask as expected. Is any other modification applied to the mesh?

gengshan-y commented 2 years ago

Perhaps you can attach some failure cases here, or send them directly to me?

Iven-Wu commented 2 years ago

Sorry, it turned out to be my mistake. Thanks a lot!

Iven-Wu commented 1 year ago

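This is an automatic vacation reply from QQ Mail. Hello, I am currently on vacation and cannot reply to your email personally. I will reply as soon as possible after my vacation ends.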