MoyGcc / vid2avatar

Vid2Avatar: 3D Avatar Reconstruction from Videos in the Wild via Self-supervised Scene Decomposition (CVPR2023)
https://moygcc.github.io/vid2avatar/

Poor results on PeopleSnapshot dataset #61

Closed: janehwu closed this issue 4 months ago

janehwu commented 10 months ago

Hi!

I'm trying to run your method on the PeopleSnapshot dataset. Since the cameras in that dataset are fixed at the origin with no rotation (while the person moves), I had to modify your sphere intersection code, which assumes the scene is centered at the origin. I also skipped the camera parameter normalization (since there is no rotation/translation) and set sdf_bounding_sphere=1.0 everywhere.
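For context, the sphere-intersection change I made is roughly along these lines. This is a simplified numpy sketch rather than the repo's actual code; the `sphere_intersections` name and the `scene_center` argument are just my own illustration of the offset:

    import numpy as np

    def sphere_intersections(ray_origins, ray_dirs, radius=1.0, scene_center=None):
        # Near/far distances where unit-length rays hit a bounding sphere.
        # The original assumption is that the sphere is centered at the origin;
        # passing scene_center lets it sit wherever the subject actually is
        # when the cameras are left un-normalized.
        if scene_center is None:
            scene_center = np.zeros(3)
        oc = ray_origins - scene_center                # ray origins relative to the sphere center
        b = np.sum(oc * ray_dirs, axis=-1)             # assumes ray_dirs are unit length
        c = np.sum(oc * oc, axis=-1) - radius ** 2
        disc = np.clip(b ** 2 - c, 0.0, None)          # rays that miss are clamped to a tangent hit
        sqrt_disc = np.sqrt(disc)
        return -b - sqrt_disc, -b + sqrt_disc          # (near, far) distances along each ray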

Would you be able to help me debug the results I'm seeing? It looks like the camera and geometry are in the correct locations (otherwise I wouldn't see anything in the render), but there's definitely something I'm doing wrong. Thank you!

The predicted mesh for 1 frame after 100 epochs looks like:

The predicted (foreground) render, mask, and normals are:

MoyGcc commented 10 months ago

Hi,

I quickly tried vid2avatar on the same sequence you showed, and it looks fine to me (we also tested on PeopleSnapshot before). The attached screenshot shows the canonical mesh after training for 50 epochs.

You can simply follow the preprocessing code to process the data (including camera normalization). If you want to keep the camera position fixed, try replacing https://github.com/MoyGcc/vid2avatar/blob/main/preprocessing/preprocessing.py#L230 with the code segment below. That way the normalization shift is computed only from the first frame, and the same shift is applied to all subsequent frames. That should work.

            if idx == 0:
                # Compute the normalization shift from the first frame only;
                # the variable persists across loop iterations, so every later
                # frame reuses the same shift.
                v_max = smpl_verts.max(axis=0)
                v_min = smpl_verts.min(axis=0)
                normalize_shift = -(v_max + v_min) / 2.
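To make the intended loop structure concrete, here is a small self-contained sketch of what the change amounts to (illustrative names such as `all_smpl_verts`, not the actual preprocessing loop):

    import numpy as np

    normalize_shift = None
    for idx, smpl_verts in enumerate(all_smpl_verts):    # one (N, 3) vertex array per frame
        if idx == 0:
            # Center the first frame's subject at the origin. Later frames
            # reuse this shift, so the camera stays fixed while the person
            # is still free to move within the normalized scene.
            v_max = smpl_verts.max(axis=0)
            v_min = smpl_verts.min(axis=0)
            normalize_shift = -(v_max + v_min) / 2.
        shifted_verts = smpl_verts + normalize_shift      # same constant shift for every frame

The preprocessing script applies the corresponding normalization to the cameras as well, so the projections stay consistent.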

[Screenshot from 2023-11-28: canonical mesh after 50 epochs of training]