TencentARC / InstantMesh

InstantMesh: Efficient 3D Mesh Generation from a Single Image with Sparse-view Large Reconstruction Models
Apache License 2.0

Camera Augmentation in the released training code #88

Open JINNMnm opened 6 months ago

JINNMnm commented 6 months ago

I noticed that the paper mentions: "Considering that the multi-view images generated by Zero123++ may be inconsistent with their pre-defined camera poses, we also add random noise to the camera parameters before feeding them into the ViT image encoder." When I checked the code here: https://github.com/TencentARC/InstantMesh/blob/34c193cc96eebd46deb7c48a76613753ad777122/src/data/objaverse.py#L195 it draws a random angle in the range (0, 2*pi) and rotates about the z axis. That range seems too large to me. I'm not sure whether it is appropriate, so could you confirm?
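To make the concern concrete, here is a minimal sketch of a z-axis rotation with a random angle. It is not the repository's code: `rotate_about_z`, `sample_angle`, the `max_angle` parameter, and the camera position are all made up for illustration. It only shows how far a full (0, 2*pi) range can move a camera compared with a range of a few degrees.

```python
import math
import random


def rotate_about_z(point, angle):
    """Rotate a 3D point (x, y, z) about the z axis by `angle` radians."""
    x, y, z = point
    c, s = math.cos(angle), math.sin(angle)
    return (c * x - s * y, s * x + c * y, z)


def sample_angle(max_angle=2 * math.pi, rng=random):
    """Draw a rotation angle uniformly between 0 and max_angle (hypothetical helper)."""
    return rng.uniform(0.0, max_angle)


# Made-up camera position, 2 units from the z axis at height 1.
camera_position = (2.0, 0.0, 1.0)

# Full range, as described in the question: the camera can land anywhere
# on the circle around the z axis.
full = rotate_about_z(camera_position, sample_angle())

# Narrow range (assumption: 5 degrees), closer to what "noise" suggests.
small = rotate_about_z(camera_position, sample_angle(max_angle=math.radians(5)))

print("full-range rotation :", full)
print("small-range rotation:", small)
```

In both cases the height and the distance to the z axis are unchanged; only the azimuth differs, and with the full range it is effectively arbitrary.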

HaFred commented 3 months ago

Agreed. With this augmentation, the instantnerf is very hard to get to converge...

pupiljia commented 1 month ago

I have the same question. Maybe the random noise is the one in InstantMesh/src/model.py, namely cameras = cameras + torch.rand_like(cameras) * 0.04 - 0.02, and we should set camera_rotation to false at the start of training? Have you trained the model successfully?
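For scale, the quoted expression adds independent uniform noise of at most 0.02 in magnitude to every camera parameter, since `torch.rand_like` samples from [0, 1). A plain-Python sketch of the same arithmetic follows; `jitter` and the sample values are hypothetical and only mirror the formula, not the surrounding model code.

```python
import random


def jitter(values, scale=0.02, rng=random):
    """Add independent uniform noise in [-scale, scale) to each value.

    Mirrors `x + rand * 0.04 - 0.02` when scale is 0.02, with rand in [0, 1).
    """
    return [v + rng.random() * 2 * scale - scale for v in values]


# Made-up flattened camera parameters, for illustration only.
cameras = [1.0, 0.0, 0.0, 2.5]
noisy = jitter(cameras)

for before, after in zip(cameras, noisy):
    print(f"{before:+.4f} -> {after:+.4f} (delta {after - before:+.4f})")
```

Every delta stays within 0.02 of the original value, which is a far smaller perturbation than a rotation by an arbitrary angle.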