Open JINNMnm opened 6 months ago
Agreed. With this augmentation, the instantnerf model is very hard to get to converge...
I have the same question. Maybe the random noise is this line in InstantMesh/src/model.py:
cameras = cameras + torch.rand_like(cameras) * 0.04 - 0.02
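For what it's worth, since torch.rand_like samples uniformly from [0, 1), that line adds independent uniform noise in [-0.02, 0.02) to every camera parameter. Here is a minimal stand-alone sketch of the same arithmetic in plain Python. The helper name `jitter` and the example values are made up for illustration and are not from the repo:

```python
import random

def jitter(values, scale=0.04, rng=None):
    """Add uniform noise in [-scale/2, scale/2) to each value.

    Element-wise equivalent of `x + rand * 0.04 - 0.02`,
    where rand is uniform in [0, 1).
    """
    rng = rng or random.Random()
    return [v + rng.random() * scale - scale / 2 for v in values]

# Hypothetical camera parameters, only to show the size of the perturbation.
cams = [0.0, 1.0, -0.5]
noisy = jitter(cams, rng=random.Random(0))
max_shift = max(abs(n - c) for n, c in zip(noisy, cams))
print(max_shift <= 0.02)
```

So the perturbation is small and bounded, which is a very different scale from a full-circle rotation.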
Should we set camera_rotation to false at the start of training? Have you trained the model successfully?
I noticed that the paper mentions: "Considering that the multi-view images generated by Zero123++ may be inconsistent with their pre-defined camera poses, we also add random noise to the camera parameters before feeding them into the ViT image encoder." When I checked the code here: https://github.com/TencentARC/InstantMesh/blob/34c193cc96eebd46deb7c48a76613753ad777122/src/data/objaverse.py#L195 it samples a random angle in the range (0, 2*pi) and rotates about the z axis. That range seems too large to me. I'm not sure whether it is appropriate, so could you confirm it?
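To make the question concrete, here is a minimal sketch of what a z-axis rotation by an angle drawn from [0, 2*pi) does to a camera position. This is not the code at the linked line. The function `rotate_z` and the camera position are assumptions for illustration only:

```python
import math
import random

def rotate_z(point, angle):
    """Rotate a 3D point about the z axis by `angle` radians."""
    x, y, z = point
    c, s = math.cos(angle), math.sin(angle)
    return (c * x - s * y, s * x + c * y, z)

def norm(p):
    """Euclidean distance of a point from the origin."""
    return math.sqrt(sum(v * v for v in p))

rng = random.Random(0)
angle = rng.uniform(0.0, 2.0 * math.pi)  # anywhere on the full circle

cam = (1.5, 0.0, 0.8)  # hypothetical camera position
rotated = rotate_z(cam, angle)

# Height and distance to the origin are unchanged; only the azimuth moves.
print(rotated[2] == cam[2], abs(norm(rotated) - norm(cam)) < 1e-9)
```

With this range the azimuth can end up anywhere on the circle, while the camera's height and distance to the origin stay fixed. Whether that is too large depends on whether the rotation is meant as a small pose perturbation or as a change of the whole reference frame, which is what I would like to have confirmed.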