Closed: NIRVANALAN closed this issue 5 months ago
By the way, what's the range of your camera radius? If it varies, will that affect reconstruction training performance?
Hi, thanks for sharing this great dataset. I am trying to train a NeRF reconstruction model on it and wonder how to load the cameras. I used the EG3D ShapeNet-SRN dataset convention and found that the reconstruction training is not converging. I guess it is due to the c2w convention?
It is possible. You can try:

```python
import numpy as np
import torch

def convert_pose(C2W):
    # Flip the y and z axes of the camera-to-world matrix to switch
    # between c2w conventions (e.g., OpenGL/Blender-style vs. OpenCV-style)
    flip_yz = np.eye(4)
    flip_yz[1, 1] = -1
    flip_yz[2, 2] = -1
    C2W = np.matmul(C2W, flip_yz)
    return torch.from_numpy(C2W)

c2w = convert_pose(c2w)
```
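If the convention flip is applied correctly, applying it twice should return the original pose, and the camera position should be unchanged. A quick sanity check on a toy matrix (my own example, kept in NumPy only for brevity):

```python
import numpy as np

def flip_yz_pose(C2W):
    # Same y/z-axis flip as convert_pose above, without the torch conversion
    flip_yz = np.eye(4)
    flip_yz[1, 1] = -1
    flip_yz[2, 2] = -1
    return np.matmul(C2W, flip_yz)

# Toy c2w: identity rotation, camera translated to z = 2
c2w = np.eye(4)
c2w[2, 3] = 2.0

assert np.allclose(flip_yz_pose(flip_yz_pose(c2w)), c2w)   # the flip is an involution
assert np.allclose(flip_yz_pose(c2w)[:3, 3], c2w[:3, 3])   # translation column is unchanged
```

Because the flip matrix multiplies on the right, only the rotation columns are negated; the camera center in world space stays where it was.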
Also, I wonder how the objects are normalized. Do they all stay within the [-1, 1] cube? If so, we could use a 3D bbox for ray sampling.
Radius = 1, and the objects all stay within [-0.5, 0.5].
Thanks for the quick response, let me give it a try!
Hi, I checked and confirmed that all objects stay within [-0.45, 0.45], as noted in the camera JSON file. However, the camera trajectory radius seems to differ between objects: over the 40 views, the first 27 views share one radius and the remaining 13 share another (screenshot attached), and these two radii also differ across objects. I guess this is not an issue since the bbox is already provided: we can do ray casting within the [-0.45, 0.45] bbox and drop the ray_start/ray_end convention here.
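For the bbox-based ray casting mentioned above, per-ray near/far bounds can be computed with the standard slab-method ray/AABB intersection. A minimal sketch (the function name and the 0.45 bound are taken from this thread; everything else is my own illustration, not dataset code):

```python
import numpy as np

def ray_aabb_near_far(origins, dirs, bound=0.45, eps=1e-9):
    """Slab-method intersection of rays with the cube [-bound, bound]^3.

    origins, dirs: (N, 3) arrays of ray origins and directions.
    Returns per-ray (near, far) distances; rays that miss give near > far.
    """
    # Avoid division by zero for axis-aligned rays
    inv_d = 1.0 / np.where(np.abs(dirs) < eps, eps, dirs)
    t0 = (-bound - origins) * inv_d
    t1 = (bound - origins) * inv_d
    near = np.minimum(t0, t1).max(axis=1)  # latest entry across the three slabs
    far = np.maximum(t0, t1).min(axis=1)   # earliest exit across the three slabs
    return near, far

# Example: a camera at radius 1.8 looking straight at the origin
origin = np.array([[0.0, 0.0, 1.8]])
direction = np.array([[0.0, 0.0, -1.0]])
near, far = ray_aabb_near_far(origin, direction)
# near = 1.35 (enters the box at z = 0.45), far = 2.25 (exits at z = -0.45)
```

Since the camera radius varies per object, these per-ray bounds replace any fixed ray_start/ray_end constants.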
Thanks XD and will let you know if I have more questions!
@NIRVANALAN Thanks.
This is because we rendered each object twice: the first pass renders the first 27 images (elevation angle spanning 5 to 30 degrees), and the second pass renders the remaining 13 images (elevation angle spanning -5 to 5 degrees). For each pass, the camera radius is randomly sampled from [1.5, 2.0], so the first 27 images share one radius and the remaining 13 share another.
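A sketch of how such a two-pass trajectory could be generated. This is my own reconstruction of the described setup, not the actual rendering script; the function name and azimuth spacing are assumptions:

```python
import numpy as np

def camera_positions(n_views, elev_range_deg, radius, seed=0):
    # Place cameras on a sphere of fixed radius, with evenly spaced
    # azimuths and elevations sampled uniformly from the given range
    rng = np.random.default_rng(seed)
    azimuths = np.linspace(0.0, 360.0, n_views, endpoint=False)
    elevations = rng.uniform(*elev_range_deg, size=n_views)
    az, el = np.radians(azimuths), np.radians(elevations)
    x = radius * np.cos(el) * np.cos(az)
    y = radius * np.cos(el) * np.sin(az)
    z = radius * np.sin(el)
    return np.stack([x, y, z], axis=1)

rng = np.random.default_rng(42)
r1 = rng.uniform(1.5, 2.0)            # radius for the first pass, shared by its 27 views
r2 = rng.uniform(1.5, 2.0)            # radius for the second pass, shared by its 13 views
pass1 = camera_positions(27, (5, 30), r1)
pass2 = camera_positions(13, (-5, 5), r2)
# All 27 positions in pass1 have norm r1; all 13 in pass2 have norm r2
```

This reproduces the observed pattern: a constant radius within each pass, but generally different radii between the two passes and across objects.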
Thanks XD again for your careful clarification! I have resolved the problem and the NeRF reconstruction trains well on your data. Closing this issue for now.