modelscope / richdreamer

Live Demo: https://modelscope.cn/studios/Damo_XR_Lab/3D_AIGC
https://aigc3d.github.io/richdreamer/
Apache License 2.0

How to run NeRF reconstruction with the given code #12

Closed: NIRVANALAN closed this issue 5 months ago

NIRVANALAN commented 5 months ago

Hi, thanks for sharing this great dataset. I am trying to train a NeRF reconstruction model on this dataset and am wondering how to load the cameras. I used the EG3D ShapeNet-SRN dataset convention and found that the reconstruction training does not converge. I guess it is due to the c2w convention?

NIRVANALAN commented 5 months ago

By the way, what is the range of your camera radius? If it varies, will this affect reconstruction training performance?

NIRVANALAN commented 5 months ago

Also, I wonder how the objects are normalized: do they all stay in the [-1, 1] cube? In that case we could use the 3D bbox for ray sampling.

gxd1994 commented 5 months ago

Hi, thanks for sharing this great dataset. I am trying to train a NeRF reconstruction model on this dataset and am wondering how to load the cameras. I used the EG3D ShapeNet-SRN dataset convention and found that the reconstruction training does not converge. I guess it is due to the c2w convention?

It is possible. You can try:

import numpy as np
import torch

def convert_pose(C2W):
    # Flip the y and z camera axes, i.e. switch between the
    # OpenGL/Blender and OpenCV camera-to-world conventions.
    flip_yz = np.eye(4)
    flip_yz[1, 1] = -1
    flip_yz[2, 2] = -1
    C2W = np.matmul(C2W, flip_yz)
    return torch.from_numpy(C2W)

c2w = convert_pose(c2w)
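
For reference, here is a minimal sketch of how the converted pose could be used to cast per-pixel rays, assuming the converted c2w follows the OpenCV convention (x right, y down, z forward) and a pinhole camera; the focal lengths fx, fy and the function name are placeholders, not values from the dataset:

import numpy as np
import torch

def get_rays(c2w, H, W, fx, fy):
    # Pixel directions in camera space (OpenCV: x right, y down, z forward),
    # with the principal point assumed at the image center.
    i, j = np.meshgrid(np.arange(W, dtype=np.float32),
                       np.arange(H, dtype=np.float32), indexing="xy")
    dirs = np.stack([(i - W * 0.5) / fx,
                     (j - H * 0.5) / fy,
                     np.ones_like(i)], axis=-1)                # (H, W, 3)
    if torch.is_tensor(c2w):
        c2w = c2w.numpy()
    rays_d = dirs @ c2w[:3, :3].T                              # rotate into world space
    rays_d /= np.linalg.norm(rays_d, axis=-1, keepdims=True)
    rays_o = np.broadcast_to(c2w[:3, 3], rays_d.shape)         # camera center per pixel
    return rays_o, rays_d
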
gxd1994 commented 5 months ago

Also, I wonder how the objects are normalized: do they all stay in the [-1, 1] cube? In that case we could use the 3D bbox for ray sampling.

Radius = 1, and the objects all stay in [-0.5, 0.5].

NIRVANALAN commented 5 months ago

Thanks for the quick response, let me give it a try!

NIRVANALAN commented 5 months ago

Also, I wonder how the objects are normalized: do they all stay in the [-1, 1] cube? In that case we could use the 3D bbox for ray sampling.

Radius = 1, and the objects all stay in [-0.5, 0.5].

Hi, I had a check and confirmed that all the objects stay in [-0.45, 0.45], as denoted in the camera json file. However, it seems the camera trajectory radii differ: over the 40 views, the first 27 views share one radius and the remaining 13 share another (screenshot attached), and these two radii also differ from object to object. I guess this is not an issue since the bbox is already provided: we can do ray casting within the [-0.45, 0.45] bbox and drop the ray_start/ray_end convention here.
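
For concreteness, a minimal sketch of the bbox-based ray casting I have in mind (the standard slab method; the function name and the normalized-direction assumption are mine):

import numpy as np

def ray_aabb_near_far(rays_o, rays_d, box_min=-0.45, box_max=0.45, eps=1e-9):
    # Slab method: intersect each ray with the axis-aligned box [box_min, box_max]^3.
    # rays_o, rays_d: (N, 3) world-space origins and (normalized) directions.
    inv_d = 1.0 / np.where(np.abs(rays_d) > eps, rays_d, eps)  # guard against /0
    t0 = (box_min - rays_o) * inv_d
    t1 = (box_max - rays_o) * inv_d
    near = np.minimum(t0, t1).max(axis=-1)   # latest entry across the three slabs
    far = np.maximum(t0, t1).min(axis=-1)    # earliest exit across the three slabs
    valid = (far > near) & (far > 0)         # rays that actually hit the box
    return near, far, valid

These per-ray near/far values would then replace the fixed ray_start/ray_end bounds during sampling.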

Thanks XD, and I will let you know if I have more questions!

[screenshot attachment]

gxd1994 commented 5 months ago

@NIRVANALAN Thanks.
This is because we rendered twice: the first pass renders the first 27 images (elevation angle spanning 5 to 30 degrees), and the second pass renders the remaining 13 images (elevation angle spanning -5 to 5 degrees). For each pass, the camera radius is randomly selected from [1.5, 2.0], so the first 27 images share one camera radius and the next 13 images share another.
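
To make this concrete, here is a small sketch of the sampling scheme described above; it is only illustrative, not the actual rendering script, and the uniform azimuth spacing and per-view random elevation are assumptions:

import numpy as np

def sample_camera_centers(seed=0):
    # Two rendering passes: 27 views at elevation 5 to 30 deg, then 13 views at
    # elevation -5 to 5 deg. Each pass draws one shared radius from [1.5, 2.0].
    rng = np.random.default_rng(seed)
    centers = []
    for n_views, (elev_lo, elev_hi) in [(27, (5.0, 30.0)), (13, (-5.0, 5.0))]:
        radius = rng.uniform(1.5, 2.0)                      # shared within the pass
        azimuths = np.linspace(0.0, 360.0, n_views, endpoint=False)
        elevations = rng.uniform(elev_lo, elev_hi, n_views)
        for az, el in zip(np.radians(azimuths), np.radians(elevations)):
            x = radius * np.cos(el) * np.cos(az)
            y = radius * np.cos(el) * np.sin(az)
            z = radius * np.sin(el)
            centers.append([x, y, z])
    return np.array(centers)                                # (40, 3) camera positions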

NIRVANALAN commented 5 months ago

Thanks XD again for the careful clarification! I have resolved the problem and the NeRF reconstruction trains well on your data. Closing this issue for now.