Closed: Yifehuang97 closed this issue 6 months ago
I think the depth estimation in Stage 1 is well-trained, since an EPE around 1.767 is reasonable. However, the failure case in Fig. 2 is confusing. There is an obvious mismatch between the two partial Gaussian point clouds, which I think is mainly caused by the camera parameters. I suggest saving the point clouds of both views, either in Stage 1 or Stage 2. Also, what is the EPE metric in Stage 2?
Thanks for your reply!
This is the EPE metric on the validation set in Stage 2:
For saving the point clouds, do you mean calling depth2pc to get the 3D positions and checking whether the left/right results are consistent?
Thank you for your help!
The EPE also seems reasonable : )
To save the point cloud, just save the 'xyz' and 'img' in the valid region as the vertices and colors of the point cloud using trimesh[https://trimesh.org/]. Take the following code as an example.
```python
import trimesh

for view in ['lmain', 'rmain']:
    valid_i = data[view]['pts_valid'][0, :]  # [S*S]
    xyz_i = data[view]['xyz'][0, :, :]  # [S*S, 3]
    rgb_i = data[view]['img'][0, :, :, :].permute(1, 2, 0).view(-1, 3)  # [S*S, 3]
    # keep only the valid points
    xyz_i = xyz_i[valid_i].view(-1, 3)
    rgb_i = rgb_i[valid_i].view(-1, 3)
    # map colors from [-1, 1] back to [0, 255]
    rgb_i = (rgb_i + 1.0) * 0.5 * 255
    ply_out = trimesh.points.PointCloud(vertices=xyz_i.detach().cpu().numpy(),
                                        colors=rgb_i.detach().cpu().numpy())
    ply_out.export(OUTPUT_PATH + '/%s_%s.ply' % (data['name'], view))
```
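Beyond eyeballing the exported .ply files, the left/right consistency can also be checked numerically: for each point in one view's cloud, measure the distance to the nearest point in the other view's cloud. A minimal sketch with toy arrays standing in for the exported clouds (in practice, load the two .ply files and use their vertices; `pts_l` and `pts_r` are placeholders):

```python
import numpy as np

# Toy point clouds standing in for the exported lmain/rmain .ply files;
# in practice, load them (e.g. with trimesh) and take the vertex arrays.
rng = np.random.default_rng(0)
pts_l = rng.uniform(-1.0, 1.0, size=(500, 3))
pts_r = pts_l + 0.001 * rng.normal(size=(500, 3))  # nearly aligned views

# Brute-force nearest-neighbor distance from each right-view point
# to the closest left-view point.
diff = pts_r[:, None, :] - pts_l[None, :, :]          # [N, M, 3]
dists = np.sqrt((diff ** 2).sum(axis=-1)).min(axis=1)  # [N]

# With correct camera parameters, points in the overlapping region
# should have near-zero nearest-neighbor distance.
print('mean nearest-neighbor distance: %.5f' % dists.mean())
```

A large mean distance in the overlapping region would support the camera-parameter hypothesis above.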
Thank you so much!
Apologies for reaching out once more. I've fixed a bug in my previous code, and it now produces reasonable results. However, both the L1 loss and the SSIM loss are not converging as expected, and the quality of the rendered images does not seem to improve with additional training iterations. Could you suggest any potential reasons for this?
Results from the first iteration on the training set:
Results after 100,000 iterations on the training set:
Here is the training loss:
Thank you so much, and sorry for the inconvenience.
Sorry for the late reply. I got a similar result when training under a half-body setup, where severe self-occlusion caused many holes in the novel views; large-scale Gaussians were predicted to compensate for the missing areas. However, that should not be the cause for head NVS, since there is no occlusion. I suggest manually setting the scale of the Gaussians to zero in the Gaussian rasterization and comparing the rendered novel-view image with the ground truth. I suspect there is still a mismatch between them, potentially caused by incorrect camera parameters.
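As a sanity check along these lines, one can zero the predicted scales right before rasterization so that each Gaussian collapses to (nearly) a point; the render then reflects the predicted 3D positions alone, making geometry/camera misalignment visible. A minimal sketch (tensor names like `scales` are hypothetical, not the repo's actual variables):

```python
import torch

def debug_zero_scales(scales: torch.Tensor) -> torch.Tensor:
    """Replace predicted per-Gaussian scales with zeros for debugging.

    With zero scales, each Gaussian renders as a point, so the rendered
    image directly reflects the predicted 3D positions, independent of
    the learned covariance/scale parameters.
    """
    return torch.zeros_like(scales)

# hypothetical predicted scales for N Gaussians, shape [N, 3]
scales = torch.rand(1024, 3)
scales = debug_zero_scales(scales)
# pass `scales` to the Gaussian rasterizer in place of the predictions
```

If the point-rendered image still does not line up with the ground truth, the mismatch is upstream of the Gaussian parameters, i.e. in the depth or the camera calibration.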
Thank you!
Hi, thanks for your work! When using the default settings on the avatar data, I saw some failure cases like:
It seems the Gaussian regression network fails to learn how to combine the Gaussians from the left/right views. Also, the Stage 2 loss does not seem to converge well.
I am also not sure whether the Stage 1 result is good enough; the final validation EPE is around 1.767.
Thanks for your time!