Thanks for the great work! I noticed that, as the paper says, you adopt the NeuS approach of volume rendering an SDF to volume render the SRDF. How does this guarantee that the network learns the SRDF of the sample points rather than the SDF?
And the loss consists only of the rendered color loss and depth loss.
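For concreteness, my understanding of the NeuS-style weighting applied to per-ray signed distances is roughly the sketch below (my own paraphrase, not code from this repo; `s` is NeuS's learnable inverse standard deviation):

```python
import torch

def neus_weights(srdf, s=64.0):
    """NeuS-style rendering weights from signed distances sampled along rays.

    srdf: (num_rays, num_samples) signed distance at each sample,
          ordered near-to-far along the ray.
    s:    inverse standard deviation of the logistic CDF (learnable in NeuS).
    """
    cdf = torch.sigmoid(s * srdf)  # Phi_s(f(t_i))
    # Discrete opacity of each interval (NeuS's alpha_i), clipped at zero.
    alpha = ((cdf[:, :-1] - cdf[:, 1:]) / (cdf[:, :-1] + 1e-5)).clamp(min=0.0)
    # Accumulated transmittance T_i = prod_{j<i} (1 - alpha_j).
    trans = torch.cumprod(
        torch.cat([torch.ones_like(alpha[:, :1]), 1.0 - alpha + 1e-7], dim=-1),
        dim=-1)[:, :-1]
    return alpha * trans  # weights, shape (num_rays, num_samples - 1)
```

Rendered color and depth would then be weighted sums of the per-sample colors and depths using these weights.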
Hi, you may have a look at the supplementary material for why we chose NeuS for volume rendering. In VolSDF and NeuS, the eikonal loss is used as a regularization for the SDF, but it cannot completely ensure that the field is a valid SDF, i.e. one whose gradient has norm 1 everywhere. Unlike an SDF, the defining property of an SRDF is that its gradient along each given ray has magnitude 1. We tried adding a loss using ground-truth SRDF values for samples in front of the surface, but we found that performance dropped.
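For reference, the eikonal regularizer looks roughly like the first function below; the second is a hypothetical per-ray analogue of the SRDF property, written only to illustrate it (we do not use such a term in training, as noted above):

```python
import torch

def eikonal_loss(field, points):
    """Eikonal regularization used by VolSDF/NeuS: push ||grad f(x)|| toward 1.

    field:  network mapping (N, 3) points to (N, 1) signed distances.
    points: (N, 3) sample positions.
    """
    points = points.detach().requires_grad_(True)
    grad = torch.autograd.grad(field(points).sum(), points, create_graph=True)[0]
    return ((grad.norm(dim=-1) - 1.0) ** 2).mean()

def srdf_ray_slope_penalty(srdf, t_vals):
    """Hypothetical per-ray analogue: |d(srdf)/dt| should be 1 along each ray,
    approximated here with finite differences. Illustration only.

    srdf, t_vals: (num_rays, num_samples), samples ordered near-to-far.
    """
    slope = (srdf[:, 1:] - srdf[:, :-1]) / (t_vals[:, 1:] - t_vals[:, :-1] + 1e-8)
    return ((slope.abs() - 1.0) ** 2).mean()
```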
Thanks for your reply! So if I understand correctly, there is still no regularization term (such as ground-truth SRDF supervision) that ensures the network learns the SRDF rather than the SDF. Is that right? I noticed that you use TSDF fusion to fuse the rendered depth maps into a volume and then run Marching Cubes to extract the mesh. This pipeline only requires reasonable rendered depth maps to produce a reasonable mesh, regardless of what the network actually learned (SDF or SRDF). NeuS, by contrast, extracts the mesh directly from the network output, which demonstrates that its network actually learned the SDF of the sample points. I may be missing an important part of your work; I hope you can clear up my confusion!
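For concreteness, my understanding of that fusion step is something like the standard Open3D pipeline below (Open3D is my assumption, and the file names and camera list are placeholders; the repo may well use its own fusion code):

```python
import open3d as o3d

# Hypothetical inputs: per-view rendered color/depth images plus camera params.
volume = o3d.pipelines.integration.ScalableTSDFVolume(
    voxel_length=4.0 / 512.0,  # voxel size; placeholder value
    sdf_trunc=0.04,            # truncation distance; placeholder value
    color_type=o3d.pipelines.integration.TSDFVolumeColorType.RGB8)

for i, (intrinsic, extrinsic) in enumerate(cameras):  # cameras: hypothetical list
    color = o3d.io.read_image(f"render_{i}_color.png")
    depth = o3d.io.read_image(f"render_{i}_depth.png")
    rgbd = o3d.geometry.RGBDImage.create_from_color_and_depth(
        color, depth, depth_trunc=4.0, convert_rgb_to_intensity=False)
    volume.integrate(rgbd, intrinsic, extrinsic)  # fuse rendered depth into TSDF

# Marching Cubes over the fused TSDF volume.
mesh = volume.extract_triangle_mesh()
mesh.compute_vertex_normals()
o3d.io.write_triangle_mesh("fused_mesh.ply", mesh)
```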
Hi D.Wu, an SDF depends only on position, while an SRDF is view-dependent, and our formulation is view-dependent (ray transformer). Also, as illustrated with a toy example in the supplementary material, SRDF rendering is similar to SDF rendering. We tested directly fusing the SRDF volume (treating it as an SDF), and the result is worse. Although we don't regularize the SRDF, the learned field looks reasonable. Here is an example of the SRDF for the view below, at h=102: the x-axis has 640 values, corresponding to the pixels in the h=102 row of the reference view, and the y-axis has 64 values, corresponding to the sampled points along each ray. Hope it helps!
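A plot like that can be reproduced with something along these lines (the file name and array shape are placeholders for however you dump the queried field):

```python
import matplotlib.pyplot as plt
import numpy as np

# srdf_row: (64, 640) array -- 64 samples per ray for the 640 pixels
# in the h=102 row of the reference view (shape is an assumption).
srdf_row = np.load("srdf_h102.npy")  # hypothetical dump of the queried field

plt.figure(figsize=(10, 2.5))
plt.imshow(srdf_row, aspect="auto", cmap="RdBu", vmin=-0.1, vmax=0.1)
plt.colorbar(label="SRDF value")
plt.xlabel("pixel x (w=640, row h=102)")
plt.ylabel("sample index along ray (64)")
plt.title("SRDF of samples along rays for one image row")
plt.show()
```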
I assume your question is answered, feel free to reopen it : )