shilei-ustcer closed this issue 5 years ago
Hi, thanks for the question! Since the 3D representation is only a coarse voxel grid, even the best possible 3D representation can never lead to a loss of exactly 0. This is because the rendered depths are at a much finer resolution, whereas the possible 'stopping depths' modeled in our loss for a ray traveling through a voxel grid form only a discrete set.
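To make this concrete, here is a minimal toy sketch (my own illustration, not code from the repo): along a single ray, the loss can only model stopping at one depth per voxel, so the best achievable depth error is generically nonzero. All names and values here are hypothetical.

```python
import numpy as np

# Hypothetical 1D setup: a ray traverses a column of voxels, and the loss
# only models stopping at one discrete depth per voxel (here, the center).
voxel_size = 0.1
num_voxels = 32
stop_depths = voxel_size * (np.arange(num_voxels) + 0.5)  # discrete stopping depths

gt_depth = 0.137  # ground-truth depth, rendered at much finer resolution
# Even the best possible occupancy assignment can only snap to the
# nearest discrete stopping depth, leaving a nonzero residual.
best_stop = stop_depths[np.argmin(np.abs(stop_depths - gt_depth))]
min_loss = (best_stop - gt_depth) ** 2
print(best_stop, min_loss)
```

Here the nearest stopping depth is 0.15, so the squared depth error cannot go below (0.15 - 0.137)^2, no matter how good the predicted shape is.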
Thank you for your reply! So a situation can arise where the loss between the ground-truth shape and the tested depth map is larger than the loss for some predicted shape, am I right?
To add to the above, another situation: when neighboring pixels of the depth map are re-projected onto the shape, neighboring rays may intersect the same voxel, so the voxel's state (0 or 1) may conflict because the two rays carry different depth signals. This situation can also happen.
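A toy sketch of this conflict (my own illustration, with a hypothetical `target_occupancy` helper and made-up depths): two rays share a voxel, but one ray's depth says the voxel should be occupied while the other ray's depth says the ray must pass through it.

```python
# Hypothetical sketch: two neighboring rays traverse the same voxel but
# carry different depth signals, so their per-ray occupancy targets conflict.
voxel_size = 0.1

def target_occupancy(voxel_idx, gt_depth):
    """1.0 if this ray's depth says it stops in this voxel, 0.0 if it passes through."""
    stop_idx = int(gt_depth // voxel_size)  # voxel in which the ray terminates
    return 1.0 if voxel_idx == stop_idx else 0.0

shared_voxel = 3
ray_a = target_occupancy(shared_voxel, gt_depth=0.34)  # this ray stops in voxel 3
ray_b = target_occupancy(shared_voxel, gt_depth=0.71)  # this ray passes through voxel 3
print(ray_a, ray_b)  # conflicting targets (1.0 vs 0.0) for the same voxel
```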
Hi, thanks for raising these points! Both points you state are correct, and in fact we discuss them a bit in our paper's appendix (Sec A2.2 in https://arxiv.org/pdf/1704.06254.pdf ).
Hi, another issue to bother you with! Since rays are re-projected onto the shape, there is a situation where some voxel is not intersected by any of the rays, so that voxel's state cannot be determined from the depth map. I think this situation may also happen.
Yes, that is correct, but it's not an issue if we are using this loss to train a prediction CNN - if the images yielded no evidence for the voxel, the gradients from the loss would also be 0. It may be an issue if we are directly trying to optimize the volume given a set of views, in which case you'd need to use enough views and/or assume some prior.
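A minimal sketch of why the gradient vanishes for such voxels (my own illustration with a squared-error stand-in for the real loss; the mask and values are made up): the loss only touches voxels that some ray passes through, so all other voxels receive exactly zero gradient.

```python
import numpy as np

# Hypothetical sketch: predicted occupancies along one voxel column.
occupancy = np.full(8, 0.5)
observed = np.zeros(8, dtype=bool)
observed[:4] = True  # suppose rays only traverse voxels 0..3
target = np.array([0, 0, 0, 1, 0, 0, 0, 0], dtype=float)

# Squared-error stand-in for the ray-consistency loss, restricted to
# observed voxels; unobserved voxels contribute nothing to the loss.
grad = np.where(observed, 2.0 * (occupancy - target), 0.0)
print(grad)  # gradient is identically 0 for the unobserved voxels 4..7
```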
As you say, "if the images yielded no evidence for the voxel, the gradients from the loss would also be 0", so such a voxel's value stays fixed during training; it is determined only by the weight initialization. Its value can therefore be arbitrary, but we want it to be 0 (empty). Is this an issue?
Well, we train a common CNN across all images, so the hope is that some image(s) across the training data would have provided evidence, and the CNN would therefore have learned to predict reasonable values. If there exist voxels that did not receive any evidence across all of the training data, then it may indeed be an issue.
Yes, you are right. Thank you for your kind reply. Since my problem has been solved, I will close this issue. I learned a lot from our discussion.
Great work, Shubham! When I run the code on a single-view depth map, testing on a mini-batch containing only one depth map, I find that the loss cannot decrease to 0. Why?