xxlong0 / SparseNeuS

SparseNeuS: Fast Generalizable Neural Surface Reconstruction from Sparse Views
MIT License

Some confusion in Table 1 in the paper. #2

Closed zhuixunforever closed 1 year ago

zhuixunforever commented 1 year ago

Thank you for your nice work, but I'm confused about Table 1.

In Table 1, the mean chamfer of IDR is 3.39, UNISURF is 4.39, NeuS is 4.00, and COLMAP is 1.52. But in their own papers the chamfer of IDR is 0.9, UNISURF is 1.02, and NeuS is 0.84. In addition, in IDR's paper COLMAP with trim=7 is 0.65 and COLMAP with trim=0 is 1.36.

What is the difference between your results and theirs?

Looking forward to your reply! Thank you.

flamehaze1115 commented 1 year ago

Thanks for your interest. The difference is that, in our setting, only three images are utilized for reconstruction. Prior works like NeuS, VolSDF, and IDR use 49 or 64 images for reconstruction on DTU and report those results in their papers. These works cannot handle such an extremely sparse setting, which is the motivation for our work.

II-Matto commented 1 year ago

How is the chamfer distance computed exactly in this paper? Could you provide the equations and detailed settings (e.g. using all points, using only point pairs with distances smaller than a threshold, L1/L2 distance, removing plane points, removing background points, point sampling, etc.)? Many thanks.

zhuixunforever commented 1 year ago

> Thanks for your interest. The difference is that, in our setting, only three images are utilized for reconstruction. Prior works like NeuS, VolSDF, and IDR use 49 or 64 images for reconstruction on DTU and report those results in their papers. These works cannot handle such an extremely sparse setting, which is the motivation for our work.

Thank you for your reply. I have another question: where did the results of the other methods in Table 1 come from? Did you train their methods yourself with sparse views?

flamehaze1115 commented 1 year ago

> Thank you for your reply. I have another question: where did the results of the other methods in Table 1 come from? Did you train their methods yourself with sparse views?

Yes. I trained their models using just three images to report the numbers.

zhuixunforever commented 1 year ago

> Yes. I trained their models using just three images to report the numbers.

Thank you, I have no further questions.

flamehaze1115 commented 1 year ago

> How is the chamfer distance computed exactly in this paper? Could you provide the equations and detailed settings (e.g. using all points, using only point pairs with distances smaller than a threshold, L1/L2 distance, removing plane points, removing background points, point sampling, etc.)? Many thanks.

We use this code for evaluation: https://github.com/jzhangbs/DTUeval-python. Since only three images are provided, prior implicit neural rendering methods like NeuS and VolSDF tend to produce inaccurate geometry in the scene backgrounds. To make the comparisons meaningful, we use the object masks provided by IDR and compute the chamfer distance only on the foreground parts. The object masks do not include the "planes" of the DTU dataset.
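As a rough illustration of the mask-based foreground filtering described above (this is not the actual DTUeval-python or IDR code; the function name, argument layout, and the "inside every mask" criterion are all assumptions for the sketch), one could project each point into the views and keep only points that land inside the object masks:

```python
import numpy as np

def filter_points_by_masks(points, masks, projections):
    """Keep only 3D points whose projection falls inside the object mask in every view.

    points:       (N, 3) array of 3D points sampled from a mesh
    masks:        list of (H, W) boolean arrays (e.g. the IDR object masks)
    projections:  list of (3, 4) camera projection matrices, one per mask
    """
    homog = np.concatenate([points, np.ones((len(points), 1))], axis=1)  # (N, 4)
    keep = np.ones(len(points), dtype=bool)
    for mask, P in zip(masks, projections):
        uvw = homog @ P.T                    # (N, 3) homogeneous pixel coordinates
        uv = uvw[:, :2] / uvw[:, 2:3]        # perspective divide (assumes w > 0)
        u = np.round(uv[:, 0]).astype(int)
        v = np.round(uv[:, 1]).astype(int)
        h, w = mask.shape
        inside = (u >= 0) & (u < w) & (v >= 0) & (v < h)
        in_mask = np.zeros(len(points), dtype=bool)
        in_mask[inside] = mask[v[inside], u[inside]]
        keep &= in_mask                      # point must be masked-in in all views
    return points[keep]
```

The actual evaluation may use a different visibility criterion (e.g. keeping points inside the mask of any single view rather than all three); this sketch only shows the general idea of restricting the metric to masked foreground points.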

II-Matto commented 1 year ago

@flamehaze1115 Is the computation process as follows?

  1. Obtain a mesh with the proposed SparseNeuS using three input images and sample points from it to obtain a point cloud with the DTUeval-python code.
  2. For both the SparseNeuS and the ground-truth point clouds, filter points according to the object masks provided by IDR.
  3. Compute the distance from the SparseNeuS point cloud to the GT one to obtain $d_1$ with the DTUeval-python code.
  4. Compute the distance from the GT point cloud to the SparseNeuS one to obtain $d_2$ with the DTUeval-python code.
  5. Compute the chamfer distance as $d = d_1 + d_2$.
flamehaze1115 commented 1 year ago

> @flamehaze1115 Is the computation process as follows?
>
> 1. Obtain a mesh with the proposed SparseNeuS using three input images and sample points from it to obtain a point cloud with the DTUeval-python code.
> 2. For both the SparseNeuS and the ground-truth point clouds, filter points according to the object masks provided by IDR.
> 3. Compute the distance from the SparseNeuS point cloud to the GT one to obtain $d_1$ with the DTUeval-python code.
> 4. Compute the distance from the GT point cloud to the SparseNeuS one to obtain $d_2$ with the DTUeval-python code.
> 5. Compute the chamfer distance as $d = d_1 + d_2$.

The DTUeval-python code will output $d_1$, $d_2$, and $(d_1 + d_2)/2$.
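For reference, the two directional distances can be sketched as a brute-force NumPy computation (DTUeval-python itself additionally down-samples the point clouds and applies a max-distance threshold, which this sketch omits; the function name is illustrative):

```python
import numpy as np

def chamfer(pred, gt):
    """Brute-force two-sided chamfer between point clouds pred (N, 3) and gt (M, 3)."""
    dist = np.linalg.norm(pred[:, None, :] - gt[None, :, :], axis=-1)  # (N, M) pairwise L2
    d1 = dist.min(axis=1).mean()  # pred -> gt ("accuracy" direction)
    d2 = dist.min(axis=0).mean()  # gt -> pred ("completeness" direction)
    return d1, d2, (d1 + d2) / 2  # overall score averages the two directions
```

Note the overall score here is the mean $(d_1 + d_2)/2$ rather than the sum, matching the DTUeval-python output described above.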

II-Matto commented 1 year ago

@flamehaze1115 Got it. So the code is used directly, without any modification. I had thought you modified the code yourself. Many thanks.