xxlong0 / SparseNeuS

SparseNeuS: Fast Generalizable Neural Surface Reconstruction from Sparse views
MIT License

How to evaluate the generated mesh and ground truth on DTU? #23

Open luoxiaoxuan opened 1 year ago

luoxiaoxuan commented 1 year ago

Thanks for your nice work!

The generated mesh has background geometry, which will affect the results. How did you deal with it? Did you filter the generated mesh and the GT mesh according to the image masks? Can you provide the evaluation code and the processed GT mesh?

luoxiaoxuan commented 1 year ago

@flamehaze1115 @xxlong0 I used the provided bash file but set the mode to 'test', then evaluated the Chamfer Distance with the DTUeval-python scripts, and got poor performance. I then used the image masks to filter the mesh via occupancy_mask, but the performance is still poor. The results are as follows (a minimal sketch of the Chamfer computation is included right after the numbers):

  1. scan_37: 5.28(generated w/o filter), 4.22(generated w/ filter), 3.06(Table 1 in paper)
  2. scan_65: 2.76(generated w/o filter), 2.83(generated w/ filter), 2.18(Table 1 in paper)
  3. scan_106: 2.11(generated w/o filter), 2.59(generated w/ filter), 1.19(Table 1 in paper)
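
For concreteness, the core of such an evaluation is a bidirectional nearest-neighbour distance between points sampled on the reconstructed mesh and the GT point cloud. Below is a minimal sketch of that computation only; it deliberately omits the downsampling, observability masking, and distance clamping of the official DTU protocol (and of DTUeval-python), so its numbers will not match the reported ones. The file names are placeholders.

```python
# Minimal Chamfer-style check: NOT the full DTU protocol (no downsampling,
# no observability mask, no max-distance clamping). File names are placeholders.
import numpy as np
import trimesh
from scipy.spatial import cKDTree

def chamfer_l1(mesh_path, gt_pcd_path, n_samples=200_000):
    mesh = trimesh.load(mesh_path)
    pred_pts, _ = trimesh.sample.sample_surface(mesh, n_samples)   # points on the mesh
    gt_pts = np.asarray(trimesh.load(gt_pcd_path).vertices)        # GT point cloud

    d_pred_to_gt, _ = cKDTree(gt_pts).query(pred_pts)   # accuracy term
    d_gt_to_pred, _ = cKDTree(pred_pts).query(gt_pts)   # completeness term
    return 0.5 * (d_pred_to_gt.mean() + d_gt_to_pred.mean())

# e.g. chamfer_l1("scan37_pred_cleaned.ply", "stl037_total.ply")
```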

The visualizations are as follows:

[image: scan_37, generated w/o filter]
[image: scan_37, generated w/ filter]
[image: scan_65, generated w/o filter]
[image: scan_65, generated w/ filter]
[image: scan_106, generated w/o filter]
[image: scan_106, generated w/ filter]

Can you share the evaluation details or provide an evaluation code?

flamehaze1115 commented 1 year ago

Hello. I have been very busy with a new project recently, sorry for the late reply. I just uploaded the evaluation code. Before the evaluation, you need to run clean_mesh.py to remove the faces that fall outside the masks. clean_mesh.py is a bit messy; I will update it soon.
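
For anyone trying to reproduce the cleaning step before the updated script lands: the idea is to project the mesh into each view and drop faces that land outside the object masks. The sketch below is only an illustration of that idea, not the actual clean_mesh.py; the projection matrices (world-to-image, K @ [R|t]) and binary masks are assumed inputs.

```python
# Illustration of mask-based mesh cleaning, assuming 3x4 world-to-image
# projection matrices and HxW binary object masks. Not the repository's
# clean_mesh.py; DTU camera conventions and scaling are not handled here.
import numpy as np
import trimesh

def remove_faces_outside_masks(mesh, projections, masks):
    verts = np.asarray(mesh.vertices)
    verts_h = np.concatenate([verts, np.ones((len(verts), 1))], axis=1)  # (V, 4)

    keep_vert = np.ones(len(verts), dtype=bool)
    for P, mask in zip(projections, masks):                 # P: (3, 4), mask: (H, W)
        uvw = verts_h @ P.T
        # Assumes vertices lie in front of the camera (positive depth).
        uv = uvw[:, :2] / np.clip(uvw[:, 2:3], 1e-8, None)
        u = np.round(uv[:, 0]).astype(int)
        v = np.round(uv[:, 1]).astype(int)
        h, w = mask.shape
        inside = (u >= 0) & (u < w) & (v >= 0) & (v < h)
        visible = np.zeros(len(verts), dtype=bool)
        visible[inside] = mask[v[inside], u[inside]] > 0
        keep_vert &= visible                                 # must fall inside every mask

    keep_face = keep_vert[mesh.faces].all(axis=1)            # drop faces touching background
    mesh.update_faces(keep_face)
    mesh.remove_unreferenced_vertices()
    return mesh
```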

luoxiaoxuan commented 1 year ago

Thanks for your reply! I just thought of a way to filter the mesh; hope it works.

luoxiaoxuan commented 1 year ago

Hi @flamehaze1115, I have used your clean_mesh.py and eval_dtu_python.py. The evaluation turns out better, but it is still not as good as in the paper.

Original:

  1. scan_37: 5.28 (generated w/o filter), 4.22 (generated w/ filter), 3.06 (Table 1 in paper)
  2. scan_65: 2.76 (generated w/o filter), 2.83 (generated w/ filter), 2.18 (Table 1 in paper)
  3. scan_106: 2.11 (generated w/o filter), 2.59 (generated w/ filter), 1.19 (Table 1 in paper)

Now:

  1. scan_37: 3.74 (clean_mesh.py), 3.06 (Table 1 in paper)
  2. scan_65: 2.41 (clean_mesh.py), 2.18 (Table 1 in paper)
  3. scan_106: 1.62 (clean_mesh.py), 1.19 (Table 1 in paper)

In addition, looking at eval_dtu_python.py, I observe that you seem to be using ground-truth point clouds from a different source (i.e. GT_DIR = "./gt_pcd"). Did you process the GT point clouds yourself? And can you provide them?
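
For context, the official DTU ground truth is distributed as dense STL point clouds (commonly laid out as Points/stl/stlXXX_total.ply in the SampleSet download); whether those were further processed for GT_DIR is exactly the question here. A small loading sketch, with the path layout as an assumption:

```python
# Loading the official DTU GT scan with Open3D. The SampleSet path layout
# (Points/stl/stlXXX_total.ply) is an assumption; adjust to your download.
import open3d as o3d

def load_gt_pcd(dtu_root, scan_id, voxel_size=0.5):
    path = f"{dtu_root}/Points/stl/stl{scan_id:03d}_total.ply"
    pcd = o3d.io.read_point_cloud(path)           # full-resolution GT point cloud
    return pcd.voxel_down_sample(voxel_size)      # downsample to keep the eval tractable
```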

flamehaze1115 commented 1 year ago

Hello. Regarding the test images we provided: for each scene of DTU, we choose two sets of three images, reconstruct two independent results, and report the average evaluation number in our paper. If you don't use the test images we provided, the numbers will differ from those reported in the paper.

luoxiaoxuan commented 1 year ago

@flamehaze1115

Thanks for your reply. The above results were obtained using the test images you provided. My question is whether you filtered the GT point clouds: since only three images are used for reconstruction, the parts outside those views cannot be reconstructed. Are the unseen parts of the GT point clouds involved in the evaluation?
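
To make the question concrete: if the GT-to-reconstruction (completeness) term is averaged over all GT points, regions invisible in the three input views inflate the error; if it is restricted to GT points marked as observed, they do not. A small sketch of the two variants, where `visible` is an assumed precomputed boolean array (e.g. from projecting the GT points into the three input masks, or from DTU's observability masks):

```python
# Completeness term with and without restricting GT to the observed region.
# `visible` is an assumed precomputed boolean array over the GT points.
import numpy as np
from scipy.spatial import cKDTree

def completeness(gt_pts, pred_pts, visible=None):
    d, _ = cKDTree(pred_pts).query(gt_pts)     # GT -> reconstruction distances
    full = d.mean()                            # penalizes parts unseen by the 3 views
    seen = d[visible].mean() if visible is not None else full
    return full, seen
```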