Open mldemox opened 3 weeks ago
The currently released 3D reconstruction code uses $GIM_{DKM}$ (dense matching) for matching and then performs reconstruction. Because of the nature of dense matching, a single image does not yield stable candidate keypoints, so the matched points have to be quantized onto a grid (gridization) before reconstruction. This gridization step reduces matching accuracy, which in turn degrades camera pose estimation during reconstruction.
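To make the gridization step concrete, here is a minimal sketch of what such a step typically looks like: dense matches are binned into fixed-size cells and at most one correspondence per cell is kept. The cell size and the "keep the most confident match" rule are illustrative assumptions, not necessarily the exact choices made in this repository.

```python
import numpy as np

def gridify_matches(kpts0, kpts1, conf, image_size, cell=16):
    """Keep at most one correspondence per grid cell (illustrative only).

    kpts0, kpts1 : (N, 2) matched pixel coordinates in image 0 / image 1
    conf         : (N,) matching confidence, used to pick the cell winner
    image_size   : (width, height) of image 0
    cell         : grid cell size in pixels (assumed value, not GIM's)
    """
    w, h = image_size
    cols = int(np.ceil(w / cell))
    # Cell index of each match, based on its location in image 0.
    cell_ids = (kpts0[:, 1] // cell).astype(int) * cols + (kpts0[:, 0] // cell).astype(int)

    keep = {}
    for i, cid in enumerate(cell_ids):
        # Within a cell, keep only the most confident match; all other
        # sub-pixel candidates in that cell are discarded, which is where
        # the loss of matching accuracy comes from.
        if cid not in keep or conf[i] > conf[keep[cid]]:
            keep[cid] = i
    idx = np.array(sorted(keep.values()))
    return kpts0[idx], kpts1[idx], conf[idx]
```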
I personally recommend using $GIM_{DKM}$ for matching and reconstruction between images with large viewpoint changes, where reconstruction is difficult. Since the images in MipNeRF360 do not have large viewpoint changes between them, $GIM_{DKM}$ is not needed there.
To meet your need for 3DGS on MipNeRF360, I will publish the reconstruction code for $GIM_{LightGlue}$ (sparse matching) in a few days. Sparse matching has no gridization step, so it does not lose matching accuracy, and its performance is sufficient for MipNeRF360's data. You can then try the reconstruction results of $GIM_{LightGlue}$.
Thanks for the reply, looking forward to the subsequent code release.
@mldemox You can try reconstruction.sh with $GIM_{LightGlue}$. I have updated the code.
Thank you for your reply. When I used the above command to generate camera poses for the MipNeRF360 dataset, I ran sparse and dense reconstruction respectively and then trained on the data from the sparse reconstruction. My experiments show that it is still not as good as the original results generated by COLMAP; as is well known, the rendering quality of 3DGS depends on the initialized point cloud. Even the results for the room scene in the sample code are equally unsatisfactory.
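Since the 3DGS result depends on the initialization, one way to make this comparison more concrete is to inspect the two sparse models directly before training. A minimal sketch, assuming pycolmap is installed and both models are standard COLMAP sparse folders (the paths below are hypothetical):

```python
import numpy as np
import pycolmap  # assumed available; any COLMAP sparse-model reader works

def summarize(model_dir):
    """Print basic statistics of a COLMAP sparse model (cameras/images/points3D)."""
    rec = pycolmap.Reconstruction(model_dir)
    errs = np.array([p.error for p in rec.points3D.values()])
    tracks = np.array([p.track.length() for p in rec.points3D.values()])
    print(f"{model_dir}: {len(rec.images)} images, {len(rec.points3D)} points, "
          f"mean track length {tracks.mean():.2f}, mean reproj. error {errs.mean():.2f} px")

# Hypothetical paths; replace with your own sparse/0 directories.
summarize("room_colmap_sift/sparse/0")
summarize("room_gim_lightglue/sparse/0")
```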
Hi @mldemox, when I integrated $GIM_{LightGlue}$ into my reconstruction code, I realized that the GIM version of LightGlue was trained with only 2048 keypoints, whereas for reconstruction at least 8192 keypoints are usually preferable. Supporting that many keypoints would require retraining $GIM_{LightGlue}$.
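For reference, in the upstream LightGlue package the keypoint budget is just an extractor parameter, as in the sketch below; note this is the cvg/LightGlue API, not necessarily how GIM wraps it, and raising the budget beyond the 2048 keypoints the GIM weights were trained with may not help without retraining.

```python
import torch
from lightglue import LightGlue, SuperPoint
from lightglue.utils import load_image, rbd

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Raise the keypoint budget from the default used during training (e.g. 2048) to 8192.
extractor = SuperPoint(max_num_keypoints=8192).eval().to(device)
matcher = LightGlue(features="superpoint").eval().to(device)

image0 = load_image("img0.jpg").to(device)  # hypothetical image paths
image1 = load_image("img1.jpg").to(device)

feats0 = extractor.extract(image0)
feats1 = extractor.extract(image1)
matches01 = matcher({"image0": feats0, "image1": feats1})
feats0, feats1, matches01 = [rbd(x) for x in (feats0, feats1, matches01)]

matches = matches01["matches"]              # (K, 2) index pairs into the keypoint sets
kpts0 = feats0["keypoints"][matches[:, 0]]  # matched keypoints in image 0
kpts1 = feats1["keypoints"][matches[:, 1]]  # matched keypoints in image 1
```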
From your feedback, it seems that the poses reconstructed with the code I provided are not as accurate as those shipped with MipNeRF360. This may be due to the different processing pipelines and the many different hyperparameters between the two methods. I assume you did not simply replace the default SIFT in COLMAP with $GIM_{DKM}$ or $GIM_{LightGlue}$ for an ablation comparison, so for now the results stand as you reported them.
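If someone does want that ablation (keeping COLMAP's pipeline identical and only swapping SIFT features/matches for GIM ones), COLMAP's importer commands make it possible. A rough sketch, assuming the GIM keypoints and matches have already been exported to COLMAP's text import format; the database, directory, and file names are placeholders:

```python
import subprocess

def run(cmd):
    print(" ".join(cmd))
    subprocess.run(cmd, check=True)

db, images = "gim_ablation.db", "room/images"  # hypothetical paths

# 1. Import pre-extracted keypoints (one <image_name>.txt per image, COLMAP text format).
run(["colmap", "feature_importer",
     "--database_path", db,
     "--image_path", images,
     "--import_path", "gim_keypoints"])          # assumed export directory

# 2. Import raw matches and let COLMAP run its own geometric verification,
#    so everything downstream stays identical to the SIFT baseline.
run(["colmap", "matches_importer",
     "--database_path", db,
     "--match_list_path", "gim_matches.txt",     # assumed export file
     "--match_type", "raw"])

# 3. Run the same incremental mapper as the SIFT baseline.
run(["colmap", "mapper",
     "--database_path", db,
     "--image_path", images,
     "--output_path", "sparse_gim"])
```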
Finally, I am curious which one, $GIM_{DKM}$ or $GIM_{LightGlue}$, yields a higher PSNR when used for reconstruction and rendering in your case?
Thank you for your excellent work. When I use the reconstruction.sh in the repository to predict poses for multiple scenes in MipNeRF360, I find that the reconstruction is not as good as the original COLMAP reconstruction. Do you have a good parameter file for the prediction?