wangys16 / FreeSplat

Official implementation of NeurIPS 2024 paper: "FreeSplat: Generalizable 3D Gaussian Splatting Towards Free-View Synthesis of Indoor Scenes"

Consulting for the use of PTF #8

Open DavidYan2001 opened 3 days ago

DavidYan2001 commented 3 days ago

Dear authors,

Thanks for your nice work!

While reading the paper, I became curious about one module: the PTF module. As I understand it, the main purpose of PTF is to reduce Gaussian redundancy. However, comparing row 1 and row 3 of the ablation study in the paper, PTF also noticeably improves reconstruction accuracy. Why is that? Is there a key reason why PTF matters for reconstruction quality?

I would appreciate it if you could help with this. Thanks!

wangys16 commented 2 days ago

Thank you for your interest in our work! Basically, the primary purpose of the PTF module is to reduce GS redundancy while maintaining resolution when given long sequences of inputs. Our analysis of why PTF also improves reconstruction quality is that it performs a point-level post-fusion across multiple views with extra learnable parameters. Besides, its design is inspired by TSDF Fusion: Gaussians unprojected from slightly erroneous depth maps can be "pulled" towards the accumulated regions that have been observed multiple times, because those Gaussians have gained larger weights through PTF.
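
For readers unfamiliar with TSDF-style weighting, the "pulling" effect can be illustrated with a plain running weighted average. This is only a minimal sketch and not the FreeSplat implementation: the function name `fuse_weighted`, the scalar weights, and the example positions are all assumptions made for illustration, and the learnable parameters that PTF uses are omitted entirely.

```python
import numpy as np

def fuse_weighted(acc_pos, acc_weight, new_pos, new_weight=1.0):
    """Running weighted average of two 3D positions (TSDF-style update).

    Hypothetical helper for illustration only; PTF itself also involves
    learnable parameters that are not modeled here.
    """
    acc_pos = np.asarray(acc_pos, dtype=float)
    new_pos = np.asarray(new_pos, dtype=float)
    total = acc_weight + new_weight
    fused = (acc_weight * acc_pos + new_weight * new_pos) / total
    return fused, total

# A point already observed three times (accumulated weight 3) ...
acc_pos, acc_weight = np.array([0.0, 0.0, 2.0]), 3.0
# ... and a new point unprojected from a slightly erroneous depth map.
noisy_pos = np.array([0.0, 0.0, 2.4])

fused, weight = fuse_weighted(acc_pos, acc_weight, noisy_pos)
print(fused, weight)  # fused depth is 2.1: much closer to 2.0 than to 2.4
```

Because the accumulated point carries three times the weight of the new one, the fused position moves only a quarter of the way towards the noisy observation, which is the sense in which well-observed regions dominate.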

DavidYan2001 commented 1 hour ago

Dear author,

Thanks for your reply! I have one more question about long sequence reconstruction. The results show that long sequence reconstruction performs worse on view interpolation than using 2 or 3 input views, and the paper says this is "due to the complicated camera trajectories in ScanNet, and the inaccuracy of 3D Gaussian localization that leads to errors when observed from wide view ranges". As I understand it, PTF and the other modules should help localize the Gaussians more robustly through weighted summation, right? So I find it confusing that the errors of the 3D Gaussians can become even larger after long sequence reconstruction.

Hope to hear from you soon!