ewrfcas / MVSFormer

Codes of MVSFormer: Multi-View Stereo by Learning Robust Image Features and Temperature-based Depth (TMLR2023)
Apache License 2.0

Number of views used during Tanks & Temples testing #10

Closed JianfeiJ closed 1 year ago

JianfeiJ commented 1 year ago

Hello, I noticed that the paper increases the number of views to 20 during Tanks & Temples testing and achieves good results. However, the pair.txt file only contains 10 candidate views. How did you select 20 views? Thanks!

ewrfcas commented 1 year ago

Hello, we followed the method in the MVSNet paper: we recomputed view-selection scores from the COLMAP results to extend the candidate views.

TruongKhang commented 5 months ago

Hello @ewrfcas , did you use COLMAP or OpenMVG to estimate camera poses and sparse points from Structure-from-Motion?

ewrfcas commented 5 months ago

> Hello @ewrfcas , did you use COLMAP or OpenMVG to estimate camera poses and sparse points from Structure-from-Motion?

For the public test sets (DTU and Tanks and Temples), we used the officially released camera poses instead of re-running SfM. For new real-world scenes with several images, we recommend running an SfM method (COLMAP) to obtain camera poses before MVS. Please follow https://github.com/jzhangbs/Vis-MVSNet for more details.

TruongKhang commented 5 months ago

I get your point, but for Tanks&Temples, you directly used the camera poses from MVSNet, right? The pair.txt file of MVSNet only has 10 views, so I wonder how you got the new pair.txt file with 20 views. I am trying to reproduce the MVSNet camera poses on Tanks&Temples using COLMAP, but I cannot get the same camera poses.

ewrfcas commented 5 months ago

> I get your points but for Tanks&Temples, you directly used camera poses from MVSNet, right? But the pair.txt file of MVSNet only has 10 views. So, I wonder how you get the new pair.txt file with 20 views? I am trying to reproduce the camera poses of MVSNet on Tanks&Temples by using COLMAP, but I cannot get the same camera poses.

If you want a new pair.txt with more than 10 views, you can re-sort the camera poses provided by MVSNet according to the view-selection equation in MVSNet's paper (page 9, https://arxiv.org/pdf/1804.02505.pdf).
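The equation referenced above scores a view pair by summing a piecewise-Gaussian weight over the baseline angles subtended at the sparse points visible in both views. A minimal sketch of that scoring, assuming the default theta0/sigma values used in MVSNet's colmap2mvsnet.py; `angle_score` and `pair_score` are illustrative names, not the repo's API:

```python
import numpy as np

def angle_score(theta, theta0=5.0, sigma1=1.0, sigma2=10.0):
    """Piecewise-Gaussian weight G(theta) from MVSNet's view selection.

    theta is the baseline angle in degrees; theta0/sigma1/sigma2 follow
    the defaults in colmap2mvsnet.py (an assumption for this sketch).
    """
    sigma = sigma1 if theta <= theta0 else sigma2
    return np.exp(-((theta - theta0) ** 2) / (2.0 * sigma ** 2))

def pair_score(cam_i, cam_j, points):
    """Score a view pair by summing G over their common sparse points.

    cam_i, cam_j: 3D camera centers; points: (N, 3) sparse points
    visible in both views.
    """
    score = 0.0
    for p in points:
        vi, vj = cam_i - p, cam_j - p
        cos_a = np.dot(vi, vj) / (np.linalg.norm(vi) * np.linalg.norm(vj))
        theta = np.degrees(np.arccos(np.clip(cos_a, -1.0, 1.0)))
        score += angle_score(theta)
    return score
```

Sorting every other view by `pair_score` against the reference view and keeping the top 20 would then yield the extended candidate list.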

ewrfcas commented 5 months ago

Please see https://github.com/YoYo000/MVSNet/blob/3ae2cb2b72c6df58ebcb321d7d243d4efd01fbc5/mvsnet/colmap2mvsnet.py#L383 for more details.

TruongKhang commented 5 months ago

Thank you for the detailed instructions. I also used this colmap2mvsnet.py file, but it requires an SfM model from COLMAP, and the MVSNet repo does not provide the SfM models for the Tanks&Temples dataset. So, if you didn't need to re-run COLMAP on Tanks&Temples, how did you get more views? Did you directly select the 20 nearest neighboring views based on the available camera poses?

TruongKhang commented 5 months ago

I am developing a new SfM model and want to test on Tanks&Temples using my SfM results with your MVSFormer. So I only wonder how the authors of MVSNet obtained such accurate camera poses. I tried using COLMAP to reproduce the camera poses, but they are not as accurate as the provided ones.

maybeLx commented 5 months ago

We can use the provided camera poses and camera intrinsics for sparse reconstruction, then use the code supplied by MVSNet to compute scores for the pair.txt file. Nevertheless, this process is cumbersome, so we later found an alternative method with similar performance: we augment the original pair.txt with the nearest neighbors of each candidate view. The nearest neighbors of those candidates are already listed in pair.txt, i.e., the "neighbors of neighbors".
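The "neighbors of neighbors" augmentation can be sketched as follows. `expand_pairs` is a hypothetical helper, not the authors' code, and it assumes pair.txt has already been parsed into a dict mapping each view id to its ranked neighbor ids:

```python
def expand_pairs(pairs, target=20):
    """Grow each view's candidate list with its neighbors' neighbors.

    pairs: dict {view_id: [neighbor ids, best first]} parsed from pair.txt.
    target: desired number of candidates per reference view (e.g. 20).
    """
    expanded = {}
    for ref, neighbors in pairs.items():
        out = list(neighbors)                 # keep the original 10 first
        for n in neighbors:                   # walk each candidate's own list
            for nn in pairs.get(n, []):       # "neighbors of neighbors"
                if nn != ref and nn not in out:
                    out.append(nn)
            if len(out) >= target:
                break
        expanded[ref] = out[:target]
    return expanded
```

Because the original candidates stay in front, the first 10 entries of each expanded list match the official pair.txt exactly.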

TruongKhang commented 5 months ago

I got it. Thank you!!!