FangjinhuaWang / IterMVS

Official code of IterMVS (CVPR 2022)
MIT License

About the generalization ability of IterMVS #3

Closed xy-guo closed 1 year ago

xy-guo commented 2 years ago

Thank you for your great work! The network runs very fast while achieving high precision. After reading the code, I have several questions about IterMVS.

Previous networks like CasMVS/VisMVS all use 3D convolutions to aggregate local information, and I think these networks generalize when the depth range/interval changes. In comparison, IterMVS predicts the full 256-dim distribution from the 2D hidden state. In my experience, a 2D feature seems hard to generalize to different depth ranges. Specifically, suppose the training set has a min-max depth of [1, 10], but we force the depth range to [1, 20] during training. If we then test on a set whose depth range is [1, 20], will the model generalize within the [10, 20] range? Would it be a better choice to predict depth residuals instead of the full depth distribution? And if there is no depth-range scaling augmentation, will there be a performance drop?
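To make the question concrete, here is a minimal numpy sketch of what "predicting the full 256-dim distribution" means in practice: the network outputs per-pixel logits over 256 depth hypotheses, and a depth estimate is recovered as the expectation over those bins. The bin placement in normalized inverse-depth space and the function name are assumptions for illustration, not the exact scheme in the IterMVS code.

```python
import numpy as np

def depth_from_distribution(logits, d_min, d_max):
    """Recover a depth value from per-bin logits over num_bins depth
    hypotheses, placed uniformly in inverse-depth space (an assumed
    convention; the actual repo code may sample differently)."""
    num_bins = logits.shape[-1]
    # softmax over the hypothesis bins
    p = np.exp(logits - logits.max(axis=-1, keepdims=True))
    p /= p.sum(axis=-1, keepdims=True)
    # bin centers in normalized [0, 1] coordinates
    x = (np.arange(num_bins) + 0.5) / num_bins
    # map normalized coordinates to inverse depth: 0 -> 1/d_max (far), 1 -> 1/d_min (near)
    inv_depth = 1.0 / d_max + x * (1.0 / d_min - 1.0 / d_max)
    # expected inverse depth, then invert to get depth
    return 1.0 / (p * inv_depth).sum(axis=-1)
```

The generalization question then becomes: if the bins spanning [10, 20] were rarely the argmax during training, does the 2D head still place probability mass there at test time?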

FangjinhuaWang commented 2 years ago

Nice to meet you, Xiaoyang!

I think the model can generalize to different depth ranges; at least the generalization to Tanks & Temples and ETH3D demonstrates this ability. In our model, the (inverse) depth ranges are all normalized to [0, 1], and the depth values (one center depth value plus a fixed sampling pattern) are injected in each GRU iteration. We did try predicting depth residuals as in RAFT before, but the training was sometimes not very stable.
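The normalization described above can be sketched as follows. This is a minimal numpy illustration of one common inverse-depth convention (far plane maps to 0, near plane to 1); the exact mapping used in the IterMVS repository may differ.

```python
import numpy as np

def normalize_inv_depth(depth, d_min, d_max):
    """Map depth in [d_min, d_max] to [0, 1] in inverse-depth space.
    Assumed convention: d_max (far) -> 0, d_min (near) -> 1."""
    return (1.0 / depth - 1.0 / d_max) / (1.0 / d_min - 1.0 / d_max)

def denormalize_inv_depth(x, d_min, d_max):
    """Inverse of normalize_inv_depth: [0, 1] -> depth in [d_min, d_max]."""
    inv = 1.0 / d_max + x * (1.0 / d_min - 1.0 / d_max)
    return 1.0 / inv
```

Because the network only ever sees these normalized coordinates, the same learned weights apply whether the scene's metric range is [1, 10] or [1, 20], which is one reason the model transfers across datasets with different depth ranges.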