FangjinhuaWang / IterMVS

Official code of IterMVS (CVPR 2022)
MIT License

About the generalization ability of IterMVS #3

Closed xy-guo closed 1 year ago

xy-guo commented 2 years ago

Thank you for your great work! The network runs very fast while achieving high precision. After reading the code, I have several questions about IterMVS.

Previous networks like CasMVS/VisMVS all use 3D convolutions to aggregate local information, and I think these networks generalize when the depth range/interval changes. In comparison, IterMVS predicts the full 256-dim distribution from the 2D hidden state. In my experience, a 2D feature seems hard to generalize to different depth ranges. Specifically, suppose the training set has a min-max depth of [1, 10], but we force the depth range to [1, 20] during training. If we then test on a set whose depth range is [1, 20], will the model generalize within the [10, 20] range? Would it be a better choice to predict depth residuals instead of the full depth distribution? And if there is no depth-range scaling augmentation, will there be a performance drop?
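To make the question concrete, here is a minimal numpy sketch of what "predicting the full 256-dim distribution" means in practice: the network outputs per-pixel logits over 256 depth hypotheses, and a depth estimate is recovered as the expectation over those bins. The bin placement in normalized inverse-depth space and the function name are assumptions for illustration, not the exact scheme in the IterMVS code.

```python
import numpy as np

def depth_from_distribution(logits, d_min, d_max):
    """Recover a depth value from per-bin logits over num_bins depth
    hypotheses, placed uniformly in inverse-depth space (an assumed
    convention; the actual repo code may sample differently)."""
    num_bins = logits.shape[-1]
    # softmax over the hypothesis bins
    p = np.exp(logits - logits.max(axis=-1, keepdims=True))
    p /= p.sum(axis=-1, keepdims=True)
    # bin centers in normalized [0, 1] coordinates
    x = (np.arange(num_bins) + 0.5) / num_bins
    # map normalized coordinates to inverse depth: 0 -> 1/d_max (far), 1 -> 1/d_min (near)
    inv_depth = 1.0 / d_max + x * (1.0 / d_min - 1.0 / d_max)
    # expected inverse depth, then invert to get depth
    return 1.0 / (p * inv_depth).sum(axis=-1)
```

The generalization question then becomes: if the bins spanning [10, 20] were rarely the argmax during training, does the 2D head still place probability mass there at test time?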

FangjinhuaWang commented 2 years ago

Nice to meet you, Xiaoyang!

I think the model can generalize to different depth ranges; at least the generalization to Tanks & Temples and ETH3D demonstrates this ability. In our model, the (inverse) depth ranges are all normalized to [0, 1], and the depth values (one center depth value plus a fixed sampling pattern) are injected in each GRU iteration. We did try predicting depth residuals as in RAFT before, but the training was sometimes not very stable.
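The normalization described above can be sketched as follows. This is a minimal numpy illustration of one common inverse-depth convention (far plane maps to 0, near plane to 1); the exact mapping used in the IterMVS repository may differ.

```python
import numpy as np

def normalize_inv_depth(depth, d_min, d_max):
    """Map depth in [d_min, d_max] to [0, 1] in inverse-depth space.
    Assumed convention: d_max (far) -> 0, d_min (near) -> 1."""
    return (1.0 / depth - 1.0 / d_max) / (1.0 / d_min - 1.0 / d_max)

def denormalize_inv_depth(x, d_min, d_max):
    """Inverse of normalize_inv_depth: [0, 1] -> depth in [d_min, d_max]."""
    inv = 1.0 / d_max + x * (1.0 / d_min - 1.0 / d_max)
    return 1.0 / inv
```

Because the network only ever sees these normalized coordinates, the same learned weights apply whether the scene's metric range is [1, 10] or [1, 20], which is one reason the model transfers across datasets with different depth ranges.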