xy-guo closed this issue 1 year ago
Nice to meet you, Xiaoyang!
I think the model can generalize to different depth ranges. At least the generalization on Tanks&Temples and ETH3D supports this. In our model, the (inverse) depth ranges are all normalized to [0, 1], and the depth values (1 center depth value + a fixed sampling pattern) are injected in each GRU iteration. We previously tried predicting depth residuals as in RAFT, but the training was sometimes not very stable.
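To make the normalization concrete, here is a minimal sketch of mapping depth to a normalized inverse depth in [0, 1]. This is not taken from the IterMVS code: the function names, and the convention that the nearest plane maps to 1 and the farthest to 0, are assumptions for illustration only.

```python
import numpy as np


def normalize_inverse_depth(depth, depth_min, depth_max):
    """Map depth in [depth_min, depth_max] to normalized inverse depth in [0, 1].

    Assumed convention (hypothetical): depth_min -> 1.0, depth_max -> 0.0.
    """
    inv = 1.0 / np.asarray(depth, dtype=np.float64)
    inv_min = 1.0 / depth_max  # smallest inverse depth (farthest plane)
    inv_max = 1.0 / depth_min  # largest inverse depth (nearest plane)
    return (inv - inv_min) / (inv_max - inv_min)


def denormalize_inverse_depth(norm, depth_min, depth_max):
    """Inverse of normalize_inverse_depth: recover metric depth."""
    inv_min = 1.0 / depth_max
    inv_max = 1.0 / depth_min
    inv = np.asarray(norm, dtype=np.float64) * (inv_max - inv_min) + inv_min
    return 1.0 / inv


if __name__ == "__main__":
    # The scenario from the question: scenes span [1, 10], range forced to [1, 20].
    print(normalize_inverse_depth([1.0, 10.0, 20.0], depth_min=1.0, depth_max=20.0))
```

Under this assumed convention and a forced range of [1, 20], depths between 10 and 20 fall in roughly the lowest 5% of the normalized interval (a depth of 10 maps to 1/19), so the network sees the same [0, 1] input range regardless of the metric scale of the scene.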
Thank you for your great work! The network runs very fast with high precision. After reading the code, I have several questions about IterMVS.
Previous networks like CasMVS/VisMVS all use 3D convolutions to aggregate local information, and I think these networks can generalize to changes in depth range/interval. In comparison, IterMVS predicts the full 256-dim distribution from the 2D hidden state. In my previous experience, a 2D feature seems hard to generalize to different depth ranges. Specifically, assume a training set whose min-max depth is [1, 10], and we force the depth range to [1, 20] during training. If we then test the model on a test set with depth range [1, 20], will the model be able to generalize to depths in [10, 20]? Would it be a better choice to predict depth residuals instead of the full depth distribution? If there is no depth-range scaling augmentation, will there be a performance drop?