prstrive / UniMVSNet

[CVPR 2022] Rethinking Depth Estimation for Multi-View Stereo: A Unified Representation
MIT License

Some questions about the differences between UniMVS and CasMVS #14

Closed xy-guo closed 2 years ago

xy-guo commented 2 years ago

Thank you for your great work! I just went through your code, and I have two small questions about the paper and the code.

Q1. The network structures of UniMVSNet and CasMVSNet are the same. Is the first baseline in Table 3 (Baseline(Reg)) CasMVSNet? The results in Table 3 are much better than those reported in the CasMVSNet paper. What are the differences between Baseline(Reg) and CasMVSNet?

Q2. Assume the depth interval is 1. If the ground truth is 7.99, then the unity ground truth should be (1-0.99)/1.0=0.01. This seems likely to cause an instability problem: if the predicted value at another depth hypothesis is much larger than 0.01, then the winner-take-all (WTA) depth will be wrong. How does UniMVSNet handle this problem?
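The concern in Q2 can be made concrete with a small numeric sketch. It is not taken from the UniMVSNet repository: the helper names `unity_target` and `decode`, the hypothesis grid, and the decoding rule (selected hypothesis plus `(1 - unity) * interval`) are assumptions inferred from the arithmetic in the question.

```python
import numpy as np

def unity_target(gt, hyps, interval):
    """Hypothetical helper: one-hot-like target that is non-zero only at the
    hypothesis whose bin [d, d + interval) contains the ground truth."""
    target = np.zeros_like(hyps, dtype=float)
    idx = int(np.floor((gt - hyps[0]) / interval))
    # Proximity: 1 minus the offset from the hypothesis, normalized by the interval.
    target[idx] = 1.0 - (gt - hyps[idx]) / interval
    return target

def decode(pred, hyps, interval):
    """Hypothetical helper: winner-take-all bin selection, then an offset
    recovered from the value at the winning hypothesis (assumed rule)."""
    idx = int(np.argmax(pred))
    return hyps[idx] + (1.0 - pred[idx]) * interval

interval = 1.0
hyps = np.arange(5.0, 11.0)  # hypotheses 5, 6, ..., 10 (illustrative grid)

target = unity_target(7.99, hyps, interval)  # about 0.01 at hypothesis 7
ideal_depth = decode(target, hyps, interval)  # recovers about 7.99

# A small spurious response elsewhere already beats the 0.01 peak.
noisy = target.copy()
noisy[0] = 0.05
wrong_depth = decode(noisy, hyps, interval)  # jumps to the bin starting at 5

print(target[2], ideal_depth, wrong_depth)
```

Under these assumptions, a response of only 0.05 at an unrelated hypothesis moves the decoded depth from about 7.99 to about 5.95, which is the failure mode the question describes.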

Looking forward to your kind reply.

prstrive commented 2 years ago

Q1. As we stated in our paper, the first baseline in Table 3 (Baseline(Reg)) is our implementation of CasMVSNet. It is well known that the quantitative metrics (acc., comp., overall) depend heavily on the fusion parameters. We set the fusion parameters of all models in Table 3 to be the same (0.3 for the confidence threshold and 3 for the number of consistent views, same as RMVSNet), which is slightly different from the original CasMVSNet paper and gives better performance on the overall metric.
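As a rough illustration of how the two fusion parameters mentioned above act on a depth map, the sketch below keeps a pixel only if its confidence exceeds 0.3 and it is geometrically consistent in at least 3 views. This is not the repository's fusion code: `fusion_mask`, the input arrays, and the exact comparison operators are assumptions made for illustration.

```python
import numpy as np

def fusion_mask(confidence, num_consistent, conf_thresh=0.3, min_views=3):
    """Hypothetical helper: combine a photometric and a geometric filter.

    confidence     -- per-pixel confidence in [0, 1]
    num_consistent -- per-pixel count of views that agree geometrically
    """
    photo_mask = confidence > conf_thresh   # assumed strict comparison
    geo_mask = num_consistent >= min_views  # assumed inclusive comparison
    return photo_mask & geo_mask

# Toy 2x2 inputs, invented for this example.
confidence = np.array([[0.9, 0.2], [0.5, 0.31]])
num_consistent = np.array([[4, 5], [2, 3]])

mask = fusion_mask(confidence, num_consistent)
print(mask)
```

Loosening or tightening either threshold changes how many points survive fusion, which trades accuracy against completeness. That is why results obtained with different fusion settings are hard to compare directly.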

Q2. It's indeed a thorny problem. We mitigate these cases with our proposed unified focal loss, which makes the model pay more attention to such outliers. We have not measured model performance specifically on these outliers, but overall performance does not appear to be impeded.

xy-guo commented 2 years ago

Thank you for your reply.