yuhsuanyeh / BiFuse

[CVPR2020] BiFuse: Monocular 360 Depth Estimation via Bi-Projection Fusion
MIT License

Refine module input discrepancy #15

Open dvansa-eurecat opened 3 years ago

dvansa-eurecat commented 3 years ago

Hi! We noticed a discrepancy between the code and the paper regarding the inputs to the Refine() module. In the paper (Figure 4), the input is a concatenation of the depth estimates coming from the two branches (equirectangular and projected cubemap), making it a 2-channel tensor.

In the code, however, the concatenation also includes the RGB input tensor, making it a 5-channel tensor that mixes color and depth information: https://github.com/Yeh-yu-hsuan/BiFuse/blob/6fb1cbe8a3c3891a9067f595ba2af9d14f8ae1c6/models/FCRN.py#L522

Which of the two approaches is the correct one?
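
To make the difference concrete, here is a minimal shape-only sketch of the two variants. It does not use the repository's code: the helper `concat_channels`, the variable names, the 512x1024 resolution, and the order of concatenation are all assumptions for illustration. Only the channel counts (2 versus 5) come from the paper and the linked line.

```python
def concat_channels(*shapes):
    """Return the NCHW shape that results from concatenating along the channel axis."""
    batch, _, height, width = shapes[0]
    for n, _, h, w in shapes:
        # Channel-wise concatenation requires matching batch and spatial sizes.
        if (n, h, w) != (batch, height, width):
            raise ValueError("batch and spatial dimensions must match")
    return (batch, sum(s[1] for s in shapes), height, width)


# Hypothetical NCHW shapes for a single equirectangular image.
rgb = (1, 3, 512, 1024)         # color input
depth_equi = (1, 1, 512, 1024)  # depth from the equirectangular branch
depth_cube = (1, 1, 512, 1024)  # cubemap-branch depth, already converted to equirectangular

paper_input = concat_channels(depth_equi, depth_cube)      # as drawn in Figure 4
code_input = concat_channels(rgb, depth_equi, depth_cube)  # as described for the linked line

print(paper_input[1], code_input[1])  # 2 5
```

The first convolution of Refine() would need 2 input channels in one case and 5 in the other, so pretrained weights would only be compatible with one of the two variants.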

Also, in the final projection that brings the Cubemap branch output to equirectangular, the transformation is done in two steps: https://github.com/Yeh-yu-hsuan/BiFuse/blob/6fb1cbe8a3c3891a9067f595ba2af9d14f8ae1c6/models/FCRN.py#L519-L520

Couldn't self.ce.C2E() be applied directly to the Cubemap branch output (as the Fusion blocks do)?