Refine module input discrepancy

Hi! We noticed a discrepancy between the code and the paper regarding the inputs to the Refine() module. In the paper (Figure 4), the input is a concatenation of the depth estimates coming from the two branches (Equirectangular and projected Cubemap), making it a 2-channel tensor.

In the code, however, the concatenation also includes the RGB input tensor, making it a 5-channel tensor that mixes color and depth information: https://github.com/Yeh-yu-hsuan/BiFuse/blob/6fb1cbe8a3c3891a9067f595ba2af9d14f8ae1c6/models/FCRN.py#L522

Which one of the 2 approaches is the correct one?
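
To make the channel counts concrete, here is a minimal sketch of the arithmetic behind the question. It is not the repository code: the helper `cat_shape`, the variable names, the 512x1024 resolution, and the concatenation order are all assumptions for illustration. It only reproduces the shape rule of a concatenation along the channel dimension.

```python
def cat_shape(shapes, dim=1):
    """Return the shape produced by concatenating tensors of the given
    shapes along `dim` (hypothetical helper, shape bookkeeping only)."""
    first = shapes[0]
    for s in shapes[1:]:
        same_rank = len(s) == len(first)
        # Every dimension except `dim` has to match for a concatenation.
        if not same_rank or any(
            a != b for i, (a, b) in enumerate(zip(s, first)) if i != dim
        ):
            raise ValueError(f"incompatible shapes: {first} vs {s}")
    out = list(first)
    out[dim] = sum(s[dim] for s in shapes)
    return tuple(out)


# Assumed (N, C, H, W) shapes; names and resolution are illustrative.
rgb = (1, 3, 512, 1024)                 # RGB equirectangular input
depth_equi = (1, 1, 512, 1024)          # Equirectangular branch depth
depth_cube_as_equi = (1, 1, 512, 1024)  # Cubemap branch depth after projection

# Paper (Figure 4): only the two depth estimates are concatenated.
paper_input = cat_shape([depth_equi, depth_cube_as_equi])

# Code: the RGB tensor is concatenated as well (order assumed here).
code_input = cat_shape([rgb, depth_equi, depth_cube_as_equi])

print(paper_input)  # (1, 2, 512, 1024)
print(code_input)   # (1, 5, 512, 1024)
```

If the 2-channel variant is the intended one, the first convolution of Refine() would presumably need 2 input channels instead of 5, so the two variants are not interchangeable with the same weights.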

Also, in the last projection that brings the Cubemap branch output to Equirectangular, the transformation is done in 2 steps: https://github.com/Yeh-yu-hsuan/BiFuse/blob/6fb1cbe8a3c3891a9067f595ba2af9d14f8ae1c6/models/FCRN.py#L519-L520

Couldn't the transformation self.ce.C2E() be directly applied to the Cubemap branch output (as the Fusion blocks do)?
