autonomousvision / unimatch

[TPAMI'23] Unifying Flow, Stereo and Depth Estimation
https://haofeixu.github.io/unimatch/
MIT License
980 stars 102 forks

Question on upsampling during training. #51

Closed by jaehyungjung 4 months ago

jaehyungjung commented 4 months ago

Hi, thank you very much for your great work!

I'm trying to fine-tune your GMStereo model on my dataset.

I noticed that the disparity does not seem to be upscaled during the training step, since is_depth=task == 'depth' is passed to upsample_flow. I expected that disparity (which is in pixels) should be scaled according to the image resolution. Also, I couldn't find any downscaling on the disparity label side.

# upsample to the original resolution for supervision at training time only
if self.training:
    flow_bilinear = self.upsample_flow(flow, None, bilinear=True, upsample_factor=upsample_factor,
                                       is_depth=task == 'depth')
    flow_preds.append(flow_bilinear)

https://github.com/autonomousvision/unimatch/blob/0dfa3616d89790ac3bac3810dcdedf691b40dfdd/unimatch/unimatch.py#L226C17-L226C23

I would really appreciate it if you could elaborate on this!

haofeixu commented 4 months ago

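Hi, when is_depth is set the prediction is just inverse depth (1/d), which does not have the same meaning as stereo disparity (a displacement in pixels). Thus we don't need to rescale its values when upsampling (just as depth values stay the same when a depth map is downsampled or upsampled).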
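
To illustrate the distinction, here is a minimal sketch and not the repository's upsample_flow. The helper upsample_nearest is hypothetical, and nearest-neighbour replication stands in for bilinear interpolation so the example needs no dependencies. The point it shows: a disparity measured in pixels on a low-resolution grid has to be multiplied by the upsampling factor, while inverse depth keeps its values.

```python
def upsample_nearest(values, factor, is_depth=False):
    """Upsample a 2D map (list of rows) by an integer factor.

    Hypothetical helper: nearest-neighbour replication is used here
    instead of bilinear interpolation to keep the sketch dependency-free.
    """
    # Pixel displacements grow with the resolution; inverse depth does not.
    scale = 1 if is_depth else factor
    out = []
    for row in values:
        # Repeat every value `factor` times horizontally, rescaling it.
        wide = [v * scale for v in row for _ in range(factor)]
        # Repeat the widened row `factor` times vertically.
        out.extend(list(wide) for _ in range(factor))
    return out


low_res_disparity = [[1.0, 2.0]]       # pixels, measured on the low-res grid
low_res_inverse_depth = [[0.5, 0.25]]  # 1/d, independent of image resolution

disp_up = upsample_nearest(low_res_disparity, factor=2)
inv_up = upsample_nearest(low_res_inverse_depth, factor=2, is_depth=True)
```

With factor=2, a disparity of 1 pixel at low resolution becomes 2 pixels at full resolution, whereas an inverse depth of 0.5 is still 0.5.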

jaehyungjung commented 4 months ago

Thank you for the quick reply!

As far as I understand, in the stereo case, flow comes from global_correlation_softmax_stereo at scale_idx = 0. global_correlation_softmax_stereo matches each pixel against all pixels along the same horizontal line. Does that mean flow is actually disparity, not inverse depth?

haofeixu commented 4 months ago

> Does that mean flow is actually disparity

Yes correct.
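
For readers following along, a simplified sketch of why matching along a horizontal line yields a disparity in pixels. This is not the code of global_correlation_softmax_stereo: the function disparity_from_row_matching is hypothetical, it works on one image row of plain Python lists, and it assumes non-negative disparity (a left pixel at x can only match right pixels at x or further left).

```python
import math


def disparity_from_row_matching(left_row, right_row):
    """Estimate per-pixel disparity for one image row.

    left_row, right_row: lists of feature vectors, one per pixel.
    Hypothetical, simplified sketch of softmax matching along a row.
    """
    disparities = []
    for x_left, f_left in enumerate(left_row):
        # Correlate with every candidate x_right <= x_left (assumption:
        # disparity is non-negative).
        scores = [
            sum(a * b for a, b in zip(f_left, right_row[x_right]))
            for x_right in range(x_left + 1)
        ]
        # Numerically stable softmax over the candidates.
        peak = max(scores)
        weights = [math.exp(s - peak) for s in scores]
        total = sum(weights)
        # Expected matching x-coordinate in the right image.
        expected_x_right = sum(x * w for x, w in enumerate(weights)) / total
        # The result is a horizontal displacement in pixels, i.e. disparity.
        disparities.append(x_left - expected_x_right)
    return disparities


s = 10.0  # large feature magnitude so the softmax is nearly one-hot
right_row = [[s, 0.0, 0.0], [0.0, s, 0.0], [0.0, 0.0, s]]
left_row = [[0.0, 0.0, 0.0], [s, 0.0, 0.0], [0.0, s, 0.0]]
row_disparity = disparity_from_row_matching(left_row, right_row)
```

Because the output is the difference between two x-coordinates, it is in pixel units of the grid it was computed on, which is why it has to be rescaled when upsampled to a different resolution.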

jaehyungjung commented 4 months ago

Thanks!