Closed saxenarohan97 closed 6 years ago
Invalid pixels are represented as NaN within the network, not zero. When downsampling sparse data like KITTI, invalid pixels are ignored (see downsample_layer.cu#L52). This is the best you can do; there is no more information in that data to work on, and you have no way to encode subpixel information.
"What is the disadvantage of upsampling the predicted-flow to the ground-truth's resolution [...]?"
The computational load is not that relevant. But from a data perspective, upsampling an optical flow or disparity image is no better than downsampling one. In fact, downsampling KITTI images makes the data appear denser! Downsampling does not destroy the data semantics because invalid pixels are ignored.
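To illustrate the "appears denser" point, here is a minimal sketch in plain Python. It is not the code from downsample_layer.cu: it assumes simple box averaging over non-overlapping blocks, and the names `downsample_ignore_nan` and `density` are made up for this example. NaN marks an invalid pixel, as described above.

```python
import math

NAN = math.nan

def downsample_ignore_nan(img, factor):
    """Box-average `img` (a list of rows) over factor x factor blocks,
    skipping NaN pixels. Illustrative only, not the layer's actual kernel."""
    h, w = len(img), len(img[0])
    out = []
    for by in range(0, h - h % factor, factor):
        row = []
        for bx in range(0, w - w % factor, factor):
            # Collect only the valid (non-NaN) pixels of this block.
            valid = [img[y][x]
                     for y in range(by, by + factor)
                     for x in range(bx, bx + factor)
                     if not math.isnan(img[y][x])]
            # A block with no valid pixel stays invalid.
            row.append(sum(valid) / len(valid) if valid else NAN)
        out.append(row)
    return out

def density(img):
    """Fraction of pixels that are valid (not NaN)."""
    total = sum(len(r) for r in img)
    valid = sum(1 for r in img for v in r if not math.isnan(v))
    return valid / total

sparse = [
    [NAN, 4.0, NAN, NAN],
    [NAN, NAN, NAN, NAN],
    [NAN, NAN, 8.0, NAN],
    [2.0, NAN, NAN, 6.0],
]
small = downsample_ignore_nan(sparse, 2)
print(density(sparse), density(small))  # 0.25 0.75
```

Every output pixel whose block contains at least one valid input becomes valid, so the fraction of valid pixels can only stay the same or go up in this scheme.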
"Invalid pixels are represented as NaN within the network, not zero."
I see. Since the raw KITTI data uses zero disparity values as a marker for invalidity, I thought that is what is also used in the network. I understand the downsampling layer now, thanks.
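For anyone else with the same confusion, the mapping between the two conventions could look roughly like this. This is a sketch that assumes the disparity map is already decoded into a list of rows; `kitti_zero_to_nan` is a hypothetical helper name, not something from the repository.

```python
import math

def kitti_zero_to_nan(disparity):
    """Replace KITTI's zero 'invalid' marker with NaN; keep other values as floats.
    Hypothetical helper for illustration."""
    return [[math.nan if v == 0 else float(v) for v in row]
            for row in disparity]

raw = [
    [0, 12.5],
    [3, 0],
]
converted = kitti_zero_to_nan(raw)
```

After this step a zero no longer looks like a real disparity value, so a NaN-aware layer can skip those pixels instead of averaging them in.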
@nikolausmayer related questions: which resizing algorithm does the Downsample layer use? Is it bilinear interpolation?
Also, what is the Resample layer, and how is it different from the Downsample layer?
@MrRoboticist Downsample uses (IIRC) bilinear interpolation with a threshold: if the source contains too many NaN values, the output is NaN as well (see downsample_layer.cu#L63).
Resample supports multiple algorithms, but has no backward pass. We use it for up- and downsampling in a deployed network. It is not used during training.
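The thresholding idea can be sketched as follows. Note the assumptions: this uses box averaging rather than bilinear interpolation, and the `max_nan_fraction` parameter and its default are invented for illustration; the actual rule and threshold value are whatever downsample_layer.cu implements.

```python
import math

def downsample_with_threshold(img, factor, max_nan_fraction=0.5):
    """Box-average factor x factor blocks, ignoring NaN pixels.
    If more than `max_nan_fraction` of a block is NaN, the output is NaN.
    Sketch only; parameter name and default are assumptions."""
    h, w = len(img), len(img[0])
    out = []
    for by in range(0, h - h % factor, factor):
        row = []
        for bx in range(0, w - w % factor, factor):
            block = [img[y][x]
                     for y in range(by, by + factor)
                     for x in range(bx, bx + factor)]
            valid = [v for v in block if not math.isnan(v)]
            nan_fraction = (len(block) - len(valid)) / len(block)
            if not valid or nan_fraction > max_nan_fraction:
                # Too little valid support: mark the output pixel invalid.
                row.append(math.nan)
            else:
                row.append(sum(valid) / len(valid))
        out.append(row)
    return out

img = [
    [1.0, math.nan, math.nan, math.nan],
    [3.0, 5.0,      math.nan, 7.0],
]
strict = downsample_with_threshold(img, 2)                          # default threshold
lenient = downsample_with_threshold(img, 2, max_nan_fraction=0.75)  # tolerate more NaN
```

With the default threshold the right-hand block (three of four pixels invalid) is rejected, while the lenient setting keeps its single valid value. A stricter threshold trades density for more reliable averages.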
Hi,
As you mention here, the Downsample layer downsamples the ground-truth blob to the size of the predicted-flow blob. Won't this be a problem while finetuning with KITTI images? KITTI images only contain sparse ground truth, with invalid pixels containing zero disparity values. If you resize the ground truth, won't it mess with the semantics of this labelling? What is the disadvantage of upsampling the predicted-flow to the ground-truth's resolution - is it the increased computation for processing larger resolutions?