gallenszl / CFNet

CFNet: Cascade and Fused Cost Volume for Robust Stereo Matching (CVPR 2021)
MIT License

Preprocessing for predicting on a custom dataset #26

Open jahad9819jjj opened 2 years ago

jahad9819jjj commented 2 years ago

Hi. I am thinking of applying your method to my own custom dataset, so I added the following code to the main block of save_disp.py, referencing datasets/sceneflow_dataset.py.

# test one sample
# @make_nograd_func
# def test_sample(sample):
#     model.eval()
#     disp_ests, pred1_s3_up, pred2_s4 = model(sample['left'].cuda(), sample['right'].cuda())
#     return disp_ests[-1]
@make_nograd_func
def test_sample(left, right):
    model.eval()
    disp_ests, pred1_s3_up, pred2_s4 = model(left.cuda(), right.cuda())
    return disp_ests[-1]

if __name__ == '__main__':
    left_img = Image.open("/media/A/left/0.png").convert("RGB")
    right_img = Image.open("/media/A/right/0.png").convert("RGB")

    w, h = left_img.size
    crop_w, crop_h = 950, 512
    left_img = left_img.crop((w-crop_w, h-crop_h, w, h))
    right_img = right_img.crop((w-crop_w, h-crop_h, w, h))

    processed = get_transform()
    left_img = processed(left_img)
    right_img = processed(right_img)
    test_sample(left_img, right_img)

Then I get the following error.

Mish activation loaded...
Mish activation loaded...
Mish activation loaded...
Mish activation loaded...
Mish activation loaded...
  File "/home/ubuntu/Apps/CFNet/models/cfnet.py", line 136, in forward
    x = self.firstconv(x)
  File "/home/ubuntu/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/ubuntu/.local/lib/python3.8/site-packages/torch/nn/modules/container.py", line 141, in forward
    input = module(input)
  File "/home/ubuntu/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/ubuntu/.local/lib/python3.8/site-packages/torch/nn/modules/container.py", line 141, in forward
    input = module(input)
  File "/home/ubuntu/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/ubuntu/.local/lib/python3.8/site-packages/torch/nn/modules/conv.py", line 446, in forward
    return self._conv_forward(input, self.weight, self.bias)
  File "/home/ubuntu/.local/lib/python3.8/site-packages/torch/nn/modules/conv.py", line 442, in _conv_forward
    return F.conv2d(input, weight, bias, self.stride,
RuntimeError: Expected 4-dimensional input for 4-dimensional weight [32, 3, 3, 3], but got 3-dimensional input of size [3, 512, 950] instead

This is probably caused by passing an incorrectly shaped input to the network.

How can I generate a disparity image from a custom dataset?

rs220122 commented 2 years ago

You should unsqueeze left_img and right_img; the model expects its input images to carry a batch dimension.
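
For example, a minimal sketch (the variable names follow the snippet in the question, and the shapes come from the error message above):

left_img = processed(left_img)        # shape [3, 512, 950]
right_img = processed(right_img)
left_img = left_img.unsqueeze(0)      # -> [1, 3, 512, 950], adds the batch dimension
right_img = right_img.unsqueeze(0)
test_sample(left_img, right_img)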

nickthorpie commented 2 years ago

I updated to

@make_nograd_func
def test_sample(left, right):
    model.eval()
    left = left.unsqueeze(0)
    right = right.unsqueeze(0)
    disp_ests, pred1_s3_up, pred2_s4 = model(left.cuda(), right.cuda())
    return disp_ests[-1]

Also, crop_w should be 960, as in the save_disp.py example.

I'm now receiving the error: Calculated padded input size per channel: (2 x 18 x 32). Kernel size: (3 x 3 x 3). Kernel size can't be greater than actual input size

I also tried unsqueezing along other dimensions, but that gives a channel mismatch.
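
One thing worth checking (untested): make sure the batch dimension is added exactly once with unsqueeze(0), so the model sees [1, 3, H, W], and that H and W are sizes the network can downsample cleanly. Below is a hedged sketch that pads the images instead of cropping them; the divisor of 32 and the top/right padding layout are assumptions borrowed from KITTI-style test preprocessing, not taken from CFNet's own loaders, so they may need adjusting.

import torch.nn.functional as F

def pad_to_multiple(img, divisor=32):
    # img: [3, H, W] tensor produced by get_transform().
    # Pad on the top and right so H and W become multiples of `divisor`
    # (the divisor is an assumption, not read from CFNet's code).
    _, h, w = img.shape
    pad_h = (divisor - h % divisor) % divisor
    pad_w = (divisor - w % divisor) % divisor
    # F.pad pads the last dims first: (W_left, W_right, H_top, H_bottom)
    return F.pad(img, (0, pad_w, pad_h, 0))

# after: left_img = processed(left_img); right_img = processed(right_img)
left_img = pad_to_multiple(left_img)
right_img = pad_to_multiple(right_img)
test_sample(left_img, right_img)  # the updated test_sample above adds the batch dimension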