JiaRenChang / RealtimeStereo

Attention-Aware Feature Aggregation for Real-time Stereo Matching on Edge Devices (ACCV, 2020)
GNU General Public License v3.0

About image crop #8

Open heguohao0728 opened 3 years ago

heguohao0728 commented 3 years ago

Sorry to disturb you again. I noticed that you crop the images to 288x576 during training. I think this operation loses the features outside the 288x576 window, like the blue area: [image: WeChat screenshot 20210419175711] So why crop like this? I also found that at test time the image is not heavily cropped (the size is 368x1232). Why can a model trained on 288x576 crops be applied to 368x1232 images and still produce a result? Is it because you also supervise with the 1/n (1/16 in this case) disparity during training? Thanks.

heguohao0728 commented 3 years ago

And how are the camera parameters taken into account? The camera that captured the SceneFlow dataset is different from the camera used for the KITTI dataset. Why can data from different cameras with different camera parameters be used to train the same network?

JiaRenChang commented 3 years ago


In training, we use a "random" crop.
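To illustrate what a random crop means for a stereo pair, here is a minimal sketch. It is not the repository's data loader: the function names are hypothetical, images are plain lists of rows to keep the example dependency-free, and only the 288x576 crop size comes from this thread. The window is drawn anew on every call, so over many iterations different regions of each image can be sampled, and the same window is applied to the left image, right image, and disparity map so they stay aligned.

```python
import random

def sample_crop_window(height, width, crop_h=288, crop_w=576, rng=random):
    """Pick a random top-left corner such that the crop fits inside the image."""
    if crop_h > height or crop_w > width:
        raise ValueError("crop is larger than the image")
    y0 = rng.randint(0, height - crop_h)  # randint is inclusive on both ends
    x0 = rng.randint(0, width - crop_w)
    return y0, x0

def crop(image_rows, y0, x0, crop_h=288, crop_w=576):
    """Crop a 2D image stored as a list of rows."""
    return [row[x0:x0 + crop_w] for row in image_rows[y0:y0 + crop_h]]

def random_crop_stereo(left, right, disp, crop_h=288, crop_w=576, rng=random):
    """Apply one shared random window to left, right, and disparity.

    Shifting both views by the same offset leaves the horizontal offset
    between matching pixels unchanged, so the disparity values themselves
    do not need to be rescaled.
    """
    y0, x0 = sample_crop_window(len(left), len(left[0]), crop_h, crop_w, rng)
    return tuple(crop(img, y0, x0, crop_h, crop_w) for img in (left, right, disp))
```

Calling `random_crop_stereo(left, right, disp)` once per training sample yields a different 288x576 window each time.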

The model does not need camera parameters because it predicts "disparity", where disparity = focal length * stereo camera distance (baseline) / depth.
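The formula above can be written out as a small sketch. The function names and the two camera rigs are hypothetical values chosen for easy arithmetic, not the actual SceneFlow or KITTI calibrations; the focal length is assumed to be in pixels and the baseline and depth in metres. The point is that the network output is a pixel offset, and focal length and baseline are only needed afterwards if that offset is converted to metric depth.

```python
def depth_to_disparity(focal_px, baseline_m, depth_m):
    """disparity (pixels) = focal length (pixels) * baseline (m) / depth (m)."""
    if depth_m <= 0:
        raise ValueError("depth must be positive")
    return focal_px * baseline_m / depth_m

def disparity_to_depth(focal_px, baseline_m, disparity_px):
    """Invert the formula: depth (m) = focal length * baseline / disparity."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px

# Two hypothetical rigs observing a point 10 m away.
rig_a = {"focal_px": 720.0, "baseline_m": 0.5}
rig_b = {"focal_px": 1000.0, "baseline_m": 0.25}

disp_a = depth_to_disparity(depth_m=10.0, **rig_a)  # 36.0 pixels
disp_b = depth_to_disparity(depth_m=10.0, **rig_b)  # 25.0 pixels

# Camera parameters enter only when converting a disparity back to depth.
depth_a = disparity_to_depth(disparity_px=disp_a, **rig_a)  # 10.0 metres
```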

heguohao0728 commented 3 years ago

Yes, I know the formula. Sorry, I did not describe the question very well. What I mean is: is the baseline length the same in these two datasets? SceneFlow has 15mm and 35mm focal lengths, and KITTI has a different one. Do you mean the focal length and the baseline length do not influence the network?

JiaRenChang commented 3 years ago

@heguohao0728 Yes, as you can see from the formula, the baseline and focal length are both normalized by depth.