JiaRenChang / RealtimeStereo

Attention-Aware Feature Aggregation for Real-time Stereo Matching on Edge Devices (ACCV, 2020)
GNU General Public License v3.0

About cost volume fusion #7

heguohao0728 closed this issue 3 years ago

heguohao0728 commented 3 years ago

Hi, sorry to disturb you again...

In your code RT_Strero.py, the cost volume is built with the following loop:

    for scale in range(len(feats_l)):
        if scale > 0:
            wflow = F.upsample(pred[scale - 1],
                               (feats_l[scale].size(2), feats_l[scale].size(3)),
                               mode='bilinear') * feats_l[scale].size(2) / img_size[2]
            cost = self._build_volume_2d3(feats_l[scale], feats_r[scale], 3, wflow, stride=1)
        else:
            cost = self._build_volume_2d(feats_l[scale], feats_r[scale], 12, stride=1)

In the line wflow = F.upsample(pred[scale - 1], (feats_l[scale].size(2), feats_l[scale].size(3)), mode='bilinear') * feats_l[scale].size(2) / img_size[2], the previous prediction pred[scale - 1] seems to have already been resized to 256*512 by disp_up = F.upsample(pred_low_res, (img_size[2], img_size[3]), mode='bilinear'). That is larger than every feature map in feats_l or feats_r, so this call actually downsamples the prediction. Why use the upsample function here rather than a conv? Thanks.
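For readers following the question: bilinear resizing works in both directions, so a function named "upsample" can also shrink a map when the target size is smaller than the input. In current PyTorch, F.upsample is deprecated in favor of F.interpolate, which takes a target size in the same way. The sketch below is plain Python, not PyTorch's implementation, and the names resize_bilinear and disparity_to_scale are hypothetical. It illustrates the two steps in the quoted line: resize the full-resolution disparity to the feature-map size, then multiply by the size ratio so the values are expressed in pixels of the smaller resolution.

```python
def resize_bilinear(grid, out_h, out_w):
    """Bilinearly resample a 2-D list of floats to (out_h, out_w).

    Uses pixel-centre alignment; works for enlarging and for shrinking.
    """
    in_h, in_w = len(grid), len(grid[0])
    out = []
    for i in range(out_h):
        # Map the output pixel centre back into input coordinates, clamped.
        y = min(max((i + 0.5) * in_h / out_h - 0.5, 0.0), in_h - 1.0)
        y0 = int(y)
        y1 = min(y0 + 1, in_h - 1)
        wy = y - y0
        row = []
        for j in range(out_w):
            x = min(max((j + 0.5) * in_w / out_w - 0.5, 0.0), in_w - 1.0)
            x0 = int(x)
            x1 = min(x0 + 1, in_w - 1)
            wx = x - x0
            top = grid[y0][x0] * (1 - wx) + grid[y0][x1] * wx
            bottom = grid[y1][x0] * (1 - wx) + grid[y1][x1] * wx
            row.append(top * (1 - wy) + bottom * wy)
        out.append(row)
    return out


def disparity_to_scale(disp_full, feat_h, feat_w):
    """Resize a full-resolution disparity map to a feature-map size and
    rescale its values into that resolution's pixel units."""
    full_h = len(disp_full)
    resized = resize_bilinear(disp_full, feat_h, feat_w)
    # Height ratio, as in the quoted code; it equals the width ratio
    # when both dimensions shrink by the same factor.
    ratio = feat_h / full_h
    return [[v * ratio for v in row] for row in resized]


# A constant 8-pixel disparity at 4x8 becomes 4 pixels at 2x4.
disp_full = [[8.0] * 8 for _ in range(4)]
disp_small = disparity_to_scale(disp_full, 2, 4)
```

The value scaling matters because disparity is measured in pixels: a shift of 8 pixels at full resolution corresponds to a shift of 4 pixels on a map with half the width.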

JiaRenChang commented 3 years ago

@heguohao0728 You could use deconv/conv to perform the upsampling/downsampling. We chose bilinear upsampling to reduce the number of trainable parameters, without a drop in performance.
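To make the parameter argument concrete, here is a minimal sketch of the count, assuming a standard 2-D convolution (or transposed convolution) with a square kernel and no channel groups. The helper conv_params and the layer sizes are hypothetical and are not taken from the repository. Bilinear interpolation has no learnable weights at all.

```python
def conv_params(in_ch, out_ch, kernel, bias=True):
    """Learnable parameters in a 2-D conv or transposed conv layer
    with a square kernel and no channel groups."""
    weights = in_ch * out_ch * kernel * kernel
    return weights + (out_ch if bias else 0)


# Bilinear interpolation is a fixed formula: nothing to train.
BILINEAR_PARAMS = 0

# Hypothetical alternatives for resizing a 1-channel disparity map.
small_conv = conv_params(1, 1, 3)                # 3x3 kernel, 1 -> 1 channel
wide_conv = conv_params(32, 32, 3, bias=False)   # 3x3 kernel on 32 channels
```

Even the smallest learned layer adds weights that must be trained and stored, while the bilinear path adds none.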

heguohao0728 commented 3 years ago

Got it! Thank you so much.