Open JunjieLiuSWU opened 1 year ago
@JunjieLiuSWU Hi, I have the same question. Have you run any experiments, and if so, what were the results?
In my opinion, making the backbone aware of both depth and detection is the best approach, but I also wonder what would happen if they were trained separately.
I don't mean using two backbones; there would still be a single backbone. The original code is: img_feat_with_depth = depth.unsqueeze(1) * depth_feature[:, self.depth_channels:(self.depth_channels + self.output_channels)].unsqueeze(2)
What I mean is, why not use detach() like this: img_feat_with_depth = depth.detach().unsqueeze(1) * depth_feature[:, self.depth_channels:(self.depth_channels + self.output_channels)].unsqueeze(2)
https://github.com/Megvii-BaseDetection/BEVDepth/blob/main/bevdepth/layers/backbones/base_lss_fpn.py#L527 Hello, why not call detach() on depth when multiplying the discrete depth by the features? If depth is detached, only the depth loss would be backpropagated to the depth net, and the detection loss would not. Would the depth estimation then be more accurate?
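For anyone who wants to check the gradient flow before running a full experiment, here is a minimal PyTorch sketch of the detach() variant discussed in this thread. It is not the BEVDepth code: `head`, `detection_grad`, the tensor sizes, and the squared-sum "detection loss" are all hypothetical stand-ins. It only assumes that one layer emits the depth-bin logits followed by the context channels, as the slicing in the quoted line suggests.

```python
import torch

torch.manual_seed(0)

# Hypothetical sizes: batch, depth bins, context channels, height, width.
B, D, C, H, W = 2, 4, 3, 5, 5

# Stand-in for the last layer of a depth net: the first D output channels
# are depth logits, the next C are context features (an assumption).
head = torch.nn.Conv2d(8, D + C, kernel_size=1)
x = torch.randn(B, 8, H, W)


def detection_grad(detach_depth: bool):
    """Return (|grad| on depth-logit weights, |grad| on context weights)
    produced by a stand-in detection loss."""
    head.zero_grad()
    depth_feature = head(x)
    depth = depth_feature[:, :D].softmax(dim=1)
    if detach_depth:
        depth = depth.detach()  # block detection gradients through depth
    # Same broadcasting pattern as the quoted line: (B, C, D, H, W).
    img_feat_with_depth = (
        depth.unsqueeze(1) * depth_feature[:, D:D + C].unsqueeze(2)
    )
    loss = img_feat_with_depth.pow(2).sum()  # stand-in for a detection loss
    loss.backward()
    grad = head.weight.grad
    return grad[:D].abs().sum().item(), grad[D:].abs().sum().item()


print("detached:    ", detection_grad(True))
print("not detached:", detection_grad(False))
```

In this sketch, detaching removes the detection gradient on the depth-logit weights only. The context weights still receive it, so any layers shared upstream of both outputs would still be updated by the detection loss. Whether that makes the depth estimate more accurate is an empirical question this sketch does not answer.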