DerryHub / BEVFormer_tensorrt

BEVFormer inference on TensorRT, including INT8 Quantization and Custom TensorRT Plugins (float/half/half2/int8).
Apache License 2.0

What's the purpose of prod for bev_mask #89

Open Jian-danai opened 11 months ago

Jian-danai commented 11 months ago

https://github.com/DerryHub/BEVFormer_tensorrt/blob/303d3140c14016047c07f9db73312af364f0dd7c/det2trt/models/modules/encoder.py#L256C10-L258C84

```python
bev_mask = (1 - (1 - bev_mask).prod(0)).view(6, -1, 1)
bev_mask = bev_mask / torch.clamp(bev_mask.sum(0, keepdims=True), min=1e-4)
```

What's the purpose of this? It makes bev_mask different from the one produced by the original point_sampling function.

Where do you handle that difference? (It seems bev_mask is not used afterwards, so perhaps it does not matter?)
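For context, here is my current reading of the arithmetic as a toy check (shapes are made up, not the real BEVFormer ones): the prod trick looks like a logical OR across the reduced dimension written with plain arithmetic instead of boolean reductions, and the second line looks like a normalization into per-view weights. Please correct me if that reading is wrong.

```python
import torch

# Toy check: for a mask m with values in {0, 1},
# 1 - (1 - m).prod(dim) equals "any along dim", i.e. a logical OR,
# expressed with arithmetic ops only (which TensorRT handles natively).
m = torch.tensor([[0.0, 1.0, 1.0],
                  [0.0, 0.0, 1.0]])   # (2 views, 3 queries), illustrative only

or_via_prod = 1 - (1 - m).prod(0)    # 1.0 where at least one view covers the query
assert torch.equal(or_via_prod, (m > 0).any(0).float())

# The division then turns the mask into per-query weights over views,
# with the clamp guarding against division by zero for uncovered queries.
w = m / torch.clamp(m.sum(0, keepdim=True), min=1e-4)
print(w.sum(0))  # tensor([0., 1., 1.]): weights sum to 1 wherever some view covers the query
```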

Another question: https://github.com/DerryHub/BEVFormer_tensorrt/blob/303d3140c14016047c07f9db73312af364f0dd7c/det2trt/models/modules/encoder.py#L297

```python
shift_ref_2d = ref_2d.clone()
shift_ref_2d = shift_ref_2d + shift.view(1, 1, 1, 2) * use_prev_bev
```

If clone() is used there, it will not match the original BEVFormer, right? In the original, ref_2d changes whenever shift_ref_2d changes, since the shift is added in place and the two tensors share storage. Besides, the original shift_ref_2d always has shift.view(1, 1, 1, 2) added, even for the first sample (i.e., when use_prev_bev = 0).
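For reference, here is a toy comparison of the two behaviours as I understand them (shapes are illustrative, not the real ones):

```python
import torch

ref_2d = torch.zeros(1, 4, 1, 2)
shift = torch.tensor([0.1, 0.2])

# Original BEVFormer style: no clone + in-place +=, so shift_ref_2d and
# ref_2d share storage and ref_2d gets shifted too.
ref_a = ref_2d.clone()                     # fresh copy standing in for ref_2d
shift_ref_a = ref_a                        # shift_ref_2d = ref_2d (no clone)
shift_ref_a += shift[None, None, None, :]  # in-place: mutates ref_a as well
print(ref_a[0, 0, 0])                      # tensor([0.1000, 0.2000]): ref_2d changed too

# This repo's style: clone + out-of-place add gated by use_prev_bev, so
# ref_2d keeps its values and no shift is applied on the first sample.
use_prev_bev = torch.tensor(0.0)
ref_b = ref_2d.clone()
shift_ref_b = ref_b.clone()
shift_ref_b = shift_ref_b + shift.view(1, 1, 1, 2) * use_prev_bev
print(ref_b[0, 0, 0])                      # tensor([0., 0.]): ref_2d untouched
print(shift_ref_b[0, 0, 0])                # tensor([0., 0.]): no shift when use_prev_bev = 0
```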