wzzheng / TPVFormer

[CVPR 2023] An academic alternative to Tesla's occupancy network for autonomous driving.
https://wzzheng.net/TPVFormer/
Apache License 2.0

Question about Cross-view Hybrid attention #29

Open jianingwangind opened 1 year ago

jianingwangind commented 1 year ago

Thanks for sharing the great work.

Regarding Cross-view Hybrid attention, is it only applied to the HW (top) plane? https://github.com/wzzheng/TPVFormer/blob/207358903342962df1b69cd7da27301241df39dc/tpvformer04/modules/tpvformer_layer.py#L172

Here the query is the TPV query itself, and key and value are both None, while later in cross-view hybrid attention the value is set to the concatenation of the queries: https://github.com/wzzheng/TPVFormer/blob/207358903342962df1b69cd7da27301241df39dc/tpvformer04/modules/cross_view_hybrid_attention.py#L163
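
For reference, here is a minimal sketch of the mechanism as I understand it from the code (plain multi-head attention rather than the repo's deformable implementation; names like `tpv_hw`, `tpv_zh`, `tpv_wz` and the shapes are illustrative): each plane's queries attend to the concatenation of the queries from all three planes.

```python
import torch
import torch.nn as nn

class ToyCrossViewHybridAttention(nn.Module):
    """Illustrative sketch: each TPV plane attends to all three planes' queries."""

    def __init__(self, embed_dims=256, num_heads=8):
        super().__init__()
        self.attn = nn.MultiheadAttention(embed_dims, num_heads, batch_first=True)

    def forward(self, tpv_hw, tpv_zh, tpv_wz):
        # tpv_hw: (B, H*W, C), tpv_zh: (B, Z*H, C), tpv_wz: (B, W*Z, C)
        # Key/value = concatenation of the three plane queries, so features
        # from the other two planes can flow into each plane.
        kv = torch.cat([tpv_hw, tpv_zh, tpv_wz], dim=1)
        out_hw, _ = self.attn(tpv_hw, kv, kv)
        out_zh, _ = self.attn(tpv_zh, kv, kv)
        out_wz, _ = self.attn(tpv_wz, kv, kv)
        return out_hw, out_zh, out_wz
```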

yuhanglu2000 commented 1 year ago

I have the same question; it looks like there is no interaction between the features of the three planes.

huang-yh commented 1 year ago

Thanks for your interest in our work. Your understanding of the code is correct. That is, in TPVFormer04, cross-view hybrid attention is enabled only in the HW plane, thus degrading to self-attention.
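
To illustrate the degenerate case described above (a sketch with illustrative names and shapes, not the repo's deformable implementation): when cross-view hybrid attention is enabled only for the HW plane, the key/value reduce to the HW queries themselves, so the layer is ordinary self-attention on that plane.

```python
import torch
import torch.nn as nn

attn = nn.MultiheadAttention(embed_dim=256, num_heads=8, batch_first=True)
tpv_hw = torch.randn(1, 100 * 100, 256)   # (B, H*W, C) HW-plane queries
out_hw, _ = attn(tpv_hw, tpv_hw, tpv_hw)  # query = key = value: plain self-attention
```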

jianingwangind commented 1 year ago

@huang-yh Thanks for your reply. May I further ask about the idea behind this? Is the performance similar when the attention in the other two planes is disabled?