OpenGVLab / UniFormerV2

[ICCV2023] UniFormerV2: Spatiotemporal Learning by Arming Image ViTs with Video UniFormer
https://arxiv.org/abs/2211.09552
Apache License 2.0
294 stars 19 forks

Why is the first module named Local_MHRA (or Local MHRA Temporal) when the code is a depthwise 3D convolution? #57

Closed zxin4506 closed 11 months ago

zxin4506 commented 11 months ago

Why is the first module named Local_MHRA, or Local MHRA Temporal? Isn't the code just a depthwise 3D convolution? My understanding was that MHRA stands for MultiHeadResidualAttention.

Andy1621 commented 11 months ago

For MHRA, please refer to UniFormerV1. V2 carries over the concept from V1: MHRA stands for Multi-Head Relation Aggregator.
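
To make the connection between the two names concrete, here is a minimal sketch (not code from the repository) of why a depthwise temporal convolution can be read as a local relation aggregator. It assumes the local affinity depends only on the relative offset between two time steps and is zero outside a small window, which is how UniFormerV1 describes its local MHRA. The function names, the `[C][T]` list layout, and the zero padding are illustrative assumptions; the surrounding pointwise projections and normalization are omitted.

```python
def depthwise_temporal_conv(x, kernels):
    """Depthwise 1D convolution along time with zero padding.

    x:       list of C channels, each a list of T floats
    kernels: list of C kernels, each a list of K floats (K odd)
    Uses the cross-correlation convention common in deep learning libraries.
    """
    num_steps = len(x[0])
    pad = len(kernels[0]) // 2
    out = []
    for channel, kernel in zip(x, kernels):
        row = []
        for i in range(num_steps):
            acc = 0.0
            for k, weight in enumerate(kernel):
                j = i + k - pad  # neighbouring time step covered by this tap
                if 0 <= j < num_steps:
                    acc += weight * channel[j]
            row.append(acc)
        out.append(row)
    return out


def local_affinity(kernel, num_steps):
    """Banded T x T affinity: A[i][j] = kernel[j - i + pad] inside the window, else 0."""
    pad = len(kernel) // 2
    return [
        [kernel[j - i + pad] if abs(j - i) <= pad else 0.0 for j in range(num_steps)]
        for i in range(num_steps)
    ]


def local_relation_aggregate(x, kernels):
    """Aggregate each channel as A @ v, with A built from that channel's kernel."""
    num_steps = len(x[0])
    out = []
    for channel, kernel in zip(x, kernels):
        affinity = local_affinity(kernel, num_steps)
        out.append(
            [sum(affinity[i][j] * channel[j] for j in range(num_steps))
             for i in range(num_steps)]
        )
    return out


# Toy input: 2 channels, 4 time steps, kernel size 3 (all values are made up).
x = [[1.0, 2.0, 3.0, 4.0], [0.5, -1.0, 2.0, 0.0]]
kernels = [[1.0, 0.0, -1.0], [0.5, 0.5, 0.5]]

conv_out = depthwise_temporal_conv(x, kernels)
agg_out = local_relation_aggregate(x, kernels)
```

On this toy input both functions give the same result, so the depthwise kernel weights play the role of a learnable, input-independent affinity over a local temporal neighbourhood. In that reading the "relation" in MHRA does not have to be a self-attention map; a convolution kernel is the local special case.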