lumingzzz / TIC

[DCC 2022] Transformer-based Image Compression
https://NJUVISION.github.io/TIC
Apache License 2.0

Causal Attention Module #1

Closed xuezhongcailian closed 1 year ago

xuezhongcailian commented 1 year ago

Hello author: how should I understand the Causal Attention Module? I read its code, but I don't quite follow it:

```python
x_masked = x_unfold * self.mask.to(x_unfold.device)
attn = (q @ k.transpose(-2, -1))  # BHW, num_heads, PP, PP
```

Is this consistent with the figure below from the paper?

[figure from the paper]

lumingzzz commented 1 year ago

Sorry for the late reply. The CAM version in this repo is slightly different from the paper: we use the "unfold" operation to first get BxB (5x5) blocks, and then the attention map is calculated within each block. The 0/1 mask is multiplied on the input rather than on the attention map to enforce causality.
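For anyone else trying to follow this, here is a minimal single-head sketch of that description, assuming one 5x5 window per spatial location (which is what the "BHW, ..., PP, PP" shape in the question suggests). The names (`win`, `qkv`, the toy tensor sizes) are my own illustration, not the repo's exact code:

```python
import torch
import torch.nn.functional as F

B, C, H, W = 1, 8, 16, 16                 # toy feature map
win = 5                                   # 5x5 window, as in the reply above
P = win * win                             # 25 tokens per window

x = torch.randn(B, C, H, W)

# One overlapping 5x5 window per spatial location: (B, C*P, H*W)
x_unfold = F.unfold(x, kernel_size=win, padding=win // 2)
# Reshape to (B*H*W, P, C): a 25-token sequence for every location
x_unfold = x_unfold.view(B, C, P, H * W).permute(0, 3, 2, 1).reshape(-1, P, C)

# 0/1 causal mask in raster-scan order: keep positions up to and including
# the window centre, zero out the not-yet-decoded "future" positions.
# Note it is multiplied on the INPUT, not on the attention map.
mask = torch.zeros(P, 1)
mask[: P // 2 + 1] = 1.0
x_masked = x_unfold * mask.to(x_unfold.device)

# Plain scaled dot-product attention inside each window (single head here;
# the "BHW, num_heads, PP, PP" shape in the question adds a head dimension).
qkv = torch.nn.Linear(C, 3 * C)
q, k, v = qkv(x_masked).chunk(3, dim=-1)          # each (B*H*W, P, C)
attn = (q @ k.transpose(-2, -1)) * C ** -0.5      # (B*H*W, P, P)
attn = attn.softmax(dim=-1)
out = attn @ v                                     # (B*H*W, P, C)
```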