XiaoyuShi97 opened this issue 3 years ago
See Annex C:

> In the cross-attention module, inputs are first processed with layer norm (Ba et al., 2016) before being passed through linear layers to produce each of the query, key, and value inputs to the QKV cross-attention operation.
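In other words, the context (the key/value input) gets its own layer norm before the K and V projections, just as the latent queries do before the Q projection. A minimal PyTorch sketch of this (assumed shapes and names for illustration, not the repo's exact code) looks like:

```python
import torch
import torch.nn as nn

class CrossAttention(nn.Module):
    """Sketch of cross-attention with an optional norm on the context.

    When norm_context=True, the context tensor is layer-normed before
    the key/value projections, matching the Annex C description above.
    """

    def __init__(self, query_dim, context_dim, norm_context=True):
        super().__init__()
        self.norm_q = nn.LayerNorm(query_dim)
        # norm_context toggles the layer norm on the key/value input
        self.norm_ctx = nn.LayerNorm(context_dim) if norm_context else nn.Identity()
        self.to_q = nn.Linear(query_dim, query_dim, bias=False)
        self.to_k = nn.Linear(context_dim, query_dim, bias=False)
        self.to_v = nn.Linear(context_dim, query_dim, bias=False)

    def forward(self, x, context):
        x = self.norm_q(x)                # layer norm on the latent queries
        context = self.norm_ctx(context)  # layer norm on the context (norm_context)
        q = self.to_q(x)
        k = self.to_k(context)
        v = self.to_v(context)
        attn = (q @ k.transpose(-2, -1)) / q.shape[-1] ** 0.5
        return attn.softmax(dim=-1) @ v

# usage: latents of dim 8 attend over a context of dim 16
layer = CrossAttention(query_dim=8, context_dim=16, norm_context=True)
out = layer(torch.randn(2, 4, 8), torch.randn(2, 10, 16))
```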
Hi, thanks for sharing the code! I wonder what `norm_context` refers to in the paper? https://github.com/lucidrains/perceiver-pytorch/blob/3b70ebee00c66f15b38c5980f4275f744a433895/perceiver_pytorch/perceiver_io.py#L125