Some questions about the Transformer module - Githubissues

caiyuanhao1998 / Retinexformer

"Retinexformer: One-stage Retinex-based Transformer for Low-light Image Enhancement" (ICCV 2023) & (NTIRE 2024 Challenge)

https://arxiv.org/abs/2303.06705

MIT License

920 stars 81 forks

Some questions about the Transformer module #105

Closed Shecyy closed 2 months ago

Shecyy commented 2 months ago

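Hello, I noticed that your code uses the Transformer module's V as the positional encoding, drops the LN before self-attention, and keeps the LN before the FFN. What is the purpose of the positional encoding? And why drop the LN before attention?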

caiyuanhao1998 commented 2 months ago

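V stands for value. Applying the positional encoding to the transformed values that represent the features gives a stronger hint of each token's position, which makes more sense.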
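To make the idea concrete, here is a minimal pure-Python sketch, not the repo's code. It assumes the positional branch is a depthwise convolution over V whose output is added to the attention output; that assumption, the 1D token layout, the identity Q/K/V projections, and the fixed `kernel` weights are all illustrative, so check `RetinexFormer_arch.py` for the actual layers. The sketch shows that plain attention ignores token order, while a convolution over V makes the output depend on each token's neighbours.

```python
import math

def softmax(row):
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def attention(q, k, v):
    """Plain dot-product attention over lists of token vectors."""
    dim = len(v[0])
    out = []
    for qi in q:
        weights = softmax([sum(a * b for a, b in zip(qi, kj)) for kj in k])
        out.append([sum(w * vj[d] for w, vj in zip(weights, v)) for d in range(dim)])
    return out

def depthwise_conv1d(v, kernel):
    """Per-channel 1D convolution along the token axis, zero padded."""
    n, dim, half = len(v), len(v[0]), len(kernel) // 2
    out = []
    for i in range(n):
        row = []
        for d in range(dim):
            acc = 0.0
            for t, w in enumerate(kernel):
                j = i + t - half
                if 0 <= j < n:
                    acc += w * v[j][d]
            row.append(acc)
        out.append(row)
    return out

def attention_with_value_pos(q, k, v, kernel):
    """Attention output plus a positional term computed from V."""
    content = attention(q, k, v)
    position = depthwise_conv1d(v, kernel)
    return [[c + p for c, p in zip(cr, pr)] for cr, pr in zip(content, position)]

def all_close(a, b, tol=1e-9):
    return all(abs(x - y) <= tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, -0.5]]
perm = [2, 0, 3, 1]
shuffled = [tokens[i] for i in perm]
kernel = [0.25, 0.5, 0.25]  # hypothetical fixed weights; learned in a real model

plain = attention(tokens, tokens, tokens)
plain_shuffled = attention(shuffled, shuffled, shuffled)
# Attention alone is order-blind: shuffling the inputs only shuffles the outputs.
print(all_close(plain_shuffled, [plain[i] for i in perm]))  # True

with_pos = attention_with_value_pos(tokens, tokens, tokens, kernel)
with_pos_shuffled = attention_with_value_pos(shuffled, shuffled, shuffled, kernel)
# With the convolution over V, the same token yields a different output
# once its neighbours change, so position information is present.
print(all_close(with_pos_shuffled, [with_pos[i] for i in perm]))  # False
```

On an image feature map the same idea would use a 2D depthwise convolution over the H x W grid instead of the 1D version used here.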

The LN for self-attention has not been dropped. It is located here:

https://github.com/caiyuanhao1998/Retinexformer/blob/master/basicsr/models/archs/RetinexFormer_arch.py#L166
https://github.com/caiyuanhao1998/Retinexformer/blob/master/basicsr/models/archs/RetinexFormer_arch.py#L167

If you find our repo useful, please give it a star to support us. Thanks :)