Question about the mix parameter - Githubissues

zhouhaoyi / Informer2020

The GitHub repository for the paper "Informer" accepted by AAAI 2021.

Apache License 2.0

5.27k stars 1.1k forks

Question about the mix parameter #550

Open qiyuxinlin opened 1 year ago

qiyuxinlin commented 1 year ago

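Hello author, while reading your code I got confused about one thing. The Encoder and Decoder contain three attention layers in total. Assuming all of them use your FullAttention module, why does the first attention layer in the Decoder set mix to True? As far as I can tell, this parameter mainly takes effect later, when the multiple heads are merged back into one, where it swaps the last two dimensions of the data. I don't understand why this is done. After I changed it to False, the results got worse. What is the reason for this?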

harrycoder28 commented 9 months ago

It is not the last two dimensions that get swapped, but the middle two. The shape of out is (batch, L, H, d_values); when self.mix == True, the L and H dimensions are swapped.
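
To make the effect of that swap concrete, here is a minimal pure-Python sketch for a single batch element. It assumes the merge step swaps the L and H axes and then reshapes the result back to L rows of H * d_values entries (roughly transpose(2, 1) followed by a view to (batch, L, -1)). That reshape step is my assumption and is not quoted from the repository. The sizes, the string labels, and the helper name merge_heads are all made up for illustration.

```python
# Hypothetical sizes for illustration: L time steps, H heads, D values per head.
L, H, D = 3, 2, 2

# out[l][h] is the D-vector produced by head h at time step l.
# Every entry is labelled "t{l}h{h}" so we can track where it ends up.
out = [[[f"t{l}h{h}"] * D for h in range(H)] for l in range(L)]


def merge_heads(out, mix):
    """Flatten an (L, H, D) nested list into L rows of H*D entries.

    mix=False: row l is the concatenation of all heads at time step l.
    mix=True: swap the L and H axes first, then cut the row-major
    flattening into L rows (assumed to mimic transpose + view).
    """
    n_steps, n_heads, n_vals = len(out), len(out[0]), len(out[0][0])
    if mix:
        # Swap the two middle axes: (L, H, D) -> (H, L, D)
        out = [[out[l][h] for l in range(n_steps)] for h in range(n_heads)]
    # Row-major flattening of whatever layout we have now
    flat = [x for outer in out for inner in outer for x in inner]
    width = n_heads * n_vals
    return [flat[i * width:(i + 1) * width] for i in range(n_steps)]


for mix in (False, True):
    print(f"mix={mix}")
    for row in merge_heads(out, mix):
        print("  ", row)
```

Under these assumptions, with mix=False each output row holds every head for one time step, while with mix=True a row holds consecutive time steps from the same head, so the features of different time steps get interleaved. This only illustrates what the reshape does to the layout; it does not by itself explain why the results get worse when mix is set to False.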