marian-nmt / marian

Fast Neural Machine Translation in C++
https://marian-nmt.github.io

I want to apply multi-query attention to Marian, but “an illegal memory access was encountered” #382

Open Sarah-Callies opened 2 years ago

Sarah-Callies commented 2 years ago

Bug description

The URL of the paper: https://www.researchgate.net/publication/337074940_Fast_Transformer_Decoding_One_Write-Head_is_All_You_Need

How to reproduce

When I slice the key and value in the attention layer (LayerAttention() -> MultiHead()), the sliced key and value layers both have the shape [batch_size, 1, 10, 64] instead of the original multi-head shape [batch_size, 8, 10, 64].

The bug was encountered during the backward pass, in the operation z = z + mask:

mask shape: [128, 1, 1, 10]
z shape: [128, 8, 10, 10]
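To make the shapes above easier to reason about, here is a minimal standalone sketch of the shape arithmetic involved. It does not use Marian at all: `Shape`, `broadcastShape`, `bmmShape`, and `reducedAxes` are hypothetical helpers written for illustration, and they assume NumPy-style broadcasting. Whether Marian's batched product and its backward kernels actually support a broadcast head axis is not confirmed here; that is one thing worth checking.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper type, not Marian's Shape class.
using Shape = std::vector<int>;

// NumPy-style broadcast of two shapes, aligned from the last axis.
// Throws if a pair of dimensions differs and neither of them is 1.
Shape broadcastShape(const Shape& a, const Shape& b) {
  std::size_t rank = std::max(a.size(), b.size());
  Shape out(rank, 1);
  for (std::size_t i = 0; i < rank; ++i) {
    // Missing leading dimensions are treated as 1.
    int da = i < a.size() ? a[a.size() - 1 - i] : 1;
    int db = i < b.size() ? b[b.size() - 1 - i] : 1;
    if (da != db && da != 1 && db != 1)
      throw std::invalid_argument("shapes are not broadcastable");
    out[rank - 1 - i] = std::max(da, db);
  }
  return out;
}

// Result shape of a batched matrix product a x b (assumed semantics):
// leading axes broadcast, last two axes follow [m, k] x [k, n] -> [m, n].
Shape bmmShape(const Shape& a, const Shape& b) {
  assert(a.size() >= 2 && b.size() >= 2);
  if (a[a.size() - 1] != b[b.size() - 2])
    throw std::invalid_argument("inner dimensions do not match");
  Shape batchA(a.begin(), a.end() - 2);
  Shape batchB(b.begin(), b.end() - 2);
  Shape out = broadcastShape(batchA, batchB);
  out.push_back(a[a.size() - 2]);
  out.push_back(b[b.size() - 1]);
  return out;
}

// Axes on which `operand` (size 1) was expanded to reach `result`.
// In the backward pass the gradient has to be summed over these axes
// before it is written into a buffer of the operand's own shape.
std::vector<int> reducedAxes(const Shape& operand, const Shape& result) {
  assert(operand.size() == result.size());
  std::vector<int> axes;
  for (std::size_t i = 0; i < operand.size(); ++i)
    if (operand[i] == 1 && result[i] != 1)
      axes.push_back(static_cast<int>(i));
  return axes;
}
```

With these helpers, `bmmShape({128, 8, 10, 64}, {128, 1, 64, 10})` gives the shape of z, and `broadcastShape` of z with the mask shape leaves it unchanged, so the forward shapes are consistent under this assumption. `reducedAxes({128, 1, 10, 64}, {128, 8, 10, 64})` reports the head axis: the gradient flowing back to the sliced key or value covers 8 heads but must land in a 1-head buffer. If a backward kernel assumes both operands carry the full head dimension, it could read or write past that smaller buffer, which would be one possible (unverified) source of an illegal memory access.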

Did I explain it clearly? I hope you can help me find the reason. Regards