Open shagharabbani opened 1 year ago
Hi,
Would it be possible to apply masking only in the decoder's single-head attention (SHA)? As far as I can tell, masking is currently applied in both the multi-head attention (MHA) and the SHA in the decoder.
Best, Shaghayegh
Hi @shagharabbani, I think this would definitely be possible, but it is not currently implemented. I'm not completely sure why you'd want that, but feel free to try it!