rwth-i6 / pytorch-to-returnn

Make PyTorch code runnable within RETURNN

add F.multi_head_attention_forward #39

Closed: vieting closed this PR 3 years ago

vieting commented 3 years ago

Adds F.multi_head_attention_forward and a test case which is a simplified version of the MultiheadAttention module in fairseq.
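For context, a minimal sketch of what the new wrapper has to support: calling `F.multi_head_attention_forward` directly with explicit projection weights, in a self-attention setup similar to the simplified fairseq test case. The shapes and parameter values here are illustrative assumptions, not taken from the actual test; inputs follow PyTorch's `(seq_len, batch, embed)` convention.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
L, N, E, num_heads = 4, 2, 8, 2  # seq len, batch, embed dim, heads

# Self-attention: query, key and value are the same tensor.
x = torch.randn(L, N, E)

# Explicit projection parameters, as the functional API expects them.
in_proj_weight = torch.randn(3 * E, E) * 0.1   # stacked q/k/v projections
in_proj_bias = torch.zeros(3 * E)
out_proj_weight = torch.randn(E, E) * 0.1
out_proj_bias = torch.zeros(E)

attn_output, attn_weights = F.multi_head_attention_forward(
    x, x, x,
    embed_dim_to_check=E, num_heads=num_heads,
    in_proj_weight=in_proj_weight, in_proj_bias=in_proj_bias,
    bias_k=None, bias_v=None, add_zero_attn=False,
    dropout_p=0.0,
    out_proj_weight=out_proj_weight, out_proj_bias=out_proj_bias,
    training=False, need_weights=True,
)
print(attn_output.shape)  # (L, N, E)
print(attn_weights.shape)  # (N, L, L), averaged over heads
```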

albertz commented 3 years ago

This actually doesn't fix anything, except that it adds multi_head_attention_forward. So all worked fine now? (After the previous fixes.)

vieting commented 3 years ago

Yes, all the fixes that I needed to get this to work were handled in separate PRs before. Not sure what you mean by "all worked fine now", but at least the test here works. It's a rather simple case, though; I didn't test all the options, e.g. using bias_k, bias_v, or add_zero_attn.
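For reference, a hedged sketch of the untested code path mentioned above: passing `bias_k`/`bias_v` appends a learned bias slot to the keys and values, and `add_zero_attn=True` appends one more all-zero slot, so the attention-weight matrix grows along the source dimension. All shapes and values here are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
L, N, E, num_heads = 4, 2, 8, 2

x = torch.randn(L, N, E)

out, weights = F.multi_head_attention_forward(
    x, x, x,
    embed_dim_to_check=E, num_heads=num_heads,
    in_proj_weight=torch.randn(3 * E, E) * 0.1,
    in_proj_bias=torch.zeros(3 * E),
    bias_k=torch.randn(1, 1, E),  # bias slot appended to the keys
    bias_v=torch.randn(1, 1, E),  # bias slot appended to the values
    add_zero_attn=True,           # one extra all-zero key/value slot
    dropout_p=0.0,
    out_proj_weight=torch.randn(E, E) * 0.1,
    out_proj_bias=torch.zeros(E),
    training=False, need_weights=True,
)
# Source length grows by 2: one slot from bias_k/bias_v, one from add_zero_attn.
print(out.shape)      # (L, N, E)
print(weights.shape)  # (N, L, L + 2)
```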

albertz commented 3 years ago

but at least the test here works

Yes, that's what I mean. I assume it reflects what you actually need to go forward with Wav2Vec2 or other things.