The README.md suggests using torch 1.3.0, but I could not find that version on PyTorch's previous-versions page (link).
So I used the latest version of torch (1.7.1) instead, and when I started training I got a RuntimeError.
I then traced the error to line 255 of prophetnet/ngram_multihead_attention.py:

```python
q = q * self.scaling
```

It looks like this operation is no longer allowed there under torch 1.7.1, so I fixed the problem with the following change:
```python
# keep the scaled query in a separate tensor q_
q_ = q * self.scaling

if self.bias_k is not None:
    assert self.bias_v is not None
    k = torch.cat([k, self.bias_k.repeat(1, bsz, 1)])
    v = torch.cat([v, self.bias_v.repeat(1, bsz, 1)])

q = q_.contiguous().view(tgt_len, bsz * self.num_heads, self.head_dim).transpose(0, 1)
```
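For anyone hitting the same thing, here is a minimal, self-contained sketch of the view/in-place restriction that newer PyTorch (roughly 1.5+, including 1.7.1) enforces, which is my best guess at what this error is about. Everything in it (the tensor x, its shape, the scaling constant) is made up for illustration and is not taken from the ProphetNet code:

```python
import torch

x = torch.randn(6, 8, requires_grad=True)
q, k, v = x.chunk(3, dim=0)   # q, k, v are views of x (outputs of a multi-view op)

scaling = 0.125

try:
    q *= scaling              # in-place update of such a view is rejected
except RuntimeError as err:
    print("in-place update rejected:", err)

q_ = q * scaling              # out-of-place multiply allocates a new tensor: allowed
q_.sum().backward()           # gradients still flow back to x
print(x.grad.shape)           # torch.Size([6, 8])
```

In other words, keeping the scaled result in a new tensor, as in the change above, sidesteps the restriction without changing the math.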