ethen8181 / machine-learning

🌎 machine learning tutorials (mainly in Python3)
MIT License
3.17k stars, 650 forks

Possible error in the torch transformer #13

Closed. orangetwo closed this 3 years ago

orangetwo commented 3 years ago

class MultiHeadAttention(nn.Module):

This class does not scale the product of Query and Key before the softmax. Also, in the forward function, it seems that the function should return linear_proj, not output?
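For illustration, here is a minimal sketch of what the two points would look like in a generic multi-head self-attention module. It is not the repository's code. The layer names (`q_proj`, `k_proj`, `v_proj`, `out_proj`) and the constructor arguments are assumptions made for the example; only `MultiHeadAttention`, `output` and `linear_proj` come from the issue itself. The scores are divided by the square root of the per-head dimension, and `forward` returns the result of the final linear projection.

```python
import math

import torch
import torch.nn as nn


class MultiHeadAttention(nn.Module):
    """Illustrative multi-head self-attention (hypothetical layout, not the repo's code)."""

    def __init__(self, d_model: int, n_heads: int):
        super().__init__()
        assert d_model % n_heads == 0, "d_model must be divisible by n_heads"
        self.n_heads = n_heads
        self.d_head = d_model // n_heads
        # Hypothetical layer names, chosen for this sketch only.
        self.q_proj = nn.Linear(d_model, d_model)
        self.k_proj = nn.Linear(d_model, d_model)
        self.v_proj = nn.Linear(d_model, d_model)
        self.out_proj = nn.Linear(d_model, d_model)

    def _split_heads(self, x: torch.Tensor) -> torch.Tensor:
        # (batch, seq, d_model) -> (batch, heads, seq, d_head)
        batch, seq_len, _ = x.shape
        return x.view(batch, seq_len, self.n_heads, self.d_head).transpose(1, 2)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        batch, seq_len, d_model = x.shape
        q = self._split_heads(self.q_proj(x))
        k = self._split_heads(self.k_proj(x))
        v = self._split_heads(self.v_proj(x))

        # Point 1: scale the Query-Key product by sqrt(d_head) before the softmax.
        scores = q @ k.transpose(-2, -1) / math.sqrt(self.d_head)
        weights = torch.softmax(scores, dim=-1)
        output = weights @ v  # (batch, heads, seq, d_head)

        # Merge the heads back into (batch, seq, d_model).
        output = output.transpose(1, 2).contiguous().view(batch, seq_len, d_model)

        # Point 2: return the projected tensor, not the pre-projection output.
        linear_proj = self.out_proj(output)
        return linear_proj
```

With this layout, calling the module on a tensor of shape `(batch, seq_len, d_model)` returns a tensor of the same shape. If `output` were returned instead, the `out_proj` layer would never influence the result and would receive no gradient.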

ethen8181 commented 3 years ago

@orangetwo Thanks for spotting this. I've checked in the fix. Feel free to close this once you've confirmed it.