gordicaleksa / pytorch-original-transformer

My implementation of the original transformer model (Vaswani et al.). I've additionally included the playground.py file for visualizing otherwise seemingly hard concepts. IWSLT pretrained models are currently included.
https://youtube.com/c/TheAIEpiphany
MIT License
983 stars 169 forks

Sorry, but I couldn't find the concatenation layer after the multi-head self-attention. Shouldn't there be one? #8

Open Domics10 opened 10 months ago