Kyubyong / transformer
A TensorFlow Implementation of the Transformer: Attention Is All You Need
Apache License 2.0 · 4.28k stars · 1.3k forks
Why 6 blocks of multi-head attention in the encoder and decoder? #129
Open
xus-stack opened this issue 5 years ago

xus-stack commented 5 years ago
Has this been shown to improve performance?
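For context, the "6" from the paper is just a stacking hyperparameter: the encoder (and decoder) block is repeated N times in a loop, and N is tunable. Below is a minimal sketch of such a stack using standard Keras layers rather than this repo's own modules; names like `num_blocks`, `encoder_block`, and `build_encoder` are illustrative assumptions, not necessarily this repo's API.

```python
import tensorflow as tf

def encoder_block(x, num_heads=8, d_model=512, d_ff=2048):
    # One encoder block: multi-head self-attention plus a position-wise
    # feed-forward network, each with a residual connection and layer norm.
    attn = tf.keras.layers.MultiHeadAttention(
        num_heads=num_heads, key_dim=d_model // num_heads)(x, x)
    x = tf.keras.layers.LayerNormalization()(x + attn)
    ff = tf.keras.layers.Dense(d_ff, activation="relu")(x)
    ff = tf.keras.layers.Dense(d_model)(ff)
    return tf.keras.layers.LayerNormalization()(x + ff)

def build_encoder(num_blocks=6, seq_len=32, d_model=512):
    # The paper's choice of 6 blocks is simply this loop count; nothing in
    # the architecture forces it, so it can be varied and measured.
    inputs = tf.keras.Input(shape=(seq_len, d_model))
    x = inputs
    for _ in range(num_blocks):
        x = encoder_block(x, d_model=d_model)
    return tf.keras.Model(inputs, x)

model = build_encoder(num_blocks=6)
model.summary()
```

Whether 6 is optimal depends on the task and data size; the original paper chose it empirically, and smaller or larger stacks trade capacity against training cost.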