sgtscu closed this issue 2 years ago
Hi there, thanks for the interest! In our experiments the decoder did not bring a notable improvement, so we removed it to make the model smaller. We will still send you the decoder version, though, to help with understanding the model.
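For a rough sense of why dropping the decoder shrinks the model, here is a minimal sketch that counts the learnable parameters of one standard Transformer encoder layer versus one decoder layer. It assumes the usual layer layout (multi-head attention with biases, a two-layer feed-forward block, and layer norms). The sizes `d_model=512` and `d_ff=2048` and all function names are illustrative assumptions, not values or code from this repo.

```python
def attention_params(d_model):
    # Q, K, V and output projections: four d_model x d_model weights, each with a bias.
    return 4 * (d_model * d_model + d_model)


def feed_forward_params(d_model, d_ff):
    # Two linear layers: d_model -> d_ff and d_ff -> d_model, each with a bias.
    return (d_model * d_ff + d_ff) + (d_ff * d_model + d_model)


def layer_norm_params(d_model):
    # One scale vector and one shift vector.
    return 2 * d_model


def encoder_layer_params(d_model, d_ff):
    # Self-attention + feed-forward + two layer norms.
    return (attention_params(d_model)
            + feed_forward_params(d_model, d_ff)
            + 2 * layer_norm_params(d_model))


def decoder_layer_params(d_model, d_ff):
    # Self-attention + cross-attention + feed-forward + three layer norms.
    return (2 * attention_params(d_model)
            + feed_forward_params(d_model, d_ff)
            + 3 * layer_norm_params(d_model))


if __name__ == "__main__":
    d_model, d_ff = 512, 2048  # assumed sizes, for illustration only
    enc = encoder_layer_params(d_model, d_ff)
    dec = decoder_layer_params(d_model, d_ff)
    print(f"encoder layer: {enc:,} parameters")
    print(f"decoder layer: {dec:,} parameters")
    print(f"extra cost of one decoder layer over one encoder layer: {dec - enc:,}")
```

Under these assumptions each decoder layer is larger than an encoder layer because of its extra cross-attention block, and an encoder-decoder model carries those layers on top of the encoder stack.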
Hi there, I found that TransformerDecoderLayer is not used anywhere in the code repo, and I wonder why the implementation differs from the Transformer Encoder-Decoder Classifier described in the paper. Thanks!!!