Closed: wuyaoxuehun closed this issue 2 years ago
Thanks for pointing this out! The mention in our paper may not fully illustrate the actual implementation. Adding the [CLS] hidden states to every step has two considerations:
- Act as the beginning-of-sentence input embedding
- Compensate for the missing source memory

Because here we mainly pre-train h_i rather than the decoder itself, this way is more efficient in pre-training.
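As a minimal sketch of the idea discussed here (not the repository's actual code; the function name and tensor shapes are assumptions), adding the encoder's [CLS] hidden state to every decoder input embedding is a simple broadcast addition:

```python
import numpy as np

def add_cls_to_decoder_inputs(cls_hidden, token_embeddings):
    """Broadcast the encoder's [CLS] hidden state onto every
    decoder input embedding.

    cls_hidden:       (batch, hidden)      -- h_i from the encoder
    token_embeddings: (batch, seq, hidden) -- decoder input embeddings
    """
    # Reshaping to (batch, 1, hidden) broadcasts across the sequence
    # dimension, so every decoding step receives the same [CLS] signal.
    return token_embeddings + cls_hidden[:, None, :]

emb = np.zeros((2, 4, 8))   # toy decoder inputs
cls = np.ones((2, 8))       # toy [CLS] states
out = add_cls_to_decoder_inputs(cls, emb)
# out has shape (2, 4, 8); each step now carries the [CLS] signal
```

This makes the [CLS] state act both as a beginning-of-sentence signal and as a stand-in for the missing cross-attention source memory, since it conditions every step rather than only the first one.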
Fair enough. Nice work. Many thanks for your reply!
https://github.com/Tribleave/SCAPT-ABSA/blob/5f341fd811af62e7c0c8c8417c3a89f45179d663/model/module/misc.py#L27
You add the encoder's [CLS] hidden states to every input word embedding as the transformer decoder's input, which may not be consistent with your claim that "h_i acts as a beginning-of-sentence input embedding in the decoding process to control the whole generation". Thanks for your reply in advance!