SivilTaram closed this issue 6 years ago
Hi lanwu, thanks for your reproduction of DecAtt! I'm confused about the linear_layer_intra in _transformation_input. Is it defined anywhere else? I'd appreciate it if you could help me with this, thanks a lot!
Hi Qian, the intra_attention part is for Section 3.4 (intra-sentence attention) of the original DecAtt paper, which is optional. In my experiments, I didn't use intra_attention.
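For reference, here is a rough sketch of what the optional intra-sentence attention step computes, as I understand Section 3.4: each token vector is concatenated with an attention-weighted sum over the tokens of the same sentence. This is not the code from this repository. The names `intra_attention`, `f_intra`, and `bias` are hypothetical, and `f_intra` stands in for whatever layer (such as linear_layer_intra) would project the tokens.

```python
import math

def intra_attention(a, f_intra, bias=None):
    """Return [a_i ; a'_i] for every token vector a_i in sentence `a`.

    a       : list of token vectors (lists of floats)
    f_intra : hypothetical projection applied to each token before scoring
    bias    : optional function of the offset i - j (distance-sensitive term)
    """
    n = len(a)
    dim = len(a[0])
    proj = [f_intra(v) for v in a]
    out = []
    for i in range(n):
        # Score token i against every token j of the same sentence.
        scores = []
        for j in range(n):
            s = sum(p * q for p, q in zip(proj[i], proj[j]))
            if bias is not None:
                s += bias(i - j)
            scores.append(s)
        # Softmax over j, shifted by the max for numerical stability.
        m = max(scores)
        weights = [math.exp(s - m) for s in scores]
        z = sum(weights)
        # Self-aligned vector a'_i: weighted sum of the original token vectors.
        ctx = [sum(weights[j] / z * a[j][k] for j in range(n)) for k in range(dim)]
        # Concatenate the original vector with its self-aligned vector.
        out.append(list(a[i]) + ctx)
    return out
```

If intra-attention is switched off, the rest of the model simply takes the original token vectors as input, which is why the layer can go unused.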
Yeah, I understand now, thanks for your response!