Open codefish1990 opened 3 years ago
Question: Do the outputs of multihead_attention() and ff() need dropout?