[Open] Xanyv opened this issue 4 years ago
The code in tft_model.py has:

self.multihead_attn = nn.MultiheadAttention(self.hidden_size, self.attn_heads)

So it uses PyTorch's standard multi-head attention, not the Interpretable Multi-Head Attention from the TFT paper?
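For context on the difference being asked about: in the TFT paper, Interpretable Multi-Head Attention gives each head its own query/key projections but shares a single value projection across heads, and averages the head outputs instead of concatenating them, so that one attention pattern per position remains interpretable. A minimal sketch of that idea (class and variable names here are illustrative, not the repo's actual code):

```python
import torch
import torch.nn as nn


class InterpretableMultiHeadAttention(nn.Module):
    """Sketch of the TFT paper's interpretable multi-head attention:
    per-head query/key projections, a value projection shared across
    heads, and head outputs averaged (not concatenated) before the
    final linear layer."""

    def __init__(self, hidden_size: int, n_heads: int):
        super().__init__()
        self.n_heads = n_heads
        d_attn = hidden_size // n_heads
        self.q_projs = nn.ModuleList(
            nn.Linear(hidden_size, d_attn) for _ in range(n_heads))
        self.k_projs = nn.ModuleList(
            nn.Linear(hidden_size, d_attn) for _ in range(n_heads))
        self.v_proj = nn.Linear(hidden_size, d_attn)  # shared by all heads
        self.out_proj = nn.Linear(d_attn, hidden_size)

    def forward(self, q, k, v, mask=None):
        value = self.v_proj(v)  # one value space for every head
        head_outs, head_attns = [], []
        for i in range(self.n_heads):
            query, key = self.q_projs[i](q), self.k_projs[i](k)
            scores = query @ key.transpose(-2, -1) / key.shape[-1] ** 0.5
            if mask is not None:
                scores = scores.masked_fill(mask, float("-inf"))
            attn = torch.softmax(scores, dim=-1)
            head_outs.append(attn @ value)
            head_attns.append(attn)
        # Averaging across heads keeps a single attention map per position,
        # which is what makes the learned weights interpretable.
        out = torch.stack(head_outs).mean(dim=0)
        return self.out_proj(out), torch.stack(head_attns).mean(dim=0)
```

By contrast, `nn.MultiheadAttention` learns separate value projections per head and concatenates head outputs, which is the standard Transformer formulation the issue refers to.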