-
I want to use an attention RNN with this library.
Are there any examples or guides for building an attention RNN? (see the sketch below)
D-X-Y updated
7 years ago
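A minimal sketch of the usual pattern, assuming PyTorch and an additive attention layer that pools Bi-LSTM outputs before a classifier. All class and parameter names here are hypothetical and not taken from this library:

```
import torch
import torch.nn as nn

class AttentionRNN(nn.Module):
    def __init__(self, vocab_size, embed_dim=100, hidden_dim=128, num_classes=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.bilstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True,
                              bidirectional=True)
        self.attn = nn.Linear(2 * hidden_dim, 1)       # one score per time step
        self.fc = nn.Linear(2 * hidden_dim, num_classes)

    def forward(self, x):                               # x: (batch, seq_len) token ids
        out, _ = self.bilstm(self.embed(x))             # (batch, seq_len, 2*hidden)
        scores = self.attn(out).squeeze(-1)             # (batch, seq_len)
        weights = torch.softmax(scores, dim=-1)         # attention distribution
        context = (weights.unsqueeze(-1) * out).sum(1)  # weighted sum over time
        return self.fc(context)

model = AttentionRNN(vocab_size=10000)
logits = model(torch.randint(1, 10000, (4, 20)))        # (4, num_classes)
```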
-
Why does everyone's code mask the padding tokens only when computing the attention weights, while the padding tokens are left untouched both when the sequence is fed into the Bi-LSTM and in the final pooling layer?
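For reference, a sketch (my own illustration, not the code being asked about) of handling padding in the other two places as well, assuming batch-first PyTorch tensors: sequences are packed before the Bi-LSTM and the pooling is masked:

```
import torch
import torch.nn as nn
from torch.nn.utils.rnn import pack_padded_sequence, pad_packed_sequence

def encode_and_pool(bilstm, embedded, lengths, mask):
    # embedded: (batch, max_len, dim), lengths: (batch,), mask: (batch, max_len) bool
    packed = pack_padded_sequence(embedded, lengths.cpu(),
                                  batch_first=True, enforce_sorted=False)
    out, _ = bilstm(packed)                      # padding never enters the LSTM cell
    out, _ = pad_packed_sequence(out, batch_first=True)

    # masked mean pooling: padded positions contribute nothing
    out = out.masked_fill(~mask.unsqueeze(-1), 0.0)
    pooled = out.sum(dim=1) / lengths.unsqueeze(-1).clamp(min=1)
    return pooled

bilstm = nn.LSTM(100, 128, batch_first=True, bidirectional=True)
embedded = torch.randn(4, 20, 100)
lengths = torch.tensor([20, 15, 9, 3])
mask = torch.arange(20).unsqueeze(0) < lengths.unsqueeze(1)
pooled = encode_and_pool(bilstm, embedded, lengths, mask)   # (4, 256)
```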
-
https://arxiv.org/abs/1805.02474
It tries to solve the problems of the bidirectional LSTM (Bi-LSTM) by keeping a parallel state for each word
# 1. Introduction
Bi-directional LSTMs themselves have already become mainstream
Mainstream enough that they once achieved SOTA in language modeling…
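The parallel-state idea could be sketched roughly as follows. This is a deliberately simplified illustration of the paper's data flow, not its actual cell: the real S-LSTM uses several LSTM-style gates and dedicated padding states, all omitted here:

```
import torch
import torch.nn as nn

class TinySLSTM(nn.Module):
    def __init__(self, dim, steps=3):
        super().__init__()
        self.steps = steps
        self.word_update = nn.Linear(4 * dim, dim)   # [left, self, right, sentence]
        self.sent_update = nn.Linear(2 * dim, dim)   # [mean of words, old sentence]

    def forward(self, x):                            # x: (batch, seq_len, dim)
        h = x
        g = h.mean(dim=1)                            # initial sentence state
        for _ in range(self.steps):
            left = torch.roll(h, shifts=1, dims=1)   # neighbor states (circular shift
            right = torch.roll(h, shifts=-1, dims=1) # for brevity; the paper pads)
            ctx = torch.cat([left, h, right,
                             g.unsqueeze(1).expand_as(h)], dim=-1)
            h = torch.tanh(self.word_update(ctx))    # all words updated in parallel
            g = torch.tanh(self.sent_update(torch.cat([h.mean(dim=1), g], dim=-1)))
        return h, g

h, g = TinySLSTM(dim=64)(torch.randn(2, 10, 64))     # (2, 10, 64), (2, 64)
```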
-
I want to understand how you used attention for the NER task. Is there any paper or article that explains this? Thanks
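For illustration only (this is not the repository's actual code), one common pattern is a per-token attention layer between the Bi-LSTM and the tag classifier, where each token attends over all encoder outputs:

```
import torch
import torch.nn as nn

class TokenAttentionTagger(nn.Module):
    def __init__(self, hidden_dim=128, num_tags=9):
        super().__init__()
        self.bilstm = nn.LSTM(100, hidden_dim, batch_first=True, bidirectional=True)
        d = 2 * hidden_dim
        self.query = nn.Linear(d, d)
        self.tag_scores = nn.Linear(2 * d, num_tags)

    def forward(self, embedded):                          # (batch, seq_len, 100)
        out, _ = self.bilstm(embedded)                    # (batch, seq_len, d)
        scores = self.query(out) @ out.transpose(1, 2)    # (batch, seq_len, seq_len)
        weights = torch.softmax(scores, dim=-1)
        context = weights @ out                           # per-token context vectors
        return self.tag_scores(torch.cat([out, context], dim=-1))

logits = TokenAttentionTagger()(torch.randn(2, 12, 100))  # (2, 12, num_tags)
```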
-
**original code:**
```
def forward(self, document_batch: torch.Tensor, device='cpu', bert_batch_size=0):
    bert_output = torch.zeros(size=(document_batch.shape[0],
    …
```
-
A question about kmax_pooling with RNNs: is it still widely used these days? If this step is skipped, can one simply take the last time step directly at `out = self.bilstm(embed)[0].permute(1, 2, 0)`? (a comparison sketch follows below)
atnlp updated
5 years ago
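A comparison sketch of the two options, assuming the same sequence-first layout as the snippet in the question (hypothetical code, not the project's):

```
import torch
import torch.nn as nn

def kmax_pooling(x, dim, k):
    # keep the k largest activations along `dim`, preserving their original order
    index = x.topk(k, dim=dim)[1].sort(dim=dim)[0]
    return x.gather(dim, index)

bilstm = nn.LSTM(100, 128, bidirectional=True)        # seq-first, like the snippet
embed = torch.randn(20, 4, 100)                       # (seq_len, batch, dim)
out = bilstm(embed)[0].permute(1, 2, 0)               # (batch, 2*hidden, seq_len)

pooled = kmax_pooling(out, dim=2, k=2).flatten(1)     # (batch, 2*hidden*k)
last_step = out[:, :, -1]                             # (batch, 2*hidden)
# Note: at the last time step the backward direction has only seen the final
# token, so many implementations concatenate the final hidden states h_n of
# both directions instead of slicing the output at t = -1.
```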
-
Hi, this is MJ. I've been attempting to load my custom-trained model (which I know outputs melodies, and whose training works, based on my manual method) so that it responds as in the MagentaJS tutorial. When I run throug…
-
With a hierarchical network (word-level + sentence-level embeddings, most likely an LSTM/RNN architecture), we can potentially overfit on a single sample by attempting to generate the summary …
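As a generic sanity check of that idea, a single-sample overfitting loop might look like the following. The model and data here are placeholders, not the hierarchical summarizer itself; if the loss does not drive toward zero on one example, the architecture or loss wiring is suspect:

```
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(300, 128), nn.ReLU(), nn.Linear(128, 10))
x = torch.randn(1, 300)                    # one document representation
y = torch.tensor([3])                      # one target
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

for step in range(500):                    # repeatedly fit the same sample
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    optimizer.step()
print(loss.item())                         # should approach 0.0
```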
-
It fails with a KeyError:
"KeyError: num_residual_layers"
Here is my script:
```
python -m nmt.nmt \
    --src=en --tgt=de \
    --vocab_prefix=${DATA_DIR}/vocab \
    --train_prefix=${DATA_DIR}/…
```
-
1. Is there attention for the Intent Detection task in your code? You seem to just add a fully-connected layer after the Bi-LSTM, which confused me (a small contrast sketch follows below).
2. Why is the final intent vector encoder_final_state_h…
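For illustration (hypothetical code, not the repository's), the two intent-vector choices being contrasted look roughly like this:

```
import torch
import torch.nn as nn

def final_state_intent(h_n):                # h_n: (2*num_layers, batch, hidden)
    # concatenate the top layer's final forward and backward states
    return torch.cat([h_n[-2], h_n[-1]], dim=-1)

def attention_intent(outputs, attn_proj):   # outputs: (batch, seq_len, 2*hidden)
    # attention-weighted sum of all encoder outputs
    weights = torch.softmax(attn_proj(outputs).squeeze(-1), dim=-1)
    return (weights.unsqueeze(-1) * outputs).sum(dim=1)

h_n = torch.randn(2, 4, 128)                # single-layer Bi-LSTM final states
outputs = torch.randn(4, 20, 256)           # Bi-LSTM outputs
attn_proj = nn.Linear(256, 1)

intent_a = final_state_intent(h_n)                  # (4, 256)
intent_b = attention_intent(outputs, attn_proj)     # (4, 256)
```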