-
## 🐛 Bug
The batch size is hardcoded when tracing a model that uses a custom for loop with `nn.LSTMCell`. This makes it impossible to run inference with a different batch size.
## To Reproduce
…
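A minimal sketch of the kind of code that can trigger this (the class name, sizes, and inputs are illustrative, not the original reproduction code): when the initial hidden state is built from `x.size(1)` inside a loop-based `forward`, `torch.jit.trace` may record that batch dimension as a constant taken from the example input.
```python
import torch
import torch.nn as nn

class LoopLSTM(nn.Module):
    def __init__(self, input_size=8, hidden_size=16):
        super().__init__()
        self.hidden_size = hidden_size
        self.cell = nn.LSTMCell(input_size, hidden_size)

    def forward(self, x):  # x: (seq_len, batch, input_size)
        # During tracing, x.size(1) is taken from the example input,
        # so the traced graph may treat the batch size as a constant.
        h = torch.zeros(x.size(1), self.hidden_size)
        c = torch.zeros(x.size(1), self.hidden_size)
        for t in range(x.size(0)):  # the Python loop is unrolled by the tracer
            h, c = self.cell(x[t], (h, c))
        return h

traced = torch.jit.trace(LoopLSTM(), torch.randn(5, 4, 8))  # traced with batch size 4
try:
    traced(torch.randn(5, 2, 8))  # a different batch size may fail or give wrong shapes
except RuntimeError as e:
    print(e)
```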
-
#### Review
**1. Training the model**
- Improved the model with reference to [DeepQA](https://github.com/Conchylicultor/DeepQA) ([trained on a Chinese corpus](https://github.com/qhduan/Seq2Seq_Chatbot_QA)), using the previously mentioned dgk_shooter_min.conv corpus
- Parameters used: lr=0.0003, trained for 5 epochs, b…
-
I don't quite understand the part where you replace max pooling with attention. Could you explain the specific steps of this operation?
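I'm not sure of the repository's exact code, but a common way to make this swap is to replace the max over time steps with a learned weighted sum: score each time step, turn the scores into weights with a softmax, and sum the hidden states with those weights. A minimal sketch (the layer names and sizes here are illustrative, not the repo's implementation):
```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AttentionPooling(nn.Module):
    """Replace max-over-time pooling with a learned weighted sum over time steps."""
    def __init__(self, hidden_size, attn_size=64):
        super().__init__()
        self.proj = nn.Linear(hidden_size, attn_size)
        self.score = nn.Linear(attn_size, 1, bias=False)

    def forward(self, h):                                 # h: (batch, seq_len, hidden_size)
        scores = self.score(torch.tanh(self.proj(h)))     # (batch, seq_len, 1)
        alpha = F.softmax(scores, dim=1)                   # attention weights over time
        return (alpha * h).sum(dim=1)                      # (batch, hidden_size)

h = torch.randn(2, 10, 128)
pooled_max = h.max(dim=1).values          # original max pooling over time
pooled_attn = AttentionPooling(128)(h)    # attention pooling in its place
```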
-
I can only get an F1 score of 0.5760; here is the classification report from the stdout log:
```
precision recall f1-score support
0 0.7017 0.8503 0.7689 1…
-
Hello,
Thanks a lot for providing an easy-to-understand tutorial and attention layer implementation.
I am trying to use attention on a dataset where the input and output lengths differ.
My training dat…
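For reference, encoder-decoder attention by itself does not require the input and output lengths to match: at every output step the decoder scores all encoder states, so only the shape of the attention-weight matrix changes. A minimal sketch with illustrative shapes (not the tutorial's own code):
```python
import torch
import torch.nn.functional as F

# Illustrative shapes only: encoder outputs of length T_in, decoder states of length T_out.
T_in, T_out, d = 12, 7, 32
enc = torch.randn(1, T_in, d)       # encoder outputs
dec = torch.randn(1, T_out, d)      # decoder hidden states

scores = dec @ enc.transpose(1, 2)  # (1, T_out, T_in): one score per encoder step
alpha = F.softmax(scores, dim=-1)   # weights over the input sequence
context = alpha @ enc               # (1, T_out, d): one context vector per output step
```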
-
Hi,
Thanks for your great work.
I have a question: what is the difference between FC and Att2all?
Thanks.
-
## In a nutshell
A study investigating which models are well suited to capturing the logical structure of sentences. The task is inferring relations between sentences, using a dataset deliberately synthesized so that it cannot be solved without capturing those relations. CNNs and Transformers did not produce very good results, while TreeLSTM and the proposed model (which makes its predictions via a latent representation named "World", which seems to represent the structure) performed well.
### Paper link
https://arxiv.o…
-
Hi Zafarali,
I am trying to use your attention network to learn seq2seq machine translation with attention. My source-language vocabulary is of size 32,000 and the target vocabulary size is 34,000. The following…
-
The authors [1] propose "fast weights", a type of attention mechanism over the recent past that performs multiple steps of computation between consecutive hidden-state updates in an RNN. The authors e…
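As a rough illustration (not the authors' code; the sizes, number of inner steps, and hyperparameters are arbitrary, and the layer normalization used in the paper is omitted): the fast weight matrix A decays and accumulates outer products of recent hidden states, and each of the S inner steps re-reads the recent past through A.
```python
import torch

def fast_weights_step(x_t, h_prev, A_prev, W_h, W_x, lam=0.95, eta=0.5, S=3):
    # Decay the fast weights and add the outer product of the latest hidden state.
    A = lam * A_prev + eta * torch.outer(h_prev, h_prev)
    # "Slow" contribution from the ordinary recurrent and input weights.
    slow = W_h @ h_prev + W_x @ x_t
    h_s = torch.relu(slow)
    for _ in range(S):  # inner steps that attend to the recent past via A
        h_s = torch.relu(slow + A @ h_s)
    return h_s, A

hidden_size, input_size = 16, 8
h = torch.zeros(hidden_size)
A = torch.zeros(hidden_size, hidden_size)
W_h = 0.1 * torch.randn(hidden_size, hidden_size)
W_x = 0.1 * torch.randn(hidden_size, input_size)
for x_t in torch.randn(5, input_size):  # run over a short sequence
    h, A = fast_weights_step(x_t, h, A, W_h, W_x)
```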
-
Hi,
I was trying to run `python a3c_main.py --evaluate 2 --load saved/pretrained_model` to run inference with the pre-trained model. However, I got the following dimension error without changing…