-
## In a nutshell
A study proposing regularization and optimization methods for LSTMs. Among the several techniques proposed, applying DropConnect to the weights on the recurrent connection (h_t-1) stands out: because the mask is applied to the weight matrix outside the cell, it works even with fast cells that do not support dropout, such as CuDNNLSTM, giving both speed and regularization. Clear gains were confirmed on both PTB and WikiText-2.
![image](https://user-images.g…
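A minimal PyTorch sketch of the idea described above — DropConnect applied to the hidden-to-hidden weight matrix from outside the cell, so the cell itself needs no dropout support. The wrapper class and its names are illustrative assumptions; only `weight_hh_l0` is the actual attribute name used by `nn.LSTM`.

```python
import torch
import torch.nn as nn


class WeightDropLSTM(nn.Module):
    """Illustrative sketch (not the paper's exact implementation):
    DropConnect on the recurrent weights of a stock nn.LSTM."""

    def __init__(self, input_size: int, hidden_size: int, p: float = 0.5):
        super().__init__()
        self.lstm = nn.LSTM(input_size, hidden_size, batch_first=True)
        self.p = p
        # Keep the raw recurrent weight as our own Parameter and remove it
        # from the inner LSTM, so a masked copy can be swapped in per forward.
        raw = self.lstm.weight_hh_l0
        del self.lstm._parameters["weight_hh_l0"]
        self.weight_hh_raw = nn.Parameter(raw.data)

    def forward(self, x, state=None):
        # DropConnect: mask entries of the weight matrix itself, then run the
        # unmodified cell -- the cell never needs to know dropout happened.
        w = nn.functional.dropout(
            self.weight_hh_raw.clone(), p=self.p, training=self.training
        )
        self.lstm.weight_hh_l0 = w  # RNNBase.__setattr__ updates _flat_weights
        return self.lstm(x, state)
```

Because the masking happens before the forward pass, the same trick composes with any fused or opaque recurrent kernel, which is the point the summary makes about CuDNNLSTM.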
-
## In a nutshell
A study that pushes a plain LSTM language model as far as tuning will take it. The two main ideas are (1) applying DropConnect to the recurrent connections, and (2) NT-ASGD: train with SGD, but switch to ASGD (which averages the iterates over a window of updates) once a periodic validation check shows the metric has stopped improving.
### Paper link
https://arxiv.org/abs/1708.02182…
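The NT-ASGD trigger described above can be sketched roughly as follows. The helper function and its defaults are hypothetical; `torch.optim.SGD` and `torch.optim.ASGD` are real optimizers, and `t0=0` makes ASGD begin averaging immediately.

```python
import torch


def make_nt_asgd_stepper(params, lr: float = 30.0, n: int = 5):
    """Return a callback to invoke after each validation check.

    Runs plain SGD until the validation loss is worse than the best loss
    seen more than `n` checks ago (the non-monotonic criterion), then
    switches to averaged SGD for the rest of training.
    """
    params = list(params)
    state = {"opt": torch.optim.SGD(params, lr=lr), "logs": [], "switched": False}

    def on_validation(val_loss: float):
        logs = state["logs"]
        if (not state["switched"]
                and len(logs) > n
                and val_loss > min(logs[:-n])):
            # Trigger: no improvement over the best of the older checks.
            state["opt"] = torch.optim.ASGD(params, lr=lr, t0=0)
            state["switched"] = True
        logs.append(val_loss)
        return state["opt"]

    return on_validation
```

The training loop would call `opt = on_validation(val_loss)` after each periodic evaluation and keep stepping whatever optimizer comes back.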
-
I observed that the training loss (for both the long_los and icu_mortality tasks in the [MIMIC IV tutorial](https://meds-torch.readthedocs.io/en/latest/tutorial/)) is unstable and non-converging for supervised …
-
No matter the size of the LSTM model, converting it with float16 optimization runs out of memory.
**Code to reproduce the issue**
[The code snippet to reproduce the issue on Google Colab](https://…
-
@tdhock this is the outline of my new paper, can you give me some feedback?
# Learning Penalty Parameters for Optimal Partitioning via Automatic Feature Extraction
## Abstract
Changepoint detec…
-
## 🐛 Bug
## To Reproduce
Steps to reproduce the behavior:
1. Build a PyTorch model with an LSTM module in Python, and save the script module after tracing it with torch.jit.trace. Python code …
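The elided snippet might look roughly like the following (the model shape, sizes, and file name are made up for illustration; `torch.jit.trace` and `ScriptModule.save` are the real APIs):

```python
import torch
import torch.nn as nn


class Net(nn.Module):
    """Toy model containing an LSTM, to be traced and saved."""

    def __init__(self):
        super().__init__()
        self.lstm = nn.LSTM(8, 16, batch_first=True)
        self.fc = nn.Linear(16, 2)

    def forward(self, x):
        out, _ = self.lstm(x)
        return self.fc(out[:, -1])  # classify from the last time step


model = Net().eval()
example = torch.randn(1, 4, 8)          # example input fixes the traced shapes
traced = torch.jit.trace(model, example)
traced.save("lstm_traced.pt")           # script module stored for later loading
```

Note that tracing records one concrete execution path; for recurrent models with data-dependent control flow, `torch.jit.script` is often the safer choice.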
-
**System information**
- Have I written custom code (as opposed to using a stock example script provided in TensorFlow.js):
- OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Windows 11
…
-
Something I've been thinking about with the expansion of the library: a decent amount of the work we've been doing involves applying inductive biases and teacher-prompted training to the model architecture.…
-
Hi,
are there any plans to add cuDNN-accelerated versions of LSTM and GRU to the PyTorch backend? Without cuDNN acceleration, the LSTM and GRU are considerably (several times) slower, even when run…
-
I found that TF2 does not support this feature.