Linusp opened this issue 7 years ago
[Example of claiming a paper] I'm claiming this paper.
I've read the instructions at https://github.com/swarma/papernotes. @Linusp, does your previous comment https://github.com/swarma/papernotes/issues/1#issuecomment-284202389 mean that you intend to read this paper and share your reading notes?
@pimgeek Yes. I'm revising the participation guide to make it friendlier. Could you help me test a feature?
Please check whether you can click "Assignees" in the upper right corner and select yourself.
No, I can't. It's grayed out and unclickable.
@pimgeek OK, thanks. I'll try adding you to the project; could you check again after that?
@pimgeek I've added you to the "论文阅读小组" (Paper Reading Group) team and given that team read/write access to this repository. Could you try again?
It works now.
@pimgeek 👍
Dataset:
Model:
Experiments:
(Images can be uploaded by drag and drop)
Conclusions:
Author
Publication date
2015
Abstract
Recurrent Neural Networks (RNNs), and specifically a variant with Long Short-Term Memory (LSTM), are enjoying renewed interest as a result of successful applications in a wide range of machine learning problems that involve sequential data. However, while LSTMs provide exceptional results in practice, the source of their performance and their limitations remain rather poorly understood. Using character-level language models as an interpretable testbed, we aim to bridge this gap by providing an analysis of their representations, predictions and error types. In particular, our experiments reveal the existence of interpretable cells that keep track of long-range dependencies such as line lengths, quotes and brackets. Moreover, our comparative analysis with finite horizon n-gram models traces the source of the LSTM improvements to long-range structural dependencies. Finally, we provide analysis of the remaining errors and suggest areas for further study.
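The "interpretable testbed" mentioned in the abstract is a character-level language model. As a rough orientation only, here is a minimal sketch of such a model in PyTorch; the architecture and hyperparameters are illustrative assumptions, not the paper's exact setup.

```python
# Minimal character-level LSTM language model sketch (PyTorch).
# Hidden size, number of layers, and sequence lengths are illustrative
# assumptions, not the configuration used in the paper.
import torch
import torch.nn as nn

class CharLSTM(nn.Module):
    def __init__(self, vocab_size: int, hidden_size: int = 128, num_layers: int = 2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden_size)
        self.lstm = nn.LSTM(hidden_size, hidden_size, num_layers, batch_first=True)
        self.out = nn.Linear(hidden_size, vocab_size)

    def forward(self, x, state=None):
        # x: (batch, seq_len) of character indices
        emb = self.embed(x)
        h, state = self.lstm(emb, state)   # h: (batch, seq_len, hidden_size)
        logits = self.out(h)               # next-character logits at each position
        return logits, state

# Toy usage: predict the next character at every position of a random batch.
vocab_size = 100
model = CharLSTM(vocab_size)
x = torch.randint(0, vocab_size, (4, 32))  # 4 sequences of 32 characters
logits, _ = model(x)
loss = nn.functional.cross_entropy(
    logits[:, :-1].reshape(-1, vocab_size),  # predictions for positions 0..30
    x[:, 1:].reshape(-1),                    # targets shifted by one character
)
```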
Why it is recommended
LSTM's excellent performance is widely recognized, but its internal mechanisms are still insufficiently studied. This paper trains RNN, LSTM, and GRU models on language modeling, compares how they differ on the same task, gives a detailed analysis of the internal modules of LSTM/GRU, and classifies the errors an LSTM still makes. It is a very good reference for understanding the inner workings of LSTM/GRU. A rough sketch of this kind of internal inspection follows below.
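As a hedged illustration of the kind of inspection described above, the sketch below records per-character cell-state activations of a trained character-level LSTM (such as the sketch after the abstract) and ranks cells by how strongly they correlate with a simple structural feature, e.g. being inside double quotes. The helper names, the `char_to_idx` mapping, and the correlation check are assumptions for illustration, not the paper's actual code or methodology.

```python
# Inspecting individual LSTM cells: feed text through a trained model,
# collect the top-layer cell state after each character, and look for
# cells whose activation tracks a structural property of the text.
import torch

def cell_activations(model, text, char_to_idx):
    """Return the top-layer LSTM cell state after each character of `text`."""
    idx = torch.tensor([[char_to_idx[c] for c in text]])  # (1, len(text))
    states, state = [], None
    with torch.no_grad():
        for t in range(idx.size(1)):
            _, state = model(idx[:, t:t + 1], state)  # one character at a time
            h_n, c_n = state
            states.append(c_n[-1, 0].clone())         # cell state of top layer
    return torch.stack(states)                         # (len(text), hidden_size)

def inside_quotes_mask(text):
    """1.0 while inside double quotes, 0.0 otherwise (toy structural feature)."""
    inside, mask = False, []
    for c in text:
        mask.append(1.0 if inside else 0.0)
        if c == '"':
            inside = not inside
    return torch.tensor(mask)

# Example usage (assumes `model` and `char_to_idx` from a trained setup):
# acts = cell_activations(model, text, char_to_idx)
# feat = inside_quotes_mask(text)
# corr = torch.tensor([torch.corrcoef(torch.stack([acts[:, i], feat]))[0, 1]
#                      for i in range(acts.size(1))])
# print(corr.abs().topk(5))  # cells most correlated with the quote feature
```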