Why can't memory networks read effectively?

https://arxiv.org/abs/1910.07350

https://paperswithcode.com/paper/why-cant-memory-networks-read-effectively

Simon Šuster, Madhumita Sushil, Walter Daelemans

University of Antwerp

Memory networks have been a popular choice among neural architectures for machine reading comprehension and question answering. While recent work revealed that memory networks can't truly perform multi-hop reasoning, we show in the present paper that vanilla memory networks are ineffective even in single-hop reading comprehension. We analyze the reasons for this on two cloze-style datasets, one from the medical domain and another including children's fiction. We find that the output classification layer with entity-specific weights, and the aggregation of passage information with relatively flat attention distributions are the most important contributors to poor results. We propose network adaptations that can serve as simple remedies. We also find that the presence of unseen answers at test time can dramatically affect the reported results, so we suggest controlling for this factor during evaluation.
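The last point of the abstract, controlling for unseen answers during evaluation, can be illustrated with a small sketch. This is not the authors' evaluation code: the function name `accuracy_by_answer_novelty`, the input format, and the toy data are all assumptions made for illustration. It reports accuracy separately for test answers that did and did not occur as answers in the training set.

```python
from collections import defaultdict


def accuracy_by_answer_novelty(train_answers, test_examples):
    """Split test accuracy by whether the gold answer occurred in training.

    train_answers: iterable of gold answer strings from the training set.
    test_examples: iterable of (gold_answer, predicted_answer) pairs.
    Returns a dict mapping "seen" / "unseen" to accuracy for that bucket;
    a bucket with no test examples is omitted.
    """
    seen = set(train_answers)
    correct = defaultdict(int)
    total = defaultdict(int)
    for gold, pred in test_examples:
        bucket = "seen" if gold in seen else "unseen"
        total[bucket] += 1
        correct[bucket] += int(gold == pred)
    return {bucket: correct[bucket] / total[bucket] for bucket in total}


# Hypothetical toy data (not from the paper): "sepsis" never appears as a
# training answer, so a model limited to training answers cannot predict it.
train_answers = ["aspirin", "fever"]
test_examples = [
    ("aspirin", "aspirin"),
    ("fever", "fever"),
    ("sepsis", "aspirin"),
]
result = accuracy_by_answer_novelty(train_answers, test_examples)
print(result)
```

Reporting the two buckets side by side makes it visible when an aggregate score is driven mostly by how many test answers happen to be unseen, which is the factor the abstract suggests controlling for.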

