In looking at blogs online, I see a lot about dynamic memory networks. Why do you guys never use them? https://www.quora.com/What-are-dynamic-memory-networks
I know we used them in deep_qa, the reading comprehension-focused predecessor to AllenNLP, but I don't think they did very well. @matt-gardner and @DeNeutoy can probably speak in more detail?
I'm not a huge fan of the models called "memory networks": in general they are tuned too heavily to a completely artificial task, and they don't work well on real data. I implemented the "end-to-end memory network" (E2EMN), for instance, and it has three separate embedding layers, which is absolutely absurd if you want to apply it to real data (see the sketch below). @DeNeutoy implemented the DMN+; it's not as egregious as the E2EMN, but still.

When deciding what methods actually work, I'd look at actual papers, not blogs. E.g., are there any memory networks on the SQuAD leaderboard? On the TriviaQA leaderboard? On the leaderboard of any recent, popular dataset?

To be fair, more recent "memory networks" have modified their architectures so that they're a lot more similar to things like the gated-attention reader, which has actually performed well on real data. But it sure seems like no one is using them to achieve state-of-the-art QA on real data these days.
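For concreteness, here's a minimal single-hop sketch of what I mean, assuming PyTorch (the class and variable names are just illustrative, and it uses the simple bag-of-words encoding from the paper). The thing to notice is the three separate embedding matrices over the same vocabulary: one to match memories against the query, one for the query itself, and one for the memory outputs.

```python
# Illustrative single-hop end-to-end memory network, assuming PyTorch.
# Names (SingleHopMemN2N, embed_a/b/c) are hypothetical, not from AllenNLP.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SingleHopMemN2N(nn.Module):
    def __init__(self, vocab_size: int, embedding_dim: int):
        super().__init__()
        # Three *separate* embedding matrices over the same vocabulary:
        # A embeds the memories for matching against the query,
        # B embeds the query, and C embeds the memories again for the output.
        self.embed_a = nn.Embedding(vocab_size, embedding_dim)
        self.embed_b = nn.Embedding(vocab_size, embedding_dim)
        self.embed_c = nn.Embedding(vocab_size, embedding_dim)
        self.output = nn.Linear(embedding_dim, vocab_size)

    def forward(self, memories: torch.Tensor, query: torch.Tensor) -> torch.Tensor:
        # memories: (batch, num_memories, memory_len); query: (batch, query_len)
        m = self.embed_a(memories).sum(dim=2)   # (batch, num_memories, dim)
        u = self.embed_b(query).sum(dim=1)      # (batch, dim)
        c = self.embed_c(memories).sum(dim=2)   # (batch, num_memories, dim)
        # Attention over memories, then a weighted sum of the output embeddings.
        p = F.softmax(torch.bmm(m, u.unsqueeze(2)).squeeze(2), dim=1)
        o = torch.bmm(p.unsqueeze(1), c).squeeze(1)  # (batch, dim)
        return self.output(o + u)  # distribution over candidate answers
```

The multi-hop version stacks several of these, with either per-hop embedding matrices (even more parameters) or weights tied across hops, which is part of why it's so awkward to apply to real data with a large vocabulary.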