howardyclo opened this issue 6 years ago
This paper proposes the Deep Reinforcement Relevance Network (DRRN, DQN-based), which learns to understand text from both the text-based state and the text-based action (i.e., it learns the "relevance" between state and action) by playing a text-based game. The objective is to navigate through a sequence of texts (input state, output action) so as to maximize the long-term reward in the game, trained with Q-learning.
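The core idea above can be sketched as follows: the state text and each candidate action text are embedded by *separate* networks, and Q(s, a) is the inner product ("relevance") of the two embeddings; the greedy policy picks the candidate action with the highest relevance. This is only a toy illustration, assuming a bag-of-words encoder and randomly initialised linear layers (all names, sizes, and the tiny vocabulary are hypothetical, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)

VOCAB = ["you", "open", "door", "key", "take", "north", "go", "the"]
EMBED_DIM = 4

def bow(text):
    """Bag-of-words vector over the toy vocabulary."""
    v = np.zeros(len(VOCAB))
    for w in text.lower().split():
        if w in VOCAB:
            v[VOCAB.index(w)] += 1.0
    return v

# Separate (randomly initialised) encoders for state text and action text.
W_state = rng.normal(size=(EMBED_DIM, len(VOCAB)))
W_action = rng.normal(size=(EMBED_DIM, len(VOCAB)))

def q_value(state_text, action_text):
    """Q(s, a) = relevance = inner product of state and action embeddings."""
    s = np.tanh(W_state @ bow(state_text))
    a = np.tanh(W_action @ bow(action_text))
    return float(s @ a)

def act(state_text, candidate_actions):
    """Greedy policy: pick the action text with the highest relevance."""
    qs = [q_value(state_text, a) for a in candidate_actions]
    return candidate_actions[int(np.argmax(qs))]

state = "you see the door"
actions = ["open the door", "go north", "take key"]
best = act(state, actions)
```

In the actual paper the encoders are trained with the usual Q-learning TD objective; the point of the sketch is only the pairwise state-action scoring, which lets the network handle action *texts* rather than a fixed discrete action set.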
Narasimhan et al. use an LSTM to represent the state in a DQN framework for learning control policies in parser-based text games, and show that a more complex sentence representation can give further improvements. However, this paper tried an LSTM on "Machine of Death" and saw no improvement. This may be due to the scale of the task, or the embeddings may need to be trained on a larger dataset.
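For contrast with the bag-of-words encoder, the LSTM alternative mentioned above can be sketched like this: run a recurrent cell over the state tokens and use the final hidden state as the state representation fed to the Q-network. Everything here (vocabulary, sizes, random weights) is a toy assumption, not the authors' implementation:

```python
import numpy as np

rng = np.random.default_rng(1)
EMBED_DIM, HIDDEN = 4, 6
VOCAB = {"you": 0, "see": 1, "the": 2, "door": 3}

E = rng.normal(scale=0.1, size=(len(VOCAB), EMBED_DIM))        # word embeddings
W = rng.normal(scale=0.1, size=(4 * HIDDEN, EMBED_DIM + HIDDEN))  # gate weights
b = np.zeros(4 * HIDDEN)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_encode(tokens):
    """Run a single-layer LSTM over the tokens; the final hidden state
    serves as the sentence representation for the Q-network."""
    h = np.zeros(HIDDEN)
    c = np.zeros(HIDDEN)
    for tok in tokens:
        x = E[VOCAB[tok]]
        z = W @ np.concatenate([x, h]) + b
        i, f, o, g = np.split(z, 4)
        i, f, o, g = sigmoid(i), sigmoid(f), sigmoid(o), np.tanh(g)
        c = f * c + i * g          # update cell state
        h = o * np.tanh(c)         # emit hidden state
    return h

state_vec = lstm_encode("you see the door".split())
```

Because the LSTM is order-sensitive, it can in principle capture more of the sentence structure than a bag of words; the paper's observation is that on "Machine of Death" this extra capacity did not translate into better scores.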
The vocabulary sizes of both the state space and the action space are relatively small compared with many NLP tasks. I wonder whether the approach would still perform well with a large vocabulary.