howardyclo / papernotes

My personal notes and surveys on DL, CV and NLP papers.

Deep Reinforcement Learning with a Natural Language Action Space #18

Open howardyclo opened 6 years ago

howardyclo commented 6 years ago

Metadata


Summary

This paper proposes the Deep Reinforcement Relevance Network (DRRN), a DQN-based model that learns to understand text from both the text-based state and the text-based actions (i.e., it learns the "relevance" between a state and an action) by playing text-based games. The objective is to navigate through the sequence of texts (input states and output actions) so as to maximize the long-term reward in the game, using Q-learning.
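To make the "relevance" idea concrete, here is a minimal sketch, not the paper's implementation. It assumes bag-of-words inputs, a single tanh layer per side, an inner product as the state-action relevance score, and softmax action selection; all function names, layer sizes, and the toy vocabulary are hypothetical, and the weights are random and untrained.

```python
import math
import random


def bag_of_words(text, vocab):
    """Count-based bag-of-words vector over a fixed vocabulary (assumed input format)."""
    tokens = text.lower().split()
    return [float(tokens.count(word)) for word in vocab]


def make_layer(n_in, n_out, rng):
    """Random weight matrix of shape (n_out, n_in); biases omitted for brevity."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]


def embed(x, layer):
    """One tanh layer mapping an input vector to a hidden embedding."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in layer]


def q_value(state_vec, action_vec, state_layer, action_layer):
    """Q(s, a) as the inner product of separately embedded state and action text."""
    h_s = embed(state_vec, state_layer)
    h_a = embed(action_vec, action_layer)
    return sum(s * a for s, a in zip(h_s, h_a))


def softmax_policy(q_values, temperature=1.0):
    """Turn Q-values into action probabilities (an assumed exploration scheme)."""
    q_max = max(q_values)  # subtract the max for numerical stability
    exps = [math.exp((q - q_max) / temperature) for q in q_values]
    total = sum(exps)
    return [e / total for e in exps]


def td_target(reward, next_q_values, gamma=0.9):
    """Q-learning target: r + gamma * max_a' Q(s', a'); just r at a terminal state."""
    if not next_q_values:
        return reward
    return reward + gamma * max(next_q_values)


# Toy example with a hypothetical six-word vocabulary shared by states and actions.
rng = random.Random(0)
vocab = ["door", "open", "run", "away", "monster", "the"]
state_layer = make_layer(len(vocab), 4, rng)   # state-side network
action_layer = make_layer(len(vocab), 4, rng)  # action-side network

state = "the monster blocks the door"
actions = ["open the door", "run away"]

state_vec = bag_of_words(state, vocab)
q_values = [
    q_value(state_vec, bag_of_words(a, vocab), state_layer, action_layer)
    for a in actions
]
probs = softmax_policy(q_values)
chosen_action = rng.choices(actions, weights=probs, k=1)[0]
```

Because each candidate action is embedded on its own and scored against the state, the number of actions can change from step to step without changing the network, which is what a natural-language action space requires.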

Text-based Games

Figure 3

Deep Reinforcement Relevance Network (DRRN)

Figure 1

Dataset Statistics

Table 1

Result

Table 4

Related Work

Narasimhan et al. use an LSTM to characterize the state space in a DQN framework for learning control policies for parser-based text games. They show that more complex sentence representations can give further improvements. However, this paper tried an LSTM on "Machine of Death" and saw no improvement. This may be due to the scale of the task, or because the embeddings need to be trained on a larger dataset.

Personal Thoughts

The vocabulary sizes of both the state space and the action space are relatively small compared with many NLP tasks. I wonder whether the model can still perform well when the vocabulary is large.

References