Closed: junhuang-ifast closed this issue 4 years ago
Firstly, I would like to know why you want to use state-action pairs for DQN. DQN is a method for problems with small, discrete action spaces, so it is designed to predict the values of all actions from the input state. Your approach (state-action input) is usually employed for problems with continuous action spaces, where it is intractable to predict the values of all state-action pairs.
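To make the architectural difference concrete, here is a minimal sketch (layer sizes and dimensions are arbitrary placeholders, not from the repo's code) contrasting a DQN-style critic with a SAC-style state-action critic:

```python
import torch
import torch.nn as nn

STATE_DIM, N_ACTIONS, ACTION_DIM = 4, 2, 1  # toy sizes for illustration

# DQN-style critic: state in, one value per discrete action out.
q_dqn = nn.Sequential(nn.Linear(STATE_DIM, 32), nn.ReLU(), nn.Linear(32, N_ACTIONS))

# SAC-style critic: (state, action) in, a single scalar value out.
q_sa = nn.Sequential(nn.Linear(STATE_DIM + ACTION_DIM, 32), nn.ReLU(), nn.Linear(32, 1))

state = torch.randn(1, STATE_DIM)
action = torch.randn(1, ACTION_DIM)

all_values = q_dqn(state)                            # shape (1, N_ACTIONS): argmax over actions is one forward pass
one_value = q_sa(torch.cat([state, action], dim=1))  # shape (1, 1): one forward pass per candidate action
```

The first network gives you every action's value at once, which is what makes greedy selection cheap in DQN; the second only scores the specific action you feed in.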
Hi, I was thinking of incorporating the action (in addition to the state) as a state-action pair input to the Rainbow DQN model, but I am unsure where to insert it. The code below shows 4 places where I am considering adding the actions (as input to the model), but I am unsure whether it is appropriate to add them there or not. (Please see the "<----" symbols.)
I have seen state-action pairs as input to the Q function of soft actor-critic before, but not in DQN, so I am unsure whether it is logical to do this, especially in

```python
self.dqn.dist(state_EDIT)
```

and

```python
selected_action = self.dqn(torch.FloatTensor(state_EDIT).to(self.device)).argmax()
```

Any ideas on this? Thanks :)
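One practical consequence worth noting: with a state-action critic, a line like `selected_action = self.dqn(...).argmax()` no longer works directly, because the network returns a single scalar rather than a value per action. A hypothetical sketch (not the repo's Rainbow code; network and sizes are made up for illustration) of what greedy selection would then look like:

```python
import torch
import torch.nn as nn

STATE_DIM, N_ACTIONS = 4, 2  # toy sizes for illustration

# Hypothetical state-action critic: input is state concatenated with a scalar action id.
q_sa = nn.Sequential(nn.Linear(STATE_DIM + 1, 32), nn.ReLU(), nn.Linear(32, 1))

state = torch.randn(1, STATE_DIM)

# Greedy selection now needs one forward pass per discrete action:
values = torch.cat(
    [q_sa(torch.cat([state, torch.full((1, 1), float(a))], dim=1)) for a in range(N_ACTIONS)],
    dim=1,
)
selected_action = values.argmax()
```

This is exactly the overhead the state-per-input design of DQN avoids, which is why state-action critics are usually reserved for continuous-action methods like SAC.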