rlcode / reinforcement-learning

Minimal and Clean Reinforcement Learning Examples
MIT License
3.35k stars, 725 forks

Created Deep Recurrent Q-Network example #85

Open Douglas-Cho opened 5 years ago

Douglas-Cho commented 5 years ago

This shows how to implement a Deep Recurrent Q-Network (DRQN) model for the CartPole case. I had to expand the state input to include a few past states, which creates a meaningful sequential input stream for the Long Short-Term Memory (LSTM) model. With only the current state as input, it did not work. This sounds like it violates the Markov property assumption, but it does the job.
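To illustrate the idea of feeding the LSTM a short history instead of a single observation, here is a minimal sketch of a sliding window over past states. It is not the code from this pull request: the `StateWindow` class, the window length of 3, and the choice to pad the start of an episode by repeating the first state are all assumptions made for illustration, and the state values are dummy numbers standing in for CartPole's 4-dimensional observation.

```python
from collections import deque


class StateWindow:
    """Hypothetical helper: keeps the last `length` states, oldest first."""

    def __init__(self, length):
        if length < 1:
            raise ValueError("length must be at least 1")
        self.length = length
        self.frames = deque(maxlen=length)

    def reset(self, state):
        # Assumption: at the start of an episode there is no history yet,
        # so the window is padded by repeating the first state.
        self.frames.clear()
        for _ in range(self.length):
            self.frames.append(list(state))
        return self.sequence()

    def push(self, state):
        # deque(maxlen=...) drops the oldest entry once the window is full.
        self.frames.append(list(state))
        return self.sequence()

    def sequence(self):
        # Nested list of shape (length, state_dim), i.e. the
        # (timesteps, features) layout of one sample for a recurrent layer.
        return [list(frame) for frame in self.frames]


# Dummy 4-dimensional states in place of real CartPole observations.
window = StateWindow(length=3)
seq0 = window.reset([0.0, 0.1, 0.2, 0.3])
seq1 = window.push([1.0, 1.1, 1.2, 1.3])
```

In a training loop, `reset` would be called with the first observation of each episode and `push` with every later one, and the returned sequence (rather than the single current state) would be passed to the recurrent Q-network and stored in the replay memory.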