Created Deep Recurrent Q-Network example

This shows how to implement a Deep Recurrent Q-Network (DRQN) model for the CartPole task. I had to expand the state input to include a few past states, which creates a meaningful sequential input stream for the Long Short-Term Memory (LSTM) model. With only the current state as input, it did not work. This may sound like it violates the Markov property assumption, but it does the job.
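
For readers who want to see the idea of stacking past states, here is a minimal sketch. It is not the code from this pull request: the `StateHistory` class, the window length of 4, and padding the window with the first observation are all assumptions made for illustration. It only shows how a rolling window of recent observations can be turned into a `(seq_len, state_dim)` sequence that an LSTM layer could consume.

```python
from collections import deque


class StateHistory:
    """Hypothetical helper: keeps the last `seq_len` observations as one sequence."""

    def __init__(self, seq_len):
        self.seq_len = seq_len
        self.buffer = deque(maxlen=seq_len)

    def reset(self, first_state):
        # At the start of an episode there is no past yet, so the window is
        # filled with copies of the first observation (an assumed padding choice).
        self.buffer.clear()
        for _ in range(self.seq_len):
            self.buffer.append(list(first_state))
        return self.sequence()

    def push(self, state):
        # deque(maxlen=...) drops the oldest observation automatically.
        self.buffer.append(list(state))
        return self.sequence()

    def sequence(self):
        # Oldest observation first, newest last: shape (seq_len, state_dim).
        return [list(s) for s in self.buffer]


# A CartPole observation has 4 values; the numbers below are made up.
history = StateHistory(seq_len=4)
seq = history.reset([0.0, 0.0, 0.0, 0.0])
seq = history.push([0.1, 0.2, 0.3, 0.4])
print(len(seq), len(seq[0]))
```

In a training loop, `reset` would be called with the first observation of each episode and `push` with each new observation, and the returned sequence would be what gets stored in replay memory and fed to the network instead of the single current state.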

rlcode / reinforcement-learning

Created Deep Recurrent Q-Network example #85