Denys88 / rl_games


RNN for Experience Replay implemented? #141

Closed 1tac11 closed 2 years ago

1tac11 commented 2 years ago

Hi there,

I was just wondering whether experience replay with RNNs is implemented correctly. The reason I ask is that in play_steps_rnn(), update_data() is called but not update_data_rnn(). Specifically, to replay experiences with an RNN, wouldn't a whole seq_len have to be replayed for the GRU or LSTM to produce correct results? Or is the current state of the GRU/LSTM cells also stored in the replay buffer at each step?

Or maybe I haven't fully understood these concepts in RL yet.

Kind regards

Denys88 commented 2 years ago

Hi @1seck, previously I had another approach (maybe in rl_games 1.2.0) where I needed a special update for the RNN. Right now I don't need it, and 'update_data_rnn' will be removed. Regarding the saved cells: imagine you play 32 steps every epoch and your seq_len is 8. That means I store 32 // 8 = 4 memory cells for each env, and I can process those sequences in parallel. If seq_len is equal to horizon_length, the memory is stored only once.
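
A minimal sketch of that idea (not the actual rl_games code): the hidden state is saved only at the start of every seq_len chunk, and at training time all chunks are replayed in one parallel forward pass from their saved cells. The sizes `num_envs`, `horizon_length`, `seq_len`, and `hidden_size` below are illustrative assumptions.

```python
import torch

# Hypothetical sizes for illustration only.
num_envs, horizon_length, seq_len, hidden_size = 2, 32, 8, 16
num_seqs = horizon_length // seq_len  # 32 // 8 = 4 stored cells per env

rnn = torch.nn.GRU(input_size=4, hidden_size=hidden_size)

obs_buffer = torch.randn(horizon_length, num_envs, 4)             # collected rollout
stored_hidden = torch.zeros(num_seqs, 1, num_envs, hidden_size)   # one cell per chunk

# Rollout: save the hidden state only at the start of every seq_len chunk.
h = torch.zeros(1, num_envs, hidden_size)
for t in range(horizon_length):
    if t % seq_len == 0:
        stored_hidden[t // seq_len] = h
    _, h = rnn(obs_buffer[t:t + 1], h)

# Training: replay all chunks in parallel, each starting from its saved cell.
obs_chunks = obs_buffer.view(num_seqs, seq_len, num_envs, 4)
obs_chunks = obs_chunks.permute(1, 0, 2, 3).reshape(seq_len, num_seqs * num_envs, 4)
init_hidden = stored_hidden.reshape(1, num_seqs * num_envs, hidden_size)
out, _ = rnn(obs_chunks, init_hidden)  # one forward pass over all chunks
```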

I recommend playing with the simple test RNN env to get a feel for how it works: https://github.com/Denys88/rl_games/blob/master/rl_games/envs/test/rnn_env.py. On the first frame I show the coordinates where you need to go, and then I remove this information. Here is an example config: https://github.com/Denys88/rl_games/blob/master/rl_games/configs/test/test_rnn_multidiscrete.yaml. You can play with the maximum distance, an additional reward, and other parameters.
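
A toy sketch of the idea behind that test env (hypothetical, not the actual rnn_env.py): the goal coordinates appear only in the first observation and are zeroed out afterwards, so the policy needs memory to reach the goal. The class name, action encoding, and parameters are assumptions for illustration.

```python
import numpy as np

class ToyMemoryEnv:
    """Hypothetical memory env: the goal is visible only on the first step."""

    def __init__(self, max_dist=2, max_steps=10):
        self.max_dist = max_dist
        self.max_steps = max_steps

    def reset(self):
        self.goal = np.random.randint(-self.max_dist, self.max_dist + 1, size=2)
        self.pos = np.zeros(2, dtype=np.int64)
        self.t = 0
        # Only the first observation contains the goal coordinates.
        return np.concatenate([self.pos, self.goal]).astype(np.float32)

    def step(self, action):
        # action in {0: +x, 1: -x, 2: +y, 3: -y}
        moves = [(1, 0), (-1, 0), (0, 1), (0, -1)]
        self.pos += np.array(moves[action])
        self.t += 1
        reached = np.array_equal(self.pos, self.goal)
        done = reached or self.t >= self.max_steps
        reward = 1.0 if reached else 0.0
        # The goal is hidden (zeroed) after the first frame.
        obs = np.concatenate([self.pos, np.zeros(2)]).astype(np.float32)
        return obs, reward, done, {}
```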

1tac11 commented 2 years ago

Thanks for the fast response, I will try that :)