Denys88 / rl_games

RL implementations
MIT License

How to Correctly Integrate LSTM or GRU into the SAC Algorithm #270

Open namjiwon1023 opened 5 months ago

namjiwon1023 commented 5 months ago

I have looked at several people's work on adding RNNs to reinforcement learning algorithms, but strangely, almost every implementation is different. So I would like to ask how you integrate an LSTM or GRU into the SAC algorithm.

In my implementation, I have incorporated LSTM into both the actor and critic networks. The image below shows the LSTM added to the actor network.

[image]

And during training, I initialize the hidden state input of the LSTM.

[image]

I also initialize the input hidden state when the environment is reset.

[image]
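Since the screenshots are not reproduced here, the setup described above can be sketched roughly as follows. This is only a minimal sketch assuming PyTorch, not the code from the images: `RecurrentActor`, `initial_state`, the layer sizes, and the random stand-in observations are all hypothetical and chosen for illustration.

```python
import torch
import torch.nn as nn


class RecurrentActor(nn.Module):
    """Hypothetical Gaussian SAC actor with an LSTM between the encoder and the heads."""

    def __init__(self, obs_dim, act_dim, hidden_dim=64):
        super().__init__()
        self.hidden_dim = hidden_dim
        self.encoder = nn.Linear(obs_dim, hidden_dim)
        self.lstm = nn.LSTM(hidden_dim, hidden_dim, batch_first=True)
        self.mean = nn.Linear(hidden_dim, act_dim)
        self.log_std = nn.Linear(hidden_dim, act_dim)

    def initial_state(self, batch_size=1):
        # Zero (h, c), each shaped (num_layers, batch, hidden_dim).
        shape = (1, batch_size, self.hidden_dim)
        return torch.zeros(shape), torch.zeros(shape)

    def forward(self, obs_seq, state):
        # obs_seq has shape (batch, time, obs_dim).
        x = torch.relu(self.encoder(obs_seq))
        x, state = self.lstm(x, state)
        mean = self.mean(x)
        log_std = self.log_std(x).clamp(-20.0, 2.0)
        return mean, log_std, state


actor = RecurrentActor(obs_dim=3, act_dim=1)

# Rollout: start from a zero state at every environment reset,
# then carry the returned state from one step to the next.
state = actor.initial_state(batch_size=1)
with torch.no_grad():
    for _ in range(5):
        obs = torch.randn(1, 1, 3)  # stand-in for an environment observation
        mean, log_std, state = actor(obs, state)
        action = torch.tanh(mean + log_std.exp() * torch.randn_like(mean))
```

The critic can be given the same encoder, LSTM, and head layout, with the action concatenated to the observation before the encoder.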

I would like to ask whether this way of adding the LSTM is correct. How did you incorporate an RNN into SAC when you did it?

Thank you, I look forward to your reply.

Denys88 commented 5 months ago

Are you aware of any reference implementations? There are a couple of ways it can be done. The problem is that in PPO I reuse the old hidden state from the previous step, but in SAC the replay buffer can contain very old sequences, so you cannot reuse the stored hidden state.
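One possible way around the stale hidden state, and not necessarily what rl_games does, is to store whole episodes, sample fixed-length windows, start each window from a zero hidden state, and use the first few steps only to warm the state up (the "burn-in" idea from the R2D2 paper). A minimal sketch of the sampling side in plain Python follows; `EpisodeReplay`, `burn_in`, and `train_len` are hypothetical names, and integers stand in for real transitions.

```python
import random


class EpisodeReplay:
    """Hypothetical replay buffer that stores whole episodes and samples windows."""

    def __init__(self, capacity=1000, seed=None):
        self.capacity = capacity
        self.episodes = []
        self.rng = random.Random(seed)

    def add_episode(self, transitions):
        # transitions: per-step records, e.g. (obs, action, reward, next_obs, done).
        self.episodes.append(list(transitions))
        if len(self.episodes) > self.capacity:
            self.episodes.pop(0)  # drop the oldest episode

    def sample(self, batch_size, burn_in, train_len):
        """Return (burn_in_part, train_part) pairs of consecutive transitions."""
        window = burn_in + train_len
        eligible = [ep for ep in self.episodes if len(ep) >= window]
        if not eligible:
            raise ValueError("no stored episode is long enough for the window")
        batch = []
        for _ in range(batch_size):
            ep = self.rng.choice(eligible)
            start = self.rng.randint(0, len(ep) - window)  # bounds are inclusive
            seq = ep[start:start + window]
            batch.append((seq[:burn_in], seq[burn_in:]))
        return batch


buffer = EpisodeReplay(seed=0)
buffer.add_episode(range(20))  # dummy episode of 20 "transitions"
batch = buffer.sample(batch_size=4, burn_in=3, train_len=5)
```

In the update step, the actor and critic would be run over the burn-in part without gradients, starting from a zero state, and the SAC losses would be computed only on the training part using the state produced by the burn-in.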