Denys88 / rl_games

RL implementations
MIT License

How to Correctly Integrate LSTM or GRU into the SAC Algorithm #270

Open namjiwon1023 opened 10 months ago

namjiwon1023 commented 10 months ago

I have referred to some people's work on adding RNNs to reinforcement learning algorithms, but strangely, almost everyone's code implementation is different. So I would like to ask how you integrate LSTM or GRU into the SAC algorithm.

In my implementation, I have incorporated an LSTM into both the actor and critic networks. The images below show the LSTM added to the actor network.

[screenshots: actor network with an LSTM layer]
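Since the screenshots are not reproduced here, a minimal sketch of what such an LSTM-based SAC actor might look like is shown below. All names (`LSTMActor`, `obs_dim`, `act_dim`, `hidden_dim`) are hypothetical and not taken from the screenshots; the Gaussian head with tanh squashing is the standard SAC formulation.

```python
import torch
import torch.nn as nn

class LSTMActor(nn.Module):
    """Hypothetical SAC actor with an LSTM between the observation encoder and the Gaussian head."""

    def __init__(self, obs_dim, act_dim, hidden_dim=256):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(obs_dim, hidden_dim), nn.ReLU())
        self.lstm = nn.LSTM(hidden_dim, hidden_dim, batch_first=True)
        self.mu = nn.Linear(hidden_dim, act_dim)
        self.log_std = nn.Linear(hidden_dim, act_dim)

    def forward(self, obs_seq, hidden=None):
        # obs_seq: (batch, seq_len, obs_dim); hidden: (h0, c0) tuple or None
        x = self.encoder(obs_seq)
        x, hidden = self.lstm(x, hidden)          # x: (batch, seq_len, hidden_dim)
        mu = self.mu(x)
        log_std = self.log_std(x).clamp(-20, 2)   # usual SAC log-std clamp
        dist = torch.distributions.Normal(mu, log_std.exp())
        raw_action = dist.rsample()               # reparameterized sample
        action = torch.tanh(raw_action)           # squash into [-1, 1]
        # tanh change-of-variables correction to the log-probability
        log_prob = dist.log_prob(raw_action) - torch.log(1 - action.pow(2) + 1e-6)
        return action, log_prob.sum(-1, keepdim=True), hidden
```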

And during training, I initialize the hidden state input of the LSTM.

[screenshots: hidden state initialization during training]
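A sketch of that initialization, assuming a zero-initialized `(h0, c0)` pair is created before unrolling each sampled sequence (the helper name `init_hidden` and the shapes are assumptions, not the code in the screenshots):

```python
import torch

def init_hidden(num_layers, batch_size, hidden_dim, device):
    # Zero (h0, c0) tuple for an nn.LSTM; each has shape (num_layers, batch, hidden_dim)
    h0 = torch.zeros(num_layers, batch_size, hidden_dim, device=device)
    c0 = torch.zeros(num_layers, batch_size, hidden_dim, device=device)
    return (h0, c0)

# At the start of each SAC update, before unrolling the actor/critic over a
# sampled sequence (hypothetical names):
# hidden = init_hidden(1, batch_size, 256, obs_seq.device)
# actions, log_probs, hidden = actor(obs_seq, hidden)
```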

I also initialize the input hidden state when the environment is reset. [screenshot: hidden state reset on environment reset]
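For context, a rollout loop that re-zeros the hidden state on reset might look like the sketch below. It assumes the `actor` and `init_hidden` sketches above and a Gymnasium-style `env.reset()` / `env.step()` API; none of this is taken from the screenshots.

```python
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

obs, _ = env.reset()
hidden = init_hidden(1, 1, 256, device)  # fresh zero hidden state at episode start
done = False
while not done:
    obs_t = torch.as_tensor(obs, dtype=torch.float32, device=device).view(1, 1, -1)
    with torch.no_grad():
        action, _, hidden = actor(obs_t, hidden)  # carry hidden state across steps
    obs, reward, terminated, truncated, _ = env.step(action.view(-1).cpu().numpy())
    done = terminated or truncated

# On the next env.reset() the hidden state is zeroed again, so no information
# leaks across episode boundaries.
```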

I would like to ask whether my approach is correct. How did you incorporate an RNN into SAC when you implemented it?

Thank you, I look forward to your reply.

Denys88 commented 10 months ago

Are you aware of any reference implementations? There are a couple of ways it can be done. The problem is that in PPO I reuse the hidden state from the previous step, but in SAC you can sample very old sequences from the replay buffer, so you cannot reuse the stored hidden state.
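One common workaround for this staleness problem (not necessarily what rl_games does) is an R2D2-style burn-in: start each replayed sequence from a zero hidden state, unroll the first few steps only to warm up the LSTM, and compute the SAC losses on the remaining steps. A minimal sketch, reusing the hypothetical `actor` signature from above:

```python
import torch

def unroll_with_burn_in(actor, obs_seq, burn_in=10):
    # obs_seq: (batch, seq_len, obs_dim), sampled as a contiguous sequence from replay
    hidden = None  # nn.LSTM treats None as a zero initial state
    with torch.no_grad():
        _, _, hidden = actor(obs_seq[:, :burn_in], hidden)  # warm-up steps, no gradients
    # Gradients flow only through the post-burn-in part of the sequence
    actions, log_probs, _ = actor(obs_seq[:, burn_in:], hidden)
    return actions, log_probs
```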