namjiwon1023 opened 10 months ago
Are you aware of any reference implementations? There are a couple of ways it can be done. The problem is that in PPO I reuse the hidden state from the previous step, but in SAC the replay buffer can contain very old sequences, so you cannot reuse the stored hidden state.
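To illustrate the staleness issue: one common workaround (as in R2D2-style recurrent replay) is to store short sequences in the buffer and recompute the hidden states from scratch at training time, rather than replaying hidden states saved at collection time. This is a minimal sketch assuming PyTorch, a GRU, and made-up dimensions; it is not the repo's actual implementation.

```python
import torch
import torch.nn as nn

# Hypothetical sizes for illustration only.
OBS_DIM, HIDDEN, BATCH, SEQ_LEN = 4, 8, 32, 10

gru = nn.GRU(input_size=OBS_DIM, hidden_size=HIDDEN, batch_first=True)

# Pretend this is a batch of observation sequences sampled from the replay
# buffer. Hidden states saved when these were collected are stale, because
# the network weights have changed since then.
obs_seq = torch.randn(BATCH, SEQ_LEN, OBS_DIM)

# Recompute hidden states fresh from a zero initial state instead of
# reusing the stored ones.
h0 = torch.zeros(1, BATCH, HIDDEN)
features, hT = gru(obs_seq, h0)  # features: (BATCH, SEQ_LEN, HIDDEN)
```

The cost is re-running the RNN over each sampled sequence, but it keeps the hidden states consistent with the current network parameters.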
I have looked at several people's work on adding RNNs to reinforcement learning algorithms, but strangely, almost everyone's implementation is different. So I would like to ask how you integrate an LSTM or GRU into the SAC algorithm.
In my implementation, I have incorporated an LSTM into both the actor and critic networks. The image below shows the LSTM added to the actor network.
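Since the image is not reproduced here, a rough sketch of what an LSTM actor for SAC typically looks like may help. This is a hypothetical minimal version (made-up class name and dimensions), assuming PyTorch and the usual SAC Gaussian-policy head with clamped log-std; it is not the poster's actual code.

```python
import torch
import torch.nn as nn

class RecurrentActor(nn.Module):
    """Hypothetical LSTM actor for SAC: LSTM features -> Gaussian policy head."""

    def __init__(self, obs_dim=4, act_dim=2, hidden=8):
        super().__init__()
        self.lstm = nn.LSTM(obs_dim, hidden, batch_first=True)
        self.mu = nn.Linear(hidden, act_dim)
        self.log_std = nn.Linear(hidden, act_dim)

    def forward(self, obs_seq, hidden_state=None):
        # obs_seq: (batch, seq_len, obs_dim); hidden_state is the (h, c) pair.
        feat, hidden_state = self.lstm(obs_seq, hidden_state)
        mu = self.mu(feat)
        # Clamp log-std to the bounds commonly used in SAC implementations.
        log_std = self.log_std(feat).clamp(-20, 2)
        return mu, log_std, hidden_state

actor = RecurrentActor()
obs = torch.randn(1, 1, 4)        # one timestep, batch of one
mu, log_std, h = actor(obs)       # h carries memory into the next step
```

At action-selection time the returned `(h, c)` pair is fed back in on the next step, which is where the hidden-state initialization questions below come in.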
During training, I initialize the hidden-state input of the LSTM.
I also re-initialize the hidden-state input when the environment is reset.
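In code, the reset described above usually amounts to zeroing the `(h, c)` pair at every episode boundary so no memory leaks across episodes. A minimal sketch, assuming PyTorch, a single-layer LSTM, and a made-up hidden size:

```python
import torch

HIDDEN, NUM_LAYERS = 8, 1  # hypothetical sizes for illustration

def init_hidden(batch_size=1):
    """Zero (h, c) pair for an LSTM, shaped (num_layers, batch, hidden)."""
    return (torch.zeros(NUM_LAYERS, batch_size, HIDDEN),
            torch.zeros(NUM_LAYERS, batch_size, HIDDEN))

# At env.reset(): start each episode from blank memory.
h = init_hidden()

# During the episode, the hidden state returned by the LSTM at step t
# would be passed back in at step t + 1 instead of re-zeroing it.
```

Whether the training-time hidden states should also start from zero, or be burned in over a stored sequence prefix, is exactly the design point that varies across the implementations the poster mentions.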
I would like to ask whether my approach is correct. How did you incorporate an RNN into SAC when you did it?
Thank you, I look forward to your reply.