BlueFisher / Advanced-Soft-Actor-Critic

Soft Actor-Critic with advanced features

Is this the first repo to implement RNN-based SAC? #3

51616 closed this issue 3 years ago

51616 commented 3 years ago

I'm wondering whether SAC can be used with an RNN or attention to process a sequence of states and still work as expected. I have a few questions:

  1. Do you have any results using the RNN as the preprocessor model?
  2. Does using an RNN violate any assumption in the derivation of SAC?

:)

BlueFisher commented 3 years ago

Hi, SAC can be used with an RNN as a representation model (or preprocessor model, as you put it) to tackle POMDP problems, just like recurrent DQN or PPO. However, I am not sure whether SAC can be combined with attention units.

SAC, like most standard reinforcement learning algorithms, is based on MDPs, which assume that a state fully represents the environment. In POMDPs, the RNN is only used to encode the sequence of observations into an approximate state, so SAC is still trained on states, not raw observations. I do not think this technique violates the derivation of SAC, and it can indeed achieve good performance in POMDPs.
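
To make the idea concrete, here is a minimal sketch of a recurrent encoder in front of a SAC-style policy head. It is not the code in this repository: the names `RecurrentEncoder`, `GaussianPolicy`, and `rollout` are hypothetical, the encoder is a plain Elman-style RNN rather than a GRU or LSTM, the weights are random and untrained, and only the Python standard library is used. It only shows that the policy consumes the hidden state `h` (the approximate state) instead of the raw observation.

```python
import math
import random


def make_matrix(rows, cols, rng, scale=0.1):
    # Small random weights; a real model would learn these by gradient descent.
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


class RecurrentEncoder:
    """Elman-style RNN: h_t = tanh(W_x o_t + W_h h_{t-1}). Hypothetical name."""

    def __init__(self, obs_dim, hidden_dim, rng):
        self.hidden_dim = hidden_dim
        self.w_x = make_matrix(hidden_dim, obs_dim, rng)
        self.w_h = make_matrix(hidden_dim, hidden_dim, rng)

    def initial_state(self):
        return [0.0] * self.hidden_dim

    def step(self, obs, h_prev):
        a = matvec(self.w_x, obs)
        b = matvec(self.w_h, h_prev)
        return [math.tanh(x + y) for x, y in zip(a, b)]


class GaussianPolicy:
    """Tanh-squashed Gaussian policy head that acts on the encoded state."""

    def __init__(self, state_dim, action_dim, rng):
        self.w_mu = make_matrix(action_dim, state_dim, rng)
        self.w_log_std = make_matrix(action_dim, state_dim, rng)

    def sample(self, state, rng):
        mu = matvec(self.w_mu, state)
        # Clamp log-std to a fixed range (an assumed, commonly used safeguard).
        log_std = [max(-5.0, min(2.0, v)) for v in matvec(self.w_log_std, state)]
        action, log_prob = [], 0.0
        for m, ls in zip(mu, log_std):
            eps = rng.gauss(0.0, 1.0)
            u = m + math.exp(ls) * eps  # reparameterized Gaussian sample
            a = math.tanh(u)            # squash into (-1, 1)
            # Gaussian log-density of u, then the tanh change-of-variables term.
            log_prob += -0.5 * eps * eps - ls - 0.5 * math.log(2.0 * math.pi)
            log_prob -= math.log(1.0 - a * a + 1e-6)
            action.append(a)
        return action, log_prob


def rollout(encoder, policy, observations, rng):
    """Feed observations through the RNN; the policy only ever sees h."""
    h = encoder.initial_state()
    steps = []
    for obs in observations:
        h = encoder.step(obs, h)
        action, log_prob = policy.sample(h, rng)
        steps.append((list(h), action, log_prob))
    return steps
```

Calling `rollout` with the same observation twice in a row yields two different hidden states, because the second one also depends on the first. That dependence on history is what lets the hidden state stand in for the unobserved state of the POMDP.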

51616 commented 3 years ago

@BlueFisher Thanks for the quick response!