51616 closed 3 years ago
Hi, SAC can be used with an RNN as a representation model (or preprocessor model, as you called it) to tackle POMDP problems, just like recurrent DQN or PPO. But I am not sure whether SAC can be combined with attention units.
SAC, like most standard reinforcement learning algorithms, is derived for MDPs, which assume that a state fully represents the environment. In a POMDP, the RNN is only used to encode the observation history into an approximate state, so SAC is still trained on states, not raw observations. I do not think this technique violates the derivation of SAC, and it can indeed achieve good performance in POMDPs.
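To make the idea concrete, here is a rough sketch of the data flow only (forward pass, no training and no critic). It assumes a plain tanh RNN as the encoder and a tanh-squashed Gaussian policy head; the names `RecurrentEncoder` and `GaussianActor`, the dimensions, and the random weights are all made up for illustration and do not come from any particular SAC implementation.

```python
import math
import random


def make_matrix(rows, cols, rng, scale=0.1):
    # Small random weights; a real agent would learn these.
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


class RecurrentEncoder:
    """Hypothetical representation model: h_t = tanh(W_o o_t + W_h h_{t-1})."""

    def __init__(self, obs_dim, hidden_dim, rng):
        self.hidden_dim = hidden_dim
        self.w_obs = make_matrix(hidden_dim, obs_dim, rng)
        self.w_hid = make_matrix(hidden_dim, hidden_dim, rng)

    def initial_state(self):
        return [0.0] * self.hidden_dim

    def step(self, obs, h_prev):
        from_obs = matvec(self.w_obs, obs)
        from_hid = matvec(self.w_hid, h_prev)
        return [math.tanh(a + b) for a, b in zip(from_obs, from_hid)]


class GaussianActor:
    """Hypothetical SAC-style policy head that sees only the encoded state."""

    def __init__(self, state_dim, action_dim, rng):
        self.w_mean = make_matrix(action_dim, state_dim, rng)
        self.w_log_std = make_matrix(action_dim, state_dim, rng)

    def sample(self, state, rng):
        mean = matvec(self.w_mean, state)
        # Clamp log-std to a bounded range (the bounds here are an assumption).
        log_std = [max(-5.0, min(2.0, s)) for s in matvec(self.w_log_std, state)]
        # Tanh squashing keeps each action component in [-1, 1].
        return [
            math.tanh(m + math.exp(s) * rng.gauss(0.0, 1.0))
            for m, s in zip(mean, log_std)
        ]


rng = random.Random(0)
encoder = RecurrentEncoder(obs_dim=3, hidden_dim=4, rng=rng)
actor = GaussianActor(state_dim=4, action_dim=2, rng=rng)

# A short observation history from a partially observable environment.
observations = [[0.1, -0.2, 0.3], [0.0, 0.5, -0.1], [0.4, 0.4, 0.0]]

h = encoder.initial_state()
for obs in observations:
    h = encoder.step(obs, h)  # h summarizes everything observed so far

action = actor.sample(h, rng)  # the policy acts on the encoded state, not on obs
```

The point is that the actor (and, in a full agent, the critics) only ever receives `h`, so from SAC's side it looks like an ordinary state input.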
@BlueFisher Thanks for the quick response!
I'm wondering if SAC can be used with an RNN or attention to process a sequence of states and still work as expected. I have a few questions:
:)