[Question] LSTM and SAC - Am I understanding the docs correctly?

DJT777 commented 1 month ago

❓ Question

Hello!

I was reading the SAC docs here https://stable-baselines3.readthedocs.io/en/master/modules/sac.html#notes and they note that SAC does not accept recurrent policies. If I implement my own custom network for SAC, will it fail to run if I use an LSTM to encode the observations before a forward pass to something like an MLP-based critic or actor?

Basically just wondering if any use of an LSTM at all in a custom policy network would be supported or not. The docs linked above indicate that it is not, and I want to make sure.

Is that saying the SAC implementation won't support ANY recurrent neural networks, or only that it doesn't support the recurrent policies already available in the library?

Checklist

[X] I have checked that there is no similar issue in the repo
[X] I have read the documentation
[X] If code there is, it is minimal and working
[X] If code there is, it is formatted using the markdown code blocks for both code and stack traces.

DJT777 commented 1 month ago

@araffin If the question is a duplicate, can you point me to the discussion about implementing an LSTM in SAC?

araffin commented 1 month ago

> Basically just wondering if any use of an LSTM at all in a custom policy network would be supported or not.

No LSTM at all is currently supported; you would need to fork SB3 (see the related issues for a starting point).
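For concreteness, the architecture described in the question (an LSTM that encodes observations, followed by an MLP head) can be sketched in plain PyTorch as below. This is only an illustration of the idea, not SB3 code: the class name `LSTMEncoderActor`, the layer sizes, and the tensor shapes are all assumptions made for the example, and per the reply above it cannot be used with SB3's SAC without forking the library.

```python
import torch
import torch.nn as nn


class LSTMEncoderActor(nn.Module):
    """Hypothetical module: LSTM encoder over an observation sequence, then an MLP head."""

    def __init__(self, obs_dim: int, action_dim: int, hidden_dim: int = 64):
        super().__init__()
        # batch_first=True means inputs are shaped (batch, seq_len, obs_dim)
        self.lstm = nn.LSTM(input_size=obs_dim, hidden_size=hidden_dim, batch_first=True)
        # MLP head mapping the encoded sequence to a bounded action
        self.mlp = nn.Sequential(
            nn.Linear(hidden_dim, 64),
            nn.ReLU(),
            nn.Linear(64, action_dim),
            nn.Tanh(),  # squash actions into [-1, 1]
        )

    def forward(self, obs_seq, state=None):
        # state is the (h, c) tuple from a previous call, or None to start from zeros
        out, new_state = self.lstm(obs_seq, state)
        last = out[:, -1, :]  # encoding at the final time step
        return self.mlp(last), new_state


if __name__ == "__main__":
    torch.manual_seed(0)
    actor = LSTMEncoderActor(obs_dim=8, action_dim=2)
    obs_seq = torch.randn(4, 10, 8)  # 4 sequences of 10 observations each
    action, (h, c) = actor(obs_seq)
    print(action.shape, h.shape)
```

The returned `(h, c)` tuple is the recurrent part: any algorithm using such a module has to carry that state between environment steps and account for it during training, which is the support the reply says SB3 does not currently provide for SAC.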

DJT777 commented 1 month ago

@araffin awesome, thank you!

DLR-RM / stable-baselines3

[Question] LSTM and SAC - Am I understanding the docs correctly? #1924
