MishaLaskin / rad

RAD: Reinforcement Learning with Augmented Data

hyperparameters for state SAC #11

Closed by kyonofx 3 years ago

kyonofx commented 3 years ago

Thank you for the wonderful work and repo!

Could you share your hyperparameters or your reference for state SAC in dm_control and OpenAI Gym environments? Thank you very much!

HosseinSheikhi commented 3 years ago

Would you please share the repo of the pure SAC implementation you are using? I can't find a ready-to-go, pure SAC implementation compatible with dm_control.

kyonofx commented 3 years ago

You can use this repo and set --encoder_type to something other than 'pixel'.
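A minimal sketch of how a flag like this might switch between pixel and state observations, assuming an argparse-based training script. The helper names (`parse_args`, `select_obs_shape`), the default values, and the example state dimension of 24 are hypothetical and are not copied from the repo:

```python
import argparse


def parse_args(argv=None):
    """Parse a reduced, illustrative set of training flags."""
    parser = argparse.ArgumentParser()
    # Flag name taken from the comment above; the defaults are assumptions.
    parser.add_argument('--encoder_type', default='pixel', type=str)
    parser.add_argument('--frame_stack', default=3, type=int)
    parser.add_argument('--image_size', default=84, type=int)
    return parser.parse_args(argv)


def select_obs_shape(args, state_shape):
    """Return the observation shape the agent would be built with."""
    if args.encoder_type == 'pixel':
        # Stacked RGB frames: 3 channels per frame.
        return (3 * args.frame_stack, args.image_size, args.image_size)
    # Any other encoder type falls through to the raw state vector.
    return tuple(state_shape)


pixel_args = parse_args([])
state_args = parse_args(['--encoder_type', 'identity'])

pixel_shape = select_obs_shape(pixel_args, (24,))
state_shape = select_obs_shape(state_args, (24,))
print(pixel_shape)
print(state_shape)
```

With these assumed defaults, the first call yields an image-shaped observation and the second yields the flat state shape.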

HosseinSheikhi commented 3 years ago

Changing the encoder type away from 'pixel' leads to creating state-based agents: https://github.com/MishaLaskin/rad/blob/1246bfd6e716669126e12c1f02f393801e1692c1/train.py#L197

By the way, the encoder type can be either 'pixel' or 'identity': https://github.com/MishaLaskin/rad/blob/1246bfd6e716669126e12c1f02f393801e1692c1/encoder.py#L125

I am not quite sure, but maybe eliminating this line will lead to plain SAC: https://github.com/MishaLaskin/rad/blob/1246bfd6e716669126e12c1f02f393801e1692c1/curl_sac.py#L362 Any idea?

MishaLaskin commented 3 years ago

Yes, you need to change the encoder type to 'identity'.
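As a rough illustration of what choosing 'identity' implies, here is a sketch of an encoder registry in which the identity encoder passes state vectors through unchanged. This is plain Python written for illustration, not the repo's actual code; the class bodies, the `AVAILABLE_ENCODERS` name, and the `make_encoder` signature are assumptions:

```python
class PixelEncoder:
    """Placeholder for a convolutional encoder over image observations."""

    def __init__(self, obs_shape, feature_dim):
        self.feature_dim = feature_dim

    def __call__(self, obs):
        raise NotImplementedError("conv layers are omitted in this sketch")


class IdentityEncoder:
    """Returns the state observation as-is, so the agent sees raw state."""

    def __init__(self, obs_shape, feature_dim):
        # State observations are assumed to be flat vectors.
        assert len(obs_shape) == 1
        # The feature size is the state size; feature_dim is ignored.
        self.feature_dim = obs_shape[0]

    def __call__(self, obs):
        return obs


# Hypothetical registry mapping the flag value to an encoder class.
AVAILABLE_ENCODERS = {'pixel': PixelEncoder, 'identity': IdentityEncoder}


def make_encoder(encoder_type, obs_shape, feature_dim):
    if encoder_type not in AVAILABLE_ENCODERS:
        raise ValueError(f"unknown encoder type: {encoder_type!r}")
    return AVAILABLE_ENCODERS[encoder_type](obs_shape, feature_dim)


encoder = make_encoder('identity', (24,), feature_dim=50)
state = [0.1] * 24
features = encoder(state)
```

Under these assumptions, the actor and critic receive the state vector directly, which is what makes the agent behave like state-based SAC.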
