jendelel / rhl-algs

Experiments for Human in Reinforcement Learning
Apache License 2.0

Reproducibility of SAC for HalfCheetah #2

Open tldoan opened 5 years ago

tldoan commented 5 years ago

Hi,

thanks for releasing the SAC code. I was wondering whether you could reproduce the SAC results for HalfCheetah-v2 (a return of around 10,000 at roughly 1M steps). I also used the code from this GitHub repo, https://github.com/pranz24/pytorch-soft-actor-critic (with the value network removed), but so far my curves don't show the same pattern of reaching 10,000.
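For context, here is a minimal sketch of what "with the value network removed" refers to in the newer SAC formulation: the Bellman target is built directly from the twin target Q-networks plus the entropy term, rather than from a separately trained V network. The names below (policy, q1_target, q2_target, alpha) are illustrative assumptions, not identifiers from either repository.

```python
# Illustrative sketch (PyTorch) of the SAC critic target without a value network.
# All names here are assumptions for the example, not code from either repo.
import torch

def q_target_no_value_net(reward, next_obs, done, policy, q1_target, q2_target,
                          gamma=0.99, alpha=0.2):
    with torch.no_grad():
        # Sample the next action and its log-probability from the current policy.
        next_action, next_log_prob = policy.sample(next_obs)
        # Clipped double-Q: take the element-wise minimum of the two target critics.
        q_next = torch.min(q1_target(next_obs, next_action),
                           q2_target(next_obs, next_action))
        # Soft Bellman backup: subtract the entropy term alpha * log pi.
        return reward + gamma * (1.0 - done) * (q_next - alpha * next_log_prob)
```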

Thank you very much

jendelel commented 5 years ago

Hi, thanks a lot for the interest in my repository. I reimplemented all the algorithms to get a better understanding of them. Unfortunately, I also noticed that after some steps, HalfCheetah does not perform as well as it does with the original authors' repository.

I am sorry, I don't have the time right now to find the bug in the code. I suggest using the authors' repository, https://github.com/rail-berkeley/softlearning. It's very well structured. I prefer PyTorch over TensorFlow but didn't want to rewrite their whole repository. If you insist on PyTorch, they recommend https://github.com/vitchyr/rlkit, which is also maintained by one of the Berkeley students.

Lastly, today I read on Twitter about lagom, https://github.com/zuoxingdong/lagom. The learning curves looked correct, and it also has an SAC implementation in PyTorch.

Sorry I can't help you more at this moment.

Lukas

tldoan commented 5 years ago

Hi,

Sorry, my bad, it works! I tried the code from this branch: https://github.com/pranz24/pytorch-soft-actor-critic/tree/old

(I saw your little chat.)

It hits 10,000 at around 500,000 steps, both with and without the V network.
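For anyone else checking reproduction, here is a quick sketch of how one might find that crossing point from a self-logged evaluation curve. The CSV format and file name are assumptions for the example; neither repository writes out this exact file.

```python
# Illustrative only: assumes you log evaluation returns yourself as CSV rows
# of "step,eval_return" with no header line.
import csv

def first_step_above(log_path, threshold=10_000.0):
    """Return the first environment step whose logged eval return reaches the threshold."""
    with open(log_path) as f:
        for row in csv.reader(f):
            step, ret = int(row[0]), float(row[1])
            if ret >= threshold:
                return step
    return None

print(first_step_above("halfcheetah_eval.csv"))  # e.g. roughly 500,000 on a good run
```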

Thanks!
