rail-berkeley / rlkit

Collection of reinforcement learning algorithms
MIT License

Value network in TwinSAC #40

Closed: jendelel closed this issue 5 years ago

jendelel commented 5 years ago

Hi,

I skimmed over the author's implementation, and it seems that they don't use the value network. Instead, they use only the Q-networks. It seems they removed it in this commit.

Thanks,

Lukas

vitchyr commented 5 years ago

Yes, this is a known difference. I've been updating the code quite a bit, and that's one of the changes made. I'm not ready to merge the new code into master just yet, but it's in the v0.2 branch that I just pushed. The SAC code has been tested. The reason I'm not merging it into master yet is that I still need to do small things, like updating the documentation.
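For readers following along, here is a minimal sketch of the idea under discussion: computing the Q-network target without a separate value network. It is not taken from rlkit or the author's code. The function name `soft_q_target` and its parameters are hypothetical, and it works on plain scalars for illustration. It assumes the usual twin-Q formulation, where the soft state value is approximated by the minimum of the two target Q-estimates minus the entropy term.

```python
def soft_q_target(reward, done, next_q1, next_q2, next_log_pi,
                  alpha=0.2, gamma=0.99):
    """Bootstrapped target for both Q-networks, with no value network.

    Hypothetical helper for illustration only (scalar inputs).
    next_q1, next_q2: target Q-network estimates at (s', a'), a' ~ pi(.|s')
    next_log_pi:      log pi(a' | s') for that sampled action
    alpha:            entropy temperature (assumed value)
    gamma:            discount factor (assumed value)
    """
    # The soft value that a separate V-network would otherwise estimate:
    # take the smaller of the twin Q-estimates and subtract the entropy term.
    soft_value = min(next_q1, next_q2) - alpha * next_log_pi
    # Standard one-step backup; terminal transitions do not bootstrap.
    return reward + gamma * (1.0 - done) * soft_value


if __name__ == "__main__":
    print(soft_q_target(1.0, 0.0, 10.0, 8.0, -1.0))
    print(soft_q_target(1.0, 1.0, 10.0, 8.0, -1.0))
```

In a real implementation the same expression would be applied to batched tensors, with gradients blocked through the target.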

quanvuong commented 5 years ago

Hi,

When you say the SAC code has been tested, do you mean that its performance matches that of the TensorFlow implementation?

Thanks!

vitchyr commented 5 years ago

Yes, when tested on ant, walker, hopper, humanoid, and half cheetah. This has now been merged into master.