Closed by jendelel 5 years ago
Yes, this is a known difference. I've been updating the code quite a bit, and that's one of the changes made. I'm not ready to merge the new code into master just yet, but it's in the v0.2 branch that I just pushed. The SAC code has been tested. The only reason I'm not merging it into master yet is that I still need to do some small things, like updating the documentation.
Hi,
When you mention that the SAC code has been tested, do you mean that its performance matches that of the TensorFlow implementation?
Thanks!
Yes, when tested on ant, walker, hopper, humanoid, and half cheetah. This has now been merged into master.
Hi,
I skimmed over the author's implementation, and it seems that they don't use the value network; they only use the Q-networks. It seems they removed it in this commit.
Thanks,
Lukas
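For anyone reading along, here is a minimal sketch of how the Bellman target can be computed from the Q-networks alone, with no separate value network. This is not code from either repository: the function name `soft_q_target`, its arguments, and the default `alpha` and `gamma` values are hypothetical and chosen only for illustration. It assumes the next-state Q-values come from two target Q-networks and that `next_log_prob` is the log-probability of the action sampled from the current policy at the next state.

```python
import numpy as np

def soft_q_target(reward, done, next_q1, next_q2, next_log_prob,
                  alpha=0.2, gamma=0.99):
    """Bellman target for the Q-networks when no value network is used.

    All names and defaults here are illustrative assumptions.
    """
    # Soft state value estimated directly from the two target Q-networks:
    # V(s') ~= min(Q1(s', a'), Q2(s', a')) - alpha * log pi(a' | s')
    next_v = np.minimum(next_q1, next_q2) - alpha * next_log_prob
    # Standard one-step backup; terminal transitions do not bootstrap.
    return reward + gamma * (1.0 - done) * next_v

# Toy batch of two transitions; the second one is terminal.
target = soft_q_target(
    reward=np.array([1.0, 0.5]),
    done=np.array([0.0, 1.0]),
    next_q1=np.array([2.0, 3.0]),
    next_q2=np.array([1.5, 4.0]),
    next_log_prob=np.array([-1.0, -0.5]),
)
```

The term that a value network would otherwise predict is replaced by the clipped double-Q estimate minus the entropy term, so both Q-networks can regress onto the same `target`.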