UoA-CARES/cares_reinforcement_learning
CARES Reinforcement Learning Package
Dev/update sac to the paper #112
Closed
qiaoting159753 closed 11 months ago
qiaoting159753 commented 11 months ago
Adjust the hyperparameters to match the values reported in the SAC papers [https://arxiv.org/pdf/1812.05905.pdf] and [https://proceedings.mlr.press/v80/haarnoja18b/haarnoja18b.pdf].
The Actor has been changed significantly, following Appendix C of [https://arxiv.org/pdf/1812.05905.pdf]: it now uses a tanh transform. This appears to boost the agent's performance; empirically, it improves rewards in most environments.
The code was taken from the [PyTorch Benchmark]:
https://github.com/pytorch/benchmark/blob/7de2aeda4a8f62bd8d6777d9ce3f2962ccb6d1d1/torchbenchmark/models/soft_actor_critic/nets.py#L242
Using a tanh transform this way is common practice. It is taught in Berkeley's RL course CS285.
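For reviewers unfamiliar with the tanh transform, here is a minimal standard-library sketch of the idea. It is not the code in this PR or in the linked `nets.py`. It assumes a diagonal Gaussian policy, and the function names (`sample_squashed_action`, `tanh_log_det_jacobian`) are hypothetical. The pre-squash sample `u` is passed through `tanh` to bound the action, and the log-probability is corrected by subtracting `log(1 - tanh(u)^2)` per dimension, computed in a numerically stable form.

```python
import math
import random


def softplus(x: float) -> float:
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def gaussian_log_prob(u: float, mean: float, std: float) -> float:
    """Log-density of a 1-D Gaussian N(mean, std^2) evaluated at u."""
    return (-0.5 * ((u - mean) / std) ** 2
            - math.log(std)
            - 0.5 * math.log(2.0 * math.pi))


def tanh_log_det_jacobian(u: float) -> float:
    """log(1 - tanh(u)^2), rewritten as 2 * (log 2 - u - softplus(-2u))
    so that it stays finite when |u| is large."""
    return 2.0 * (math.log(2.0) - u - softplus(-2.0 * u))


def sample_squashed_action(means, stds, rng):
    """Sample a tanh-squashed diagonal-Gaussian action and its log-probability.

    Illustrative helper (an assumption, not this package's API).
    """
    actions = []
    log_prob = 0.0
    for mean, std in zip(means, stds):
        u = rng.gauss(mean, std)        # unbounded pre-squash sample
        actions.append(math.tanh(u))    # bounded action in (-1, 1)
        # Change of variables: log pi(a) = log mu(u) - log(1 - tanh(u)^2)
        log_prob += gaussian_log_prob(u, mean, std) - tanh_log_det_jacobian(u)
    return actions, log_prob


if __name__ == "__main__":
    rng = random.Random(0)
    action, logp = sample_squashed_action([0.0, 0.5], [1.0, 0.2], rng)
    print(action, logp)
```

In a PyTorch actor the same correction is usually applied to a reparameterised sample so that gradients flow through `u`; the scalar version above only illustrates the arithmetic.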