This pull request replaces large parts of the implementation with soft actor-critic code for better compatibility and easier maintenance. It also changes how the action bounds are enforced: InputBounds is replaced with a squashing function (tanh).
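
For reviewers unfamiliar with tanh squashing, here is a minimal sketch of the idea in plain Python. It is not the code from this pull request: the function names `squash_action` and `log_prob_correction` and the `low`/`high` bound arguments are hypothetical and chosen only for illustration. The unbounded policy output is passed through tanh and rescaled into the action range, and the log-density of the squashed action is obtained by subtracting a change-of-variables term from the log-density of the unsquashed sample.

```python
import math


def squash_action(raw_action, low, high):
    """Map unbounded values into [low, high] elementwise using tanh.

    Hypothetical helper: raw_action, low and high are equal-length lists.
    """
    squashed = []
    for u, lo, hi in zip(raw_action, low, high):
        scale = 0.5 * (hi - lo)    # half-width of the allowed interval
        center = 0.5 * (hi + lo)   # midpoint of the allowed interval
        squashed.append(center + scale * math.tanh(u))
    return squashed


def log_prob_correction(raw_action, low, high, eps=1e-6):
    """Return sum_i log|da_i/du_i| for a_i = center_i + scale_i * tanh(u_i).

    Subtract this from the log-density of the unsquashed sample to get the
    log-density of the squashed action. eps guards against log(0) when
    tanh saturates.
    """
    total = 0.0
    for u, lo, hi in zip(raw_action, low, high):
        scale = 0.5 * (hi - lo)
        total += math.log(scale * (1.0 - math.tanh(u) ** 2) + eps)
    return total
```

Because tanh maps every real number into [-1, 1], the squashed action cannot leave the bounds, so no separate clipping step is needed in this sketch.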