Closed felixchalumeau closed 2 years ago
Merging #101 (d68a5a4) into develop (e58472e) will not change coverage. The diff coverage is 100.00%.

Coverage Diff | develop | #101
---|---|---
Coverage | 90.72% | 90.72%
Files | 82 | 82
Lines | 4572 | 4572
Hits | 4148 | 4148
Misses | 424 | 424

Impacted Files | Coverage Δ | |
---|---|---|
qdax/environments/pointmaze.py | 95.37% <100.00%> (ø) |
Related to #37
All environments have an action space that is a Cartesian product of [-1, 1] intervals, except PointMaze, which uses [-0.1, 0.1]. This PR fixes this by changing the action space of PointMaze to [-1, 1]^2.
This has a small impact on the optimization process, as illustrated by the following plots.
Optim with MAP-Elites and the previous action space:
Optim with MAP-Elites and the new action space:
The slight difference in the optimization process arises because the previous implementation was fed policy outputs in the range [-1, 1], which were then clipped to the environment's action bounds of [-0.1, 0.1]. Any output with magnitude above 0.1 therefore saturated at the bound. The new implementation standardizes the action space so that the environment's bounds are [-1, 1], and then rescales actions to [-0.1, 0.1] internally within the environment. We would expect the old implementation to produce results closer to the new one if the policy outputs were rescaled to [-0.1, 0.1] before being passed into the old environment.
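The difference between clipping and rescaling can be sketched in a few lines. This is a minimal illustration in plain Python, not the code in `qdax/environments/pointmaze.py`: the function names and constants are hypothetical, and clipping to [-1, 1] before scaling in the new behaviour is an assumption.

```python
from typing import Sequence, Tuple

# Hypothetical constants for illustration only.
POLICY_BOUND = 1.0  # policy outputs are assumed to lie in [-1, 1]
ENV_BOUND = 0.1     # largest displacement applied inside the maze


def clip(value: float, low: float, high: float) -> float:
    """Clamp a scalar to the closed interval [low, high]."""
    return max(low, min(high, value))


def old_style_action(policy_output: Sequence[float]) -> Tuple[float, ...]:
    """Previous behaviour as described above: clip directly to [-0.1, 0.1]."""
    return tuple(clip(a, -ENV_BOUND, ENV_BOUND) for a in policy_output)


def new_style_action(policy_output: Sequence[float]) -> Tuple[float, ...]:
    """New behaviour as described above: accept [-1, 1], then scale to [-0.1, 0.1].

    Clipping to [-1, 1] first is an assumption of this sketch.
    """
    scale = ENV_BOUND / POLICY_BOUND
    return tuple(scale * clip(a, -POLICY_BOUND, POLICY_BOUND) for a in policy_output)


def rescale_for_old_env(policy_output: Sequence[float]) -> Tuple[float, ...]:
    """Rescale policy outputs to [-0.1, 0.1] before passing them to the old environment."""
    scale = ENV_BOUND / POLICY_BOUND
    return tuple(scale * a for a in policy_output)


if __name__ == "__main__":
    output = (0.5, -0.05)
    print("old:", old_style_action(output))  # first component saturates at the bound
    print("new:", new_style_action(output))  # both components shrink by a factor of 10
    print("old after rescaling:", old_style_action(rescale_for_old_env(output)))
```

With the old behaviour, every policy output whose magnitude exceeds 0.1 maps to the same saturated action, whereas the new behaviour preserves the relative magnitude of the outputs. Rescaling before calling the old version reproduces the new version for outputs within [-1, 1].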