Update the action space of PointMaze to be [-1, 1]^2

adaptive-intelligent-robotics / QDax

Accelerated Quality-Diversity

MIT License


Related to #37

All environments have an action space that is a Cartesian product of [-1, 1] intervals, except PointMaze, which uses [-0.1, 0.1]^2. This PR removes that inconsistency by changing the action space of PointMaze to [-1, 1]^2.

This has a small impact on the optimization process, as illustrated by the following plots.

Optimization with MAP-Elites and the previous action space (plot: map-elites-ptmaze-old-action-space)

Optimization with MAP-Elites and the new action space (plot: map-elites-ptmaze-new-action-space)

The slight difference in the optimization process arises because the previous implementation was fed actions in the range [-1, 1], which the environment then clipped to its defined bounds of [-0.1, 0.1]. As a result, any action component with magnitude above 0.1 saturated at the bound. The new implementation standardizes the environment's action bounds to [-1, 1] and scales actions to [-0.1, 0.1] internally. We would expect results closer to those of the new implementation if the policy outputs were rescaled to [-0.1, 0.1] before being passed to the old environment implementation.
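The difference between clipping and scaling can be sketched in a few lines of plain Python. This is only an illustration, not the code from `qdax/environments/pointmaze.py`: the helper names `old_env_action` and `new_env_action`, the `SCALE` constant, and the clip to [-1, 1] before scaling are all assumptions made for the example.

```python
# Illustrative sketch only; names and the clip-then-scale order are assumptions,
# not the actual QDax implementation.
OLD_LOW, OLD_HIGH = -0.1, 0.1  # previous PointMaze action bounds
SCALE = 0.1                    # assumed internal scaling factor in the new version


def old_env_action(action):
    """Previous behaviour: clip each component to [-0.1, 0.1]."""
    return tuple(min(max(a, OLD_LOW), OLD_HIGH) for a in action)


def new_env_action(action):
    """New behaviour (assumed): clip to [-1, 1], then scale to [-0.1, 0.1]."""
    return tuple(SCALE * min(max(a, -1.0), 1.0) for a in action)


# A policy output in [-1, 1]^2.
policy_output = (0.5, -0.05)

# Old environment: the first component saturates at the bound.
clipped = old_env_action(policy_output)

# New environment: both components keep their relative magnitude.
scaled = new_env_action(policy_output)

# Rescaling the policy output before the old environment mimics the new one.
rescaled_then_old = old_env_action(tuple(SCALE * a for a in policy_output))
```

With clipping, every action component whose magnitude exceeds 0.1 maps to the same saturated value, whereas scaling preserves the relative size of the components. That is one plausible reason the two optimization curves differ slightly.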

Codecov Report

Merging #101 (d68a5a4) into develop (e58472e) will not change coverage. The diff coverage is 100.00%.

@@           Coverage Diff            @@
##           develop     #101   +/-   ##
========================================
  Coverage    90.72%   90.72%
========================================
  Files           82       82
  Lines         4572     4572
========================================
  Hits          4148     4148
  Misses         424      424

Impacted Files	Coverage Δ
qdax/environments/pointmaze.py	`95.37% <100.00%> (ø)`

