This task converges quickly, so it makes a good environment for testing our algorithms.
With distance-based reward shaping, the task is solved in 6-7 minutes; without reward shaping, it takes 1 hour 15 minutes.
I also accidentally (because of a bug) ran a 2D version of the experiment, which is solved in 4-5 minutes with reward shaping and in 30 minutes without it.
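
For readers unfamiliar with the term, here is a minimal sketch of what distance-based reward shaping can look like. It is not the code used in these experiments: the function names, the `tolerance` and `scale` parameters, and the assumption that the task has a goal position are all hypothetical and chosen only for illustration.

```python
import math


def sparse_reward(position, goal, tolerance=0.05):
    """Unshaped reward: 1.0 only when within `tolerance` of the goal (assumed task)."""
    return 1.0 if math.dist(position, goal) <= tolerance else 0.0


def shaped_reward(position, goal, tolerance=0.05, scale=1.0):
    """Sparse reward minus a penalty proportional to the distance to the goal.

    `scale` is a hypothetical weight on the distance term.
    """
    return sparse_reward(position, goal, tolerance) - scale * math.dist(position, goal)
```

Because `math.dist` accepts points of any matching dimension, the same sketch covers both the 3D and the 2D variants. The shaped version gives the agent a learning signal on every step instead of only at the goal, which is one plausible reason for the large gap in solve time.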