IntelLabs / coach

Reinforcement Learning Coach by Intel AI Lab enables easy experimentation with state of the art Reinforcement Learning algorithms
https://intellabs.github.io/coach/
Apache License 2.0
2.32k stars, 461 forks

TD3 #338

Closed: gal-leibovich closed this 5 years ago

gal-leibovich commented 5 years ago

Ready for review - reproducing and even surpassing paper results.

Note: Slight renaming of the noise_percentage_schedule parameter resulted in some changes in unrelated files.

Also, some small fixes to DDPG.

gal-leibovich commented 5 years ago

The TD3 implementation in this repo significantly outperforms the results published in the paper. Coach uses v2 environments while the official TD3 repo uses v1 environments, but performance on the two should be similar, e.g. as stated and shown here.

| Env | TD3 (Coach) | TD3 (official repo) |
|---|---|---|
| Hopper | 3628 ± 80 | 3564 ± 115 |
| Walker2D | 4766 ± 800 | 4682 ± 540 |
| Reacher | -3.468 ± 0.35 | -3.6 ± 0.56 |
| Half Cheetah | 10888 ± 1250 | 9637 ± 859 |
| Ant | 5090 ± 1000 | 4372 ± 1000 |
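
For reviewers who want to sanity-check the numbers, here is a minimal sketch that computes the per-environment gain of the Coach run over the official repo and checks whether the reported intervals overlap. The values are copied from the table. The helper names (`RESULTS`, `compare`, `summarize`) are made up for illustration and are not part of Coach. The thread does not say what the "+-" figure measures, so treating it as a symmetric spread around the mean is an assumption.

```python
# Values copied from the table in this comment: (mean, spread) per environment,
# first for TD3 (Coach), then for TD3 (official repo).
# Assumption: the "+-" figure is a symmetric spread around the mean
# (e.g. a standard deviation over seeds); the thread does not specify this.
RESULTS = {
    "Hopper": ((3628, 80), (3564, 115)),
    "Walker2D": ((4766, 800), (4682, 540)),
    "Reacher": ((-3.468, 0.35), (-3.6, 0.56)),
    "Half Cheetah": ((10888, 1250), (9637, 859)),
    "Ant": ((5090, 1000), (4372, 1000)),
}


def compare(coach, official):
    """Return (absolute gain, relative gain, intervals_overlap) for one environment."""
    (c_mean, c_spread), (o_mean, o_spread) = coach, official
    gain = c_mean - o_mean
    # Divide by |official mean| so the sign stays meaningful for negative returns.
    rel_gain = gain / abs(o_mean)
    # Two intervals [mean - spread, mean + spread] overlap if each one
    # starts before the other one ends.
    intervals_overlap = (
        c_mean - c_spread <= o_mean + o_spread
        and o_mean - o_spread <= c_mean + c_spread
    )
    return gain, rel_gain, intervals_overlap


def summarize(results=RESULTS):
    """Apply compare() to every environment in the results table."""
    return {env: compare(c, o) for env, (c, o) in results.items()}


if __name__ == "__main__":
    for env, (gain, rel, overlap) in summarize().items():
        print(f"{env:<13} gain={gain:+10.3f} ({rel:+.1%}) overlap={overlap}")
```

With the table's numbers, every gain is positive and the intervals overlap in each environment, so the spread is worth keeping in mind when reading the means.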

Comparison of TD3 (labeled FixedCriticInputBugAndBootstrapBug in the plots) to DDPG (no batchnorm)

Plots (images not included): Hopper, Walker2D, Reacher, Half Cheetah, Ant.