MKorablyov / LambdaZero

RL Updates #129

Closed MJ10 closed 3 years ago

MJ10 commented 4 years ago

This PR adds several RL-based experiments. #107 was merged into this branch, so please merge that PR before merging this one.

Overview of changes:

  1. Improve Persistent Buffer: add support for a similarity threshold, random episode restarts, and the graph-based environment.
  2. Random Network Distillation: add an implementation of RND on the graph-based environment.
  3. PPO Additional Experiments: add configuration for entropy regularization and environment parameters.
  4. AlphaZero: update the AlphaZero implementation and add support for policy-optimization-based improvements.
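
For reviewers unfamiliar with item 2, here is a minimal sketch of the general RND idea. It is not the code in this PR. A fixed, randomly initialised target network embeds each observation, a predictor is trained to match it, and the prediction error serves as an exploration bonus. The class name `RNDSketch`, the linear networks, the flat observation vectors, and all hyperparameters are assumptions made for illustration; the PR itself works on a graph-based environment.

```python
import numpy as np


class RNDSketch:
    """Illustrative Random Network Distillation bonus (hypothetical, not the PR's code)."""

    def __init__(self, obs_dim, feat_dim=16, lr=0.5, seed=0):
        rng = np.random.default_rng(seed)
        # Fixed, randomly initialised target network; it is never trained.
        self.w_target = rng.normal(size=(obs_dim, feat_dim))
        # Predictor network, trained to imitate the target on visited states.
        self.w_pred = rng.normal(size=(obs_dim, feat_dim))
        self.lr = lr

    def intrinsic_reward(self, obs):
        # Per-observation prediction error: large for states the predictor
        # has not been trained on, small for familiar ones.
        err = obs @ self.w_pred - obs @ self.w_target
        return np.mean(err ** 2, axis=1)

    def update(self, obs):
        # One gradient-descent step on the mean squared prediction error.
        err = obs @ self.w_pred - obs @ self.w_target
        grad = 2.0 * obs.T @ err / err.size
        self.w_pred -= self.lr * grad


# Usage: the bonus on a batch shrinks as the predictor is trained on it.
rnd = RNDSketch(obs_dim=8)
batch = np.random.default_rng(1).normal(size=(32, 8))
bonus_before = rnd.intrinsic_reward(batch).mean()
for _ in range(50):
    rnd.update(batch)
bonus_after = rnd.intrinsic_reward(batch).mean()
```

In an RL loop, this bonus would typically be scaled and added to the environment reward, so that rarely visited states look more attractive to the policy.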