cogment / cogment-verse

Research platform for Human-in-the-loop learning (HILL) & Multi-Agent Reinforcement Learning (MARL)
https://cogment.ai/cogment_verse
Apache License 2.0

Async PPO for Mujoco #181

Closed by lhnguyen102 1 year ago

lhnguyen102 commented 1 year ago

Description

This pull request implements the Async PPO (Proximal Policy Optimization) algorithm for the Mujoco environment. Async PPO leverages Cogment's microservices architecture to improve training efficiency and stability by running multiple agents in parallel.
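For reviewers less familiar with the idea, here is a minimal sketch of the general pattern, not the code in this pull request. It assumes plain Python threads as stand-ins for Cogment actor services, and the names `actor`, `collect`, and `mean_objective`, as well as the placeholder transition values, are hypothetical. Only the clipped surrogate objective is standard PPO.

```python
import math
import queue
import threading


def clipped_surrogate(new_logp, old_logp, advantage, clip_eps=0.2):
    """Standard PPO clipped surrogate objective for one sample (to be maximized)."""
    ratio = math.exp(new_logp - old_logp)
    clipped = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
    return min(ratio * advantage, clipped * advantage)


def actor(actor_id, num_steps, out_queue):
    """Hypothetical actor: pushes placeholder transitions to the learner's queue."""
    for _ in range(num_steps):
        # Placeholder values; a real actor would step a Mujoco environment here.
        out_queue.put(
            {"actor": actor_id, "old_logp": -1.0, "new_logp": -0.9, "advantage": 1.0}
        )


def collect(num_actors=4, num_steps=8):
    """Run the actors in parallel threads and gather every transition they produce."""
    transitions = queue.Queue()
    threads = [
        threading.Thread(target=actor, args=(i, num_steps, transitions))
        for i in range(num_actors)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return [transitions.get() for _ in range(num_actors * num_steps)]


def mean_objective(batch, clip_eps=0.2):
    """Average the clipped objective over a batch of collected transitions."""
    values = [
        clipped_surrogate(t["new_logp"], t["old_logp"], t["advantage"], clip_eps)
        for t in batch
    ]
    return sum(values) / len(values)


if __name__ == "__main__":
    batch = collect()
    print(len(batch), mean_objective(batch))
```

The queue decouples data collection from the policy update, which is the property an asynchronous setup relies on. In a Cogment deployment the actors would presumably be separate services rather than threads.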

Changes Made

Related Issue

close #177

Steps to Test

  1. Run the following command on the local machine: python -m main +experiment=appo/hopper
  2. Or run the Sagemaker notebook at ./cloud/sagemaker_trainer.ipynb

Notes for Reviewers