Remtasya / Distributional-Multi-Agent-Actor-Critic-Reinforcement-Learning-MADDPG-Tennis-Environment

The state of the art in multi-agent reinforcement learning is the MADDPG algorithm, which uses DDPG actor-critic neural networks: each agent is trained with a centralized critic but executes with a decentralized actor, and the method can learn in either cooperative or competitive environments. This is demonstrated on the Unity Tennis environment.

train_agent.ipynb #2

Open JingdiC opened 3 years ago

JingdiC commented 3 years ago

Hi, I ran your train_agent file, but I only got an average reward of 0.01 after 3000 episodes. I did not get the results you reported, and I don't know why.

Remtasya commented 7 months ago

Hi JingdiC - Multi-agent reinforcement learning is highly unstable: it exhibits large variation between runs and chaotic feedback loops, and a run often fails simply because random variation gave it a poor initialisation. Try running it multiple times (e.g. 5-10 times) with different initialisation seeds and saving the run that performs best.
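A minimal sketch of that multi-seed loop is below. It is not the notebook's actual code: `run_with_seeds` and `dummy_train` are hypothetical names, and `dummy_train` is only a stand-in that returns a pseudo-random score so the example runs on its own. In practice you would replace it with a function that seeds the run, trains the agents, and returns the final average reward.

```python
import random
from typing import Callable, List, Tuple


def run_with_seeds(
    train_fn: Callable[[int], float], seeds: List[int]
) -> Tuple[int, float, List[Tuple[int, float]]]:
    """Call train_fn once per seed; return the best seed, its score, and all results."""
    results = []
    for seed in seeds:
        score = train_fn(seed)  # one full, independent training run
        results.append((seed, score))
    # Keep the run with the highest final score.
    best_seed, best_score = max(results, key=lambda r: r[1])
    return best_seed, best_score, results


def dummy_train(seed: int) -> float:
    """Hypothetical stand-in for a real training run (assumption, not the repo's code).

    It only draws a seeded pseudo-random number to imitate run-to-run variation.
    """
    rng = random.Random(seed)
    return rng.random()


best_seed, best_score, all_results = run_with_seeds(dummy_train, seeds=list(range(5)))
for seed, score in all_results:
    print(f"seed={seed} score={score:.3f}")
print(f"best seed: {best_seed} (score {best_score:.3f})")
```

For a real run, the training function would typically also seed every source of randomness it uses (for example `random.seed`, `numpy.random.seed` and `torch.manual_seed`) and save the model weights of each run, so that the best one can be reloaded afterwards.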