RomDeffayet / DDPG_multi_agent

Deep Deterministic Policy Gradient (DDPG) in a multi-agent particle environment
MIT License

Benchmarking and visualizing the episodes #1

Open tulbahfh opened 1 year ago

tulbahfh commented 1 year ago

Hello Mr. Deffayet,

Thank you so much for this amazing repository! I'm currently using your project to better understand the implementation of DDPG. To that end, I'd like to use the benchmarking feature to quantify the events of each episode. Could you kindly give me some directions on this, please?

Thank you so much and have a wonderful day!

Sincerely,

Faris Tulbah

RomDeffayet commented 1 year ago

Hello Faris,

Although this is a very old project, and my coding skills were minimal at the time, I'm happy you are interested in it. Just so that there is no confusion, this repo contains code for multi-agent DDPG as presented in this paper, whose official code is here. This is not the standard single-agent DDPG algorithm. If you're looking for a readable implementation of DDPG, I cannot recommend cleanrl.dev enough: it has extensive documentation and lets you benchmark algorithms in all sorts of ways!

As I said, this repo is quite old and poorly written, but if you wish to visualize an episode, you can use the render parameter of gym environments: python episode.py --n_episode 1 --render True should do what you want. To plot the learning curve, as in the readme, you can use the functions from plot_results.py once training is done (a file named evolution is saved after training).
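
For the learning curve, raw per-episode returns are usually noisy, so it helps to smooth them before plotting. Here is a minimal sketch of a trailing moving average. It assumes the returns have already been loaded into a plain Python list; the format of the saved evolution file and the functions in plot_results.py are not described in this thread, so `moving_average` and its `window` parameter are hypothetical names, not part of the repo.

```python
from collections import deque


def moving_average(values, window=100):
    """Return the trailing mean over at most `window` recent values.

    Hypothetical helper (not from the repo): the i-th output is the mean
    of values[max(0, i - window + 1) : i + 1].
    """
    if window < 1:
        raise ValueError("window must be >= 1")
    recent = deque()      # values currently inside the window
    running_sum = 0.0     # sum of the values in `recent`
    smoothed = []
    for v in values:
        recent.append(v)
        running_sum += v
        if len(recent) > window:
            # Drop the oldest value once the window is full.
            running_sum -= recent.popleft()
        smoothed.append(running_sum / len(recent))
    return smoothed
```

The smoothed list has the same length as the input, so it can be passed to any plotting library against the episode index.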

I hope this helps and you'll have fun exploring RL algorithms :)

tulbahfh commented 1 year ago

Got it! Thanks! Quick question: I was able to get the plot to work, but I wanted to understand the episode-length part of the code. What dictates the length of an episode? Is it when a predator catches a prey, or is there another metric? Thanks!

RomDeffayet commented 1 year ago

Yes, if I remember correctly, the episode finishes as soon as the predator catches the prey. The details of the environments are available here: https://github.com/openai/multiagent-particle-envs
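
To make the episode-length question concrete, here is a rough sketch of a rollout loop that stops either when the environment reports a catch or when a step cap is reached. It uses a toy 1-D stand-in environment rather than the real particle environments; `ToyChase`, `run_episode`, and `max_steps` are hypothetical names, and whether this repo applies a step cap at all is an assumption to check in episode.py.

```python
class ToyChase:
    """Hypothetical 1-D stand-in: a predator steps toward a fixed prey."""

    def __init__(self, predator=0, prey=5):
        self.predator = predator
        self.prey = prey

    def step(self, action):
        # action is -1, 0 or +1 (move left, stay, move right)
        self.predator += action
        caught = self.predator == self.prey
        reward = 10.0 if caught else 0.0
        return self.predator, reward, caught


def run_episode(env, policy, max_steps=25):
    """Roll out one episode; return (steps taken, total reward, caught flag)."""
    total_reward, steps, done = 0.0, 0, False
    obs = env.predator
    # The episode ends on whichever comes first: a catch or the step cap.
    while not done and steps < max_steps:
        action = policy(obs, env.prey)
        obs, reward, done = env.step(action)
        total_reward += reward
        steps += 1
    return steps, total_reward, done


def chase(obs, prey):
    """Trivial policy: always move one step toward the prey."""
    return 1 if obs < prey else -1
```

With this toy setup, the episode length is the number of steps until the catch, or `max_steps` if the predator never gets there, which is the quantity the length-of-episode plot would report under these assumptions.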

tulbahfh commented 1 year ago

Understood, thanks!