michaelkoelle / marl-aquarium

Aquarium: A Comprehensive Framework for Exploring Predator-Prey Dynamics through Multi-Agent Reinforcement Learning Algorithms
MIT License

PPO Training Algorithm and Training Result #5

Open zxcvbnjs opened 5 months ago

zxcvbnjs commented 5 months ago

Dear Author,

I am trying to implement the PPO algorithm to replace the random policy in the provided example, but I find that the predator or prey only learns to move in a straight line rather than taking more flexible actions. I may have made a mistake somewhere, but I do not know how to fix it or how to obtain experimental results similar to those in your paper. Could you release more experimental details and show how to implement the predator and prey algorithms so that the agents can be trained together or separately?

I would sincerely appreciate your help and a reply, if possible!

Thank you very much!