Can someone tell me why agents go beyond bounds when testing?

openai / maddpg

Code for the MADDPG algorithm from the paper "Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments"

https://arxiv.org/pdf/1706.02275.pdf

MIT License

1.66k stars, 494 forks

Open glong1997 opened 4 years ago

2197808908a commented 5 months ago

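Yes, the agents do go out of bounds. There seems to be a reward penalty for agents that do, though, so all I can say is that the environment still has some problems.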
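To illustrate the difference between a reward penalty and a hard boundary, here is a minimal sketch. It assumes a square arena with coordinates in [-1, 1]. The function names, thresholds, and penalty shape are hypothetical and are not taken from the repository or its environment code. A penalty only discourages leaving the arena, so an agent can still cross the boundary, while clipping prevents it.

```python
import math


def boundary_penalty(coord, soft_limit=0.9, hard_limit=1.0, max_penalty=10.0):
    """Hypothetical soft penalty for one coordinate of an agent's position.

    Zero well inside the arena, a linear ramp between soft_limit and
    hard_limit, and a capped exponential beyond hard_limit.
    """
    x = abs(coord)
    if x < soft_limit:
        return 0.0
    if x < hard_limit:
        # Ramp from 0 at soft_limit to 1 at hard_limit.
        return (x - soft_limit) / (hard_limit - soft_limit)
    return min(math.exp(2.0 * x - 2.0), max_penalty)


def clip_position(pos, limit=1.0):
    """Hypothetical hard boundary: force every coordinate into [-limit, limit]."""
    return [max(-limit, min(limit, c)) for c in pos]


# An agent that has drifted outside the arena on the x axis.
pos = [1.3, -0.5]

# Penalty approach: the reward is reduced, but the position is unchanged.
penalty = sum(boundary_penalty(c) for c in pos)

# Clipping approach: the position itself is pulled back inside.
clipped = clip_position(pos)
```

With the penalty approach the agent's position stays at `[1.3, -0.5]` and only its reward changes, so a policy that has not fully learned to avoid the edge can still leave the arena at test time.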