huawei-noah / SMARTS

Scalable Multi-Agent RL Training School for Autonomous Driving
MIT License
909 stars 184 forks

How to train with algorithms other than PPO #2147

Closed Ppig01 closed 2 months ago

Ppig01 commented 3 months ago

High Level Description




Operating System

Ubuntu 20.04


Adaickalavan commented 2 months ago

Hi @Ppig01

1) To use your own customized training code, it is best to start by familiarising yourself with how to interact with the multi-agent SMARTS environment.
2) A multi-agent SMARTS example is given here.
3) Calling env.step(actions) returns observations, rewards, terminateds, and truncateds. Given these outputs, you can train your own policy to yield the next actions, thereby replacing the default RandomLanerAgent policy used in the example.
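
To make step 3 more concrete, here is a minimal sketch of such a loop with a learner that is not PPO. Everything in it is an assumption for illustration: `StubMultiAgentEnv` is a hypothetical toy stand-in, not the SMARTS environment; `QLearningPolicy` and `train` are made-up names; tabular Q-learning is just one possible non-PPO algorithm; and the `"__all__"` key used to detect the end of an episode should be checked against the SMARTS version you run. Real SMARTS observations are rich structures, so you would need to convert them to features before a learner like this could use them.

```python
import random
from collections import defaultdict


class StubMultiAgentEnv:
    """Hypothetical toy env (NOT SMARTS); it only mimics per-agent dict outputs."""

    N_OBS = 4  # toy discrete observations 0..3

    def __init__(self, agent_ids, horizon=20, seed=0):
        self.agent_ids = list(agent_ids)
        self.horizon = horizon
        self.rng = random.Random(seed)
        self.t = 0
        self.obs = {}

    def _sample_obs(self):
        return {a: self.rng.randrange(self.N_OBS) for a in self.agent_ids}

    def reset(self):
        self.t = 0
        self.obs = self._sample_obs()
        return self.obs, {}

    def step(self, actions):
        self.t += 1
        # Toy reward: 1.0 when the action equals the agent's current observation.
        rewards = {a: float(actions[a] == self.obs[a]) for a in self.agent_ids}
        self.obs = self._sample_obs()
        timed_out = self.t >= self.horizon
        terminateds = {a: False for a in self.agent_ids}
        truncateds = {a: timed_out for a in self.agent_ids}
        # Assumption: an "__all__" key flags the end of the whole episode.
        terminateds["__all__"] = False
        truncateds["__all__"] = timed_out
        return self.obs, rewards, terminateds, truncateds, {}


class QLearningPolicy:
    """Tabular epsilon-greedy Q-learning, used here as an example non-PPO learner."""

    def __init__(self, n_actions, lr=0.5, gamma=0.9, epsilon=0.1, seed=0):
        self.n_actions = n_actions
        self.lr, self.gamma, self.epsilon = lr, gamma, epsilon
        self.rng = random.Random(seed)
        self.q = defaultdict(lambda: [0.0] * n_actions)

    def act(self, obs):
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(self.n_actions)  # explore
        values = self.q[obs]
        return max(range(self.n_actions), key=values.__getitem__)  # exploit

    def update(self, obs, action, reward, next_obs, done):
        # Simplification: truncation is treated like termination (no bootstrap).
        bootstrap = 0.0 if done or next_obs is None else max(self.q[next_obs])
        target = reward + self.gamma * bootstrap
        self.q[obs][action] += self.lr * (target - self.q[obs][action])


def train(env, policies, episodes=50):
    """Run episodes, updating each agent's policy after every env.step call."""
    episode_returns = []
    for _ in range(episodes):
        observations, _ = env.reset()
        done, total = False, 0.0
        while not done:
            actions = {a: policies[a].act(o) for a, o in observations.items()}
            next_obs, rewards, terminateds, truncateds, _ = env.step(actions)
            for a, action in actions.items():
                agent_done = terminateds[a] or truncateds[a]
                policies[a].update(
                    observations[a], action, rewards[a], next_obs.get(a), agent_done
                )
                total += rewards[a]
            done = terminateds["__all__"] or truncateds["__all__"]
            observations = next_obs
        episode_returns.append(total)
    return episode_returns


# Toy usage: observations are i.i.d. in the stub, so gamma=0.0 is enough here.
env = StubMultiAgentEnv(["Agent_0", "Agent_1"], seed=1)
policies = {
    a: QLearningPolicy(n_actions=4, gamma=0.0, seed=i)
    for i, a in enumerate(env.agent_ids)
}
returns = train(env, policies, episodes=200)
print("mean return over last 20 episodes:", sum(returns[-20:]) / 20)
```

To try this against SMARTS itself, you would replace the stub with the environment created in the multi-agent example and keep the shape of `train` the same. Only the policy's `act` and `update` methods need to change when you swap in a different algorithm.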