hanruihua / rl_rvo_nav

The source code of the [RA-L] paper "Reinforcement Learned Distributed Multi-Robot Navigation with Reciprocal Velocity Obstacle Shaped Rewards"

I cannot reproduce the result #16

Closed by lukeprotag 4 months ago

lukeprotag commented 5 months ago

I can only obtain good results in the 4-robot setting; when I use those weights to train with 10 robots, the results are poor.

hanruihua commented 5 months ago

Hi,

You can try adjusting the coefficients of the reward function, which are important for the training outcome.

jijiking-hh commented 5 months ago

Hello, I have the same question. How should the reward function be set for 10 agents in a dynamic environment? Could you demonstrate it?

lukeprotag commented 5 months ago

> Hello, I have the same question. How should the reward function be set for 10 agents in a dynamic environment? Could you demonstrate it?

In train_process.py, I found:

par_env.add_argument('--reward_parameter', type=float, default=(3.0, 0.3, 0.0, 6.0, 0.3, 3.0, -0, 0), nargs='+')
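Since the flag is declared with type=float and nargs='+', it can be overridden from the command line as a space-separated list of numbers. A minimal sketch of how that override is parsed (the example invocation assumes train_process.py is run directly as the training entry point, which may differ from your setup):

```python
import argparse

# Reproduce the relevant argument declaration quoted above.
par_env = argparse.ArgumentParser()
par_env.add_argument('--reward_parameter', type=float,
                     default=(3.0, 0.3, 0.0, 6.0, 0.3, 3.0, -0, 0), nargs='+')

# Equivalent to running, for example:
#   python train_process.py --reward_parameter 4.0 0.3 0.0 6.0 0.3 3.0 0 0
args = par_env.parse_args('--reward_parameter 4.0 0.3 0.0 6.0 0.3 3.0 0 0'.split())
print(args.reward_parameter)  # [4.0, 0.3, 0.0, 6.0, 0.3, 3.0, 0.0, 0.0]
```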

jijiking-hh commented 5 months ago

Thank you. Actually, I had also noticed this, but the correspondence between these parameters and the terms of the reward function, their actual significance, and the basis for adjusting them are still unclear.
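For readers hitting the same question, below is a purely illustrative sketch of how an 8-element coefficient vector like this is commonly wired into an RVO-shaped navigation reward (terminal arrival bonus, collision penalty, goal-approach shaping, and a penalty tied to violating the reciprocal-velocity-obstacle constraint). The term names and the index-to-term mapping are assumptions made for illustration only; they are not taken from this repository's code or the paper, so the authoritative mapping remains the reward computation in the environment code.

```python
def shaped_reward(arrived, collided, dist_to_goal, prev_dist_to_goal,
                  rvo_violation, p=(3.0, 0.3, 0.0, 6.0, 0.3, 3.0, 0.0, 0.0)):
    """Hypothetical reward using a coefficient vector p[0..7].

    The structure (sparse terminal terms plus dense shaping terms) mirrors
    common RVO-shaped rewards, but the assignment of indices to terms here
    is a guess, not the repository's actual formula.
    """
    reward = 0.0
    if arrived:      # terminal bonus for reaching the goal
        reward += p[0]
    if collided:     # terminal penalty for a collision
        reward -= p[3]
    # dense shaping: reward progress toward the goal
    reward += p[1] * (prev_dist_to_goal - dist_to_goal)
    # dense shaping: penalize actions that violate the RVO constraint
    reward -= p[4] * rvo_violation
    # remaining coefficients (p[2], p[5], p[6], p[7]) would scale further
    # terms (e.g. a per-step time penalty or velocity-tracking bonus)
    return reward
```

As a general heuristic (not the authors' recommendation), when scaling from 4 to 10 robots the terms most often retuned in a structure like this are the collision penalty and the RVO-violation penalty, since inter-robot interactions dominate in denser scenes.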