Hi @QMBX,
You can use the --num_obstacles flag. We created a test submission bash script where we randomly chose the number of obstacles:
python -u onpolicy/scripts/train_mpe.py --use_valuenorm --use_popart \
--project_name "informarl" \
--env_name "GraphMPE" \
--algorithm_name "rmappo" \
--seed 0 \
--experiment_name "informarl" \
--scenario_name "navigation_graph" \
--num_agents 3 \
--num_obstacles $((1 + $RANDOM % 10)) \
--collision_rew 5 \
--n_training_threads 1 --n_rollout_threads 128 \
--num_mini_batch 1 \
--episode_length 25 \
--num_env_steps 2000000 \
--ppo_epoch 10 --use_ReLU --gain 0.01 --lr 7e-4 --critic_lr 7e-4 \
--user_name "marl" \
--use_cent_obs "False" \
--graph_feat_type "relative" \
--auto_mini_batch_size --target_mini_batch_size 128
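For reference, $((1 + $RANDOM % 10)) expands to a random integer between 1 and 10 each time the script is launched, since bash's $RANDOM yields a value in 0-32767. A quick sanity check in the shell:

# Print a few samples of the expression used for --num_obstacles above.
for i in $(seq 1 5); do
  echo "num_obstacles would be $((1 + RANDOM % 10))"
done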
I hope this answers your question.
Thanks! Sid
Thank you very much for your reply; I'm sorry I'm seeing it a day late. When I try to reproduce the table in the Scalability of inforMARL section, do I need to run multiple experiments with the same number of agents but different --num_obstacles settings, and then evaluate these models and take an average? And is it also the case that different --num_obstacles values are used during evaluation?
The Scalability of inforMARL table I'm referring to is the one shown below.
Oh, you meant those experiments! You don't need to train separate policies for different numbers of agents and obstacles. You can just train on $n$ agents and $d$ obstacles and test on $m \neq n$ agents and $f \neq d$ obstacles. You can execute the test script with:
python onpolicy/scripts/eval_mpe.py \
--model_dir=<add_file_path_to_saved_weights_folder> \
--render_episodes=1 \
--num_agents=3 \
--num_obstacles=$((1 + $RANDOM % 10)) \
--seed=1 \
--episode_length=25 \
--use_dones=False --collaborative=True \
--scenario_name='navigation_graph' --save_gifs --use_render
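If you want to fill out the whole scalability table in one go, a minimal sketch of a sweep could look like the loop below (the agent/obstacle counts are just placeholders, not the exact values from the paper, and the rendering flags are dropped so it runs headless):

# Sweep test-time agent/obstacle counts, reusing the same trained weights.
for n_agents in 3 7 15; do
  for n_obs in 3 6 10; do
    python onpolicy/scripts/eval_mpe.py \
    --model_dir=<add_file_path_to_saved_weights_folder> \
    --render_episodes=1 \
    --num_agents=$n_agents \
    --num_obstacles=$n_obs \
    --seed=1 \
    --episode_length=25 \
    --use_dones=False --collaborative=True \
    --scenario_name='navigation_graph'
  done
done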
I hope this answers your question!
I see, so that's it. Before this, I had treated the 0-10 obstacles setting as part of the environment configuration during the training phase.
Thank you very much for your help!
Glad that it got resolved. Closing this issue now. Please re-open if the issue persists.
Thanks, Sid
Hello @nsidn98, thank you for open-sourcing the paper's code. When I tried to run the program to reproduce the experimental results in the paper, I couldn't find the code for part of the scalability experiments. The experimental setup is mentioned in the paper:
But I can't find anything in the code about random obstacles. Can you tell me how to set this up, or which parts of the code I should change? Thank you.