AnujMahajanOxf / MAVEN

Submission for MAVEN: Multi-Agent Variational Exploration

How can I reproduce the results of Corridor #6

Closed GoingMyWay closed 4 years ago

GoingMyWay commented 4 years ago

Hi, which config and command should I use to reproduce the Corridor results?

GoingMyWay commented 4 years ago

@AnujMahajanOxf Hi, could you please help me reproduce the results for the corridor scenario? Which config should I run?

I tried the command below, but it reported that the map is not in the map registry. Why?

$ python src/main.py --config=noisemix_smac --env-config=sc2 with env_args.map_name=corridor t_max=10000000

pygame 1.9.4
Hello from the pygame community. https://www.pygame.org/contribute.html
[DEBUG 00:51:59] git.cmd Popen(['git', 'diff', '--cached', '--abbrev=40', '--full-index', '--raw'], cwd=/home/me/Projects/dist_marl/MAVEN, universal_newlines=False, shell=None, istream=None)
[DEBUG 00:51:59] git.cmd Popen(['git', 'diff', '--abbrev=40', '--full-index', '--raw'], cwd=/home/me/Projects/dist_marl/MAVEN, universal_newlines=False, shell=None, istream=None)
[DEBUG 00:51:59] git.cmd Popen(['git', 'cat-file', '--batch-check'], cwd=/home/me/Projects/dist_marl/MAVEN, universal_newlines=False, shell=None, istream=<valid stream>)
[DEBUG 00:51:59] git.cmd Popen(['git', 'diff', '--cached', '--abbrev=40', '--full-index', '--raw'], cwd=/home/me/Projects/dist_marl/MAVEN, universal_newlines=False, shell=None, istream=None)
[DEBUG 00:51:59] git.cmd Popen(['git', 'diff', '--abbrev=40', '--full-index', '--raw'], cwd=/home/me/Projects//MAVEN, universal_newlines=False, shell=None, istream=None)
[DEBUG 00:51:59] git.cmd Popen(['git', 'cat-file', '--batch-check'], cwd=/home/me/Projects/MAVEN, universal_newlines=False, shell=None, istream=<valid stream>)
[DEBUG 00:51:59] git.cmd Popen(['git', 'diff', '--cached', '--abbrev=40', '--full-index', '--raw'], cwd=/home/me/Projects/MAVEN, universal_newlines=False, shell=None, istream=None)
[DEBUG 00:51:59] git.cmd Popen(['git', 'diff', '--abbrev=40', '--full-index', '--raw'], cwd=/home/me/Projects/MAVEN, universal_newlines=False, shell=None, istream=None)
[DEBUG 00:51:59] git.cmd Popen(['git', 'cat-file', '--batch-check'], cwd=/home/me/Projects/MAVEN, universal_newlines=False, shell=None, istream=<valid stream>)
[INFO 00:51:59] root Saving to FileStorageObserver in results/sacred.
[DEBUG 00:52:00] pymarl Using capture mode "fd"
[INFO 00:52:00] pymarl Running command 'my_main'
[INFO 00:52:00] pymarl Started run with ID "9"
[DEBUG 00:52:00] pymarl Starting Heartbeat
[DEBUG 00:52:00] my_main Started
[INFO 00:52:00] my_main Experiment Parameters:
[INFO 00:52:00] my_main 

{   'action_selector': 'epsilon_greedy',
    'agent': 'noise_rnn',
    'agent_output_type': 'q',
    'bandit_batch': 64,
    'bandit_buffer': 512,
    'bandit_epsilon': 0.1,
    'bandit_iters': 8,
    'bandit_policy': True,
    'bandit_reward_scaling': 20,
    'bandit_use_state': True,
    'batch_size': 32,
    'batch_size_run': 8,
    'buffer_cpu_only': False,
    'buffer_size': 5000,
    'checkpoint_path': '',
    'critic_lr': 0.0005,
    'discrim_layers': 3,
    'discrim_size': 64,
    'double_q': True,
    'entropy_scaling': 0.001,
    'env': 'sc2',
    'env_args': {   'bunker_enter_range': 5,
                    'continuing_episode': False,
                    'difficulty': '7',
                    'game_version': '4.1.2',
                    'heuristic': False,
                    'map_name': 'corridor',
                    'move_amount': 2,
                    'obs_all_health': True,
                    'obs_instead_of_state': False,
                    'obs_last_action': False,
                    'obs_own_health': True,
                    'obs_pathing_grid': False,
                    'obs_terrain_height': False,
                    'restrict_actions': True,
                    'reward_death_value': 10,
                    'reward_defeat': 0,
                    'reward_negative_scale': 0.5,
                    'reward_only_positive': True,
                    'reward_scale': True,
                    'reward_scale_rate': 20,
                    'reward_sparse': False,
                    'reward_win': 200,
                    'save_replay_prefix': '',
                    'seed': 57018984,
                    'state_last_action': True,
                    'step_mul': 8},
    'epsilon_anneal_time': 50000,
    'epsilon_finish': 0.05,
    'epsilon_start': 1.0,
    'evaluate': False,
    'gamma': 0.99,
    'grad_norm_clip': 10,
    'hard_qs': False,
    'hyper_initialization_nonzeros': 0,
    'label': 'default_label',
    'learner': 'noise_q_learner',
    'learner_log_interval': 2000,
    'load_step': 0,
    'local_results_path': 'results',
    'log_interval': 2000,
    'lr': 0.0005,
    'mac': 'noise_mac',
    'mi_intrinsic': False,
    'mi_loss': 1,
    'mi_scaler': 0.1,
    'mixer': 'qmix',
    'mixing_embed_dim': 32,
    'name': 'noisemix_smac_parallel',
    'noise_bandit': False,
    'noise_bandit_epsilon': 0.2,
    'noise_bandit_lr': 0.1,
    'noise_dim': 2,
    'noise_embedding_dim': 32,
    'obs_agent_id': True,
    'obs_last_action': True,
    'optim_alpha': 0.99,
    'optim_eps': 1e-05,
    'recurrent_critic': False,
    'repeat_id': 1,
    'rnn_agg_size': 32,
    'rnn_discrim': False,
    'rnn_hidden_dim': 64,
    'runner': 'parallel',
    'runner_log_interval': 2000,
    'save_model': False,
    'save_model_interval': 5000,
    'save_replay': False,
    'seed': 57018984,
    'skip_connections': False,
    't_max': 10000000,
    'target_update_interval': 200,
    'test_greedy': True,
    'test_interval': 10000,
    'test_nepisode': 32,
    'use_cuda': True,
    'use_tensorboard': False}

Process Process-1:
Traceback (most recent call last):
  File "/home/me/miniconda3/envs/sc/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap
    self.run()
  File "/home/me/miniconda3/envs/sc/lib/python3.7/multiprocessing/process.py", line 99, in run
    self._target(*self._args, **self._kwargs)
  File "/home/me/Projects/MAVEN/maven_code/src/runners/parallel_runner.py", line 311, in env_worker
    env = env_fn.x()
  File "/home/me/Projects/MAVEN/maven_code/src/envs/__init__.py", line 7, in env_fn
    return env(**kwargs)
  File "/home/me/Projects/MAVEN/maven_code/src/envs/starcraft2/starcraft2.py", line 77, in __init__
    "map {} not in map registry! please add.".format(self.map_name)
AssertionError: map corridor not in map registry! please add.
Process Process-2:
Traceback (most recent call last):
  File "/home/me/miniconda3/envs/sc/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap
    self.run()
  File "/home/me/miniconda3/envs/sc/lib/python3.7/multiprocessing/process.py", line 99, in run
    self._target(*self._args, **self._kwargs)
  File "/home/me/Projects/MAVEN/maven_code/src/runners/parallel_runner.py", line 311, in env_worker
    env = env_fn.x()
  File "/home/me/Projects/MAVEN/maven_code/src/envs/__init__.py", line 7, in env_fn
    return env(**kwargs)
  File "/home/me/Projects/MAVEN/maven_code/src/envs/starcraft2/starcraft2.py", line 77, in __init__
    "map {} not in map registry! please add.".format(self.map_name)
AssertionError: map corridor not in map registry! please add.
Process Process-3:
Traceback (most recent call last):
  File "/home/me/miniconda3/envs/sc/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap
    self.run()
  File "/home/me/miniconda3/envs/sc/lib/python3.7/multiprocessing/process.py", line 99, in run
    self._target(*self._args, **self._kwargs)
  File "/home/me/Projects/MAVEN/maven_code/src/runners/parallel_runner.py", line 311, in env_worker
    env = env_fn.x()
  File "/home/me/Projects/MAVEN/maven_code/src/envs/__init__.py", line 7, in env_fn
    return env(**kwargs)
  File "/home/me/Projects/MAVEN/maven_code/src/envs/starcraft2/starcraft2.py", line 77, in __init__
    "map {} not in map registry! please add.".format(self.map_name)
AssertionError: map corridor not in map registry! please add.
Process Process-4:
Traceback (most recent call last):
  File "/home/me/miniconda3/envs/sc/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap
    self.run()
  File "/home/me/miniconda3/envs/sc/lib/python3.7/multiprocessing/process.py", line 99, in run
    self._target(*self._args, **self._kwargs)
  File "/home/me/Projects//MAVEN/maven_code/src/runners/parallel_runner.py", line 311, in env_worker
    env = env_fn.x()
  File "/home/me/Projects//MAVEN/maven_code/src/envs/__init__.py", line 7, in env_fn
    return env(**kwargs)
  File "/home/meProjects//MAVEN/maven_code/src/envs/starcraft2/starcraft2.py", line 77, in __init__
    "map {} not in map registry! please add.".format(self.map_name)
AssertionError: map corridor not in map registry! please add.
...
j3soon commented 4 years ago

Maybe try running the maps that are listed here?

Or you can copy the default SMAC maps into the StarCraftII directory, but you may need to modify map_params.py.
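
The assertion in the traceback above suggests the environment looks the map name up in a registry before launching. A minimal sketch of that lookup follows, assuming a plain dict registry; `MAP_PARAM_REGISTRY`, `get_map_params` and the field names are hypothetical stand-ins, with values modelled on SMAC's `corridor` entry, so the fields that this repository's map_params.py actually expects may differ.

```python
# Hypothetical stand-in for the map registry consulted by the StarCraft II env.
# The dict name, helper name and fields are assumptions for illustration only.
MAP_PARAM_REGISTRY = {}


def get_map_params(registry, map_name):
    # Mirrors the check that produced the AssertionError in the traceback.
    assert map_name in registry, \
        "map {} not in map registry! please add.".format(map_name)
    return registry[map_name]


# Before registering the map, the lookup fails just like in the log.
try:
    get_map_params(MAP_PARAM_REGISTRY, "corridor")
except AssertionError as err:
    print(err)

# Assumed entry, modelled on SMAC's description of corridor
# (6 Zealots vs 24 Zerglings); verify against your copy of the maps.
MAP_PARAM_REGISTRY["corridor"] = {
    "n_agents": 6,
    "n_enemies": 24,
    "limit": 400,
}

params = get_map_params(MAP_PARAM_REGISTRY, "corridor")
print(params["n_agents"], params["n_enemies"])
```

Registering the name only satisfies the lookup; the map file itself still has to be present in the StarCraftII maps directory.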

GoingMyWay commented 4 years ago

> Maybe try running the maps that are listed here?
>
> Or you can copy the default SMAC maps into the StarCraftII directory, but you may need to modify map_params.py.

I found it. Is 2_corridor the same as the original corridor from SMAC? I see that corridor was trained in your paper. So 2_corridor is a new env with 2 corridors, right?

Could you please provide the commands to reproduce the corridor results? I see that the parallel runner is used in the config file; how can I switch to the episode runner? Simply setting runner=episode returns an error.

Traceback (most recent calls WITHOUT Sacred internals):
  File "src/main.py", line 37, in my_main
    run(_run, _config, _log)
  File "/home/me/MAVEN/maven_code/src/run.py", line 48, in run
    run_sequential(args=args, logger=logger)
  File "/home/me/MAVEN/maven_code/src/run.py", line 79, in run_sequential
    runner = r_REGISTRY[args.runner](args=args, logger=logger)
  File "/home/me/MAVEN/maven_code/src/runners/episode_runner.py", line 13, in __init__
    assert self.batch_size == 1
AssertionError
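
The bare `assert self.batch_size == 1` gives no hint about which setting to change. A minimal pre-flight check is sketched below, assuming the runner's batch size is taken from `batch_size_run` (8 in the config printed earlier); `check_runner_config` is a hypothetical helper, not part of the repository.

```python
def check_runner_config(runner, batch_size_run):
    """Return a readable error message for an invalid combination, else None.

    Assumption: the episode runner derives its batch size from
    batch_size_run and requires it to be exactly 1.
    """
    if runner == "episode" and batch_size_run != 1:
        return ("runner=episode requires batch_size_run=1, "
                "got batch_size_run={}".format(batch_size_run))
    return None


# The failing combination: runner overridden, batch_size_run left at 8.
print(check_runner_config("episode", 8))

# Overriding both settings passes the check.
print(check_runner_config("episode", 1))
```
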
GoingMyWay commented 4 years ago

I tried to reproduce the corridor result with:

python src/main.py --config=noisemix_episode --env-config=sc2 with env_args.map_name=corridor runner=episode batch_size_run=1 t_max=10000000

But I got:

Traceback (most recent calls WITHOUT Sacred internals):
  File "src/main.py", line 37, in my_main
    run(_run, _config, _log)
  File "/home/me/MAVEN/maven_code/src/run.py", line 48, in run
    run_sequential(args=args, logger=logger)
  File "/home/me/MAVEN/maven_code/src/run.py", line 79, in run_sequential
    runner = r_REGISTRY[args.runner](args=args, logger=logger)
  File "/home/me/MAVEN/maven_code/src/runners/episode_runner.py", line 15, in __init__
    self.env = env_REGISTRY[self.args.env](env_args=self.args.env_args, args=args)
  File "/home/me/MAVEN/maven_code/src/envs/__init__.py", line 7, in env_fn
    return env(**kwargs)
  File "/home/me/MAVEN/maven_code/src/envs/starcraft2/starcraft2.py", line 142, in __init__
    self._launch()
  File "/home/me/MAVEN/maven_code/src/envs/starcraft2/starcraft2.py", line 206, in _launch
    self._sc2_proc = self._run_config.start(version=self.game_version, window_size=self.window_size)
  File "/home/me/miniconda3/lib/python3.7/site-packages/pysc2/run_configs/platforms.py", line 205, in start
    want_rgb=want_rgb, extra_args=extra_args, **kwargs)
  File "/home/me/miniconda3/lib/python3.7/site-packages/pysc2/run_configs/platforms.py", line 88, in start
    self, exec_path=exec_path, version=self.version, **kwargs)
TypeError: type object got multiple values for keyword argument 'version'

How can I resolve it? I ran export SC2PATH=/home/me/StarCraftII and changed MAVEN/maven_code/src/envs/starcraft2/starcraft2.py so that it reads the StarCraft II path from SC2PATH (/home/me/StarCraftII).
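
This kind of `TypeError` generally means the same keyword reached a call twice: once explicitly and once through `**kwargs`. The sketch below reproduces the pattern with a hypothetical `Launcher` class; it is not pysc2's code, only an illustration of how a version stored on the object can collide with a `version=` passed again to `start()`.

```python
class Launcher:
    """Hypothetical launcher that already stores the game version."""

    def __init__(self, version=None):
        self.version = version

    def _spawn(self, exec_path=None, version=None, **kwargs):
        return {"exec_path": exec_path, "version": version, **kwargs}

    def start(self, **kwargs):
        # The stored version is forwarded explicitly. If the caller also put
        # "version" into kwargs, the keyword is supplied twice and Python
        # raises TypeError before _spawn runs.
        return self._spawn(exec_path="SC2", version=self.version, **kwargs)


launcher = Launcher(version="4.1.2")

# Passing the version a second time triggers the error.
try:
    launcher.start(version="4.1.2", window_size=(640, 480))
except TypeError as err:
    print("TypeError:", err)

# Supplying the version in one place only avoids the collision.
print(launcher.start(window_size=(640, 480)))
```

If the installed pysc2 follows this pattern, passing the version in only one place would be the fix to try, but that is an assumption to check against the installed package.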

GoingMyWay commented 4 years ago

After using code from SMAC and updating some code, I can now run it.

@AnujMahajanOxf Dear sir, I see in your paper that corridor was run for 8M steps. Did it use the parallel runner or the episode runner?

AnujMahajanOxf commented 4 years ago

Hi,

Yes, corridor corresponds to micro_corridor; 2_corridors is the new map we introduce to test adaptation for MAVEN.

We used parallel_runner; the corresponding hyper-parameter is batch_size_run, which was set to 1 for the experiments.

GoingMyWay commented 4 years ago

> Hi,
>
> Yes, corridor corresponds to micro_corridor; 2_corridors is the new map we introduce to test adaptation for MAVEN.
>
> We used parallel_runner; the corresponding hyper-parameter is batch_size_run, which was set to 1 for the experiments.

Thanks. With batch_size_run=1, is your parallel_runner exactly the same as the episode runner?

I used a 2080 Ti GPU and found that running 8M steps will take 3 days, whereas according to your paper it takes 36 hours. How can I speed it up?
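
To put the two timings on the same scale, the figures quoted above can be converted to environment steps per second. This is plain arithmetic on the numbers in this comment (8M steps, 3 days, 36 hours); `steps_per_second` is a helper named here for illustration.

```python
def steps_per_second(total_steps, hours):
    """Average throughput for a run of total_steps lasting the given hours."""
    return total_steps / (hours * 3600)


TOTAL_STEPS = 8_000_000

mine = steps_per_second(TOTAL_STEPS, 72)   # 3 days on the 2080 Ti
paper = steps_per_second(TOTAL_STEPS, 36)  # 36 hours, as quoted from the paper

print(round(mine, 1))          # about 30.9 steps/s
print(round(paper, 1))         # about 61.7 steps/s
print(round(paper / mine, 2))  # 2.0, i.e. the gap to close is a factor of two
```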