YiqinYang / ICQ

Code accompanying the paper "Believe What You See: Implicit Constraint Approach for Offline Multi-Agent Reinforcement Learning" (NeurIPS 2021 Spotlight, https://arxiv.org/abs/2106.03400)

ICQ-MA hangs indefinitely #7

Closed: jcformanek closed this issue 2 years ago

jcformanek commented 2 years ago

Good day, thank you for open sourcing your code. I am very excited to experiment with it.

I ran the following:

python3 src/main.py --config=offpg_smac --env-config=sc2 with env_args.map_name=3s_vs_3z

The code seemed to start but then hung indefinitely. I would really appreciate it if you could help me get the code running.

This is the only output I get in the terminal:

INFO - pymarl - Started run with ID "2"
INFO - my_main - Experiment Parameters:
INFO - my_main - 

{   'action_selector': 'multinomial',
    'agent': 'rnn',
    'agent_output_type': 'pi_logits',
    'batch_size': 16,
    'batch_size_run': 10,
    'buffer_cpu_only': True,
    'buffer_size': 32,
    'checkpoint_path': '',
    'critic_baseline_fn': 'coma',
    'critic_lr': 0.0001,
    'critic_q_fn': 'coma',
    'critic_train_mode': 'seq',
    'critic_train_reps': 1,
    'env': 'sc2',
    'env_args': {   'continuing_episode': False,
                    'debug': False,
                    'difficulty': '7',
                    'game_version': None,
                    'map_name': '3s_vs_3z',
                    'move_amount': 2,
                    'obs_all_health': True,
                    'obs_instead_of_state': False,
                    'obs_last_action': False,
                    'obs_own_health': True,
                    'obs_pathing_grid': False,
                    'obs_terrain_height': False,
                    'obs_timestep_number': False,
                    'replay_dir': '',
                    'replay_prefix': '',
                    'reward_death_value': 10,
                    'reward_defeat': 0,
                    'reward_negative_scale': 0.5,
                    'reward_only_positive': True,
                    'reward_scale': True,
                    'reward_scale_rate': 20,
                    'reward_sparse': False,
                    'reward_win': 200,
                    'seed': 972368347,
                    'state_last_action': False,
                    'state_timestep_number': False,
                    'step_mul': 8},
    'epsilon_anneal_time': 500000,
    'epsilon_finish': 0.05,
    'epsilon_start': 0.5,
    'evaluate': False,
    'gamma': 0.99,
    'grad_norm_clip': 20,
    'label': 'default_label',
    'learner': 'offpg_learner',
    'learner_log_interval': 20000,
    'load_step': 0,
    'local_results_path': 'results',
    'log_interval': 20000,
    'lr': 0.0005,
    'mac': 'basic_mac',
    'mask_before_softmax': False,
    'mixing_embed_dim': 32,
    'name': 'offpg_smac',
    'obs_agent_id': True,
    'obs_last_action': True,
    'off_batch_size': 32,
    'off_buffer_size': 70000,
    'optim_alpha': 0.99,
    'optim_eps': 1e-05,
    'q_nstep': 0,
    'repeat_id': 1,
    'rnn_hidden_dim': 64,
    'runner': 'parallel',
    'runner_log_interval': 20000,
    'save_model': True,
    'save_model_interval': 1000000,
    'save_replay': False,
    'seed': 972368347,
    'step': 5,
    't_max': 10050000,
    'target_update_interval': 600,
    'tb_lambda': 0.93,
    'td_lambda': 0.8,
    'test_greedy': False,
    'test_interval': 20000,
    'test_nepisode': 20,
    'use_cuda': True,
    'use_tensorboard': True}

When I interrupt the terminal with ctrl+c I get the following:

Process Process-10:
Process Process-3:
Process Process-5:
Process Process-7:
Process Process-2:
Process Process-8:
Process Process-9:
Process Process-1:
Process Process-6:
Process Process-4:
Traceback (most recent call last):
  File "/home/claude/miniconda3/envs/icq/lib/python3.6/multiprocessing/process.py", line 258, in _bootstrap
    self.run()
  File "/home/claude/miniconda3/envs/icq/lib/python3.6/multiprocessing/process.py", line 93, in run
    self._target(*self._args, **self._kwargs)
  File "/home/claude/Documents/ICQ/ICQ-MA/src/runners/parallel_runner.py", line 228, in env_worker
    cmd, data = remote.recv()
  File "/home/claude/miniconda3/envs/icq/lib/python3.6/multiprocessing/connection.py", line 250, in recv
    buf = self._recv_bytes()
  File "/home/claude/miniconda3/envs/icq/lib/python3.6/multiprocessing/connection.py", line 407, in _recv_bytes
    buf = self._recv(4)
  File "/home/claude/miniconda3/envs/icq/lib/python3.6/multiprocessing/connection.py", line 379, in _recv
    chunk = read(handle, remaining)
KeyboardInterrupt

Traceback (most recent call last):
  File "src/main.py", line 96, in <module>
    ex.run_commandline(params)
  File "/home/claude/miniconda3/envs/icq/lib/python3.6/site-packages/sacred/experiment.py", line 250, in run_commandline
    return self.run(cmd_name, config_updates, named_configs, {}, args)
  File "/home/claude/miniconda3/envs/icq/lib/python3.6/site-packages/sacred/experiment.py", line 199, in run
    run()
  File "/home/claude/miniconda3/envs/icq/lib/python3.6/site-packages/sacred/run.py", line 229, in __call__
    self.result = self.main_function(*args)
  File "/home/claude/miniconda3/envs/icq/lib/python3.6/site-packages/sacred/config/captured_function.py", line 48, in captured_function
    result = wrapped(*args, **kwargs)
  File "src/main.py", line 34, in my_main
    run(_run, config, _log)
  File "/home/claude/Documents/ICQ/ICQ-MA/src/run.py", line 48, in run
    run_sequential(args=args, logger=logger)
  File "/home/claude/Documents/ICQ/ICQ-MA/src/run.py", line 101, in run_sequential
    learner.cuda()
  File "/home/claude/Documents/ICQ/ICQ-MA/src/learners/offpg_learner.py", line 195, in cuda
    self.mac.cuda()
  File "/home/claude/Documents/ICQ/ICQ-MA/src/controllers/basic_controller.py", line 72, in cuda
    self.agent.cuda()
  File "/home/claude/miniconda3/envs/icq/lib/python3.6/site-packages/torch/nn/modules/module.py", line 265, in cuda
    return self._apply(lambda t: t.cuda(device))
  File "/home/claude/miniconda3/envs/icq/lib/python3.6/site-packages/torch/nn/modules/module.py", line 193, in _apply
    module._apply(fn)
  File "/home/claude/miniconda3/envs/icq/lib/python3.6/site-packages/torch/nn/modules/module.py", line 199, in _apply
    param.data = fn(param.data)
  File "/home/claude/miniconda3/envs/icq/lib/python3.6/site-packages/torch/nn/modules/module.py", line 265, in <lambda>
    return self._apply(lambda t: t.cuda(device))
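The main-process traceback above stops inside `t.cuda(device)`, while the worker processes are simply waiting on `remote.recv()`. This suggests the stall happens during CUDA initialisation rather than in the environment workers. A minimal sketch for checking that in isolation follows. It runs a one-line CUDA transfer in a child interpreter with a timeout. The helper name `probe`, the constant `CUDA_PROBE`, and the 60-second limit are illustrative assumptions and are not part of the ICQ code.

```python
import subprocess
import sys


def probe(code, timeout_s):
    """Run `code` in a child Python interpreter.

    Returns "ok" if it exits cleanly, "error" if it exits with a
    non-zero status, and "timeout" if it does not finish in time.
    """
    try:
        result = subprocess.run(
            [sys.executable, "-c", code],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        return "timeout"
    return "ok" if result.returncode == 0 else "error"


# Hypothetical probe: move one small tensor to the GPU, as `.cuda()` does.
CUDA_PROBE = "import torch; torch.zeros(1).cuda()"

# Example call (not executed here):
#   probe(CUDA_PROBE, timeout_s=60)
```

A result of "timeout" would point to the PyTorch/CUDA installation rather than the ICQ code, and "error" would indicate that PyTorch is missing or that CUDA raised an exception.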

YiqinYang commented 2 years ago

Dear jcformanek:

Can you try setting 'use_cuda' to False? The code I uploaded may be a CPU-only version.
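One way to apply this without editing the config file is a command-line override. The command in the original report already uses Sacred's `with key=value` syntax for `env_args.map_name`, so it is assumed here that `use_cuda=False` can be passed the same way. The sketch below only builds the command string. The helper `add_override` is hypothetical and not part of the repository.

```python
import shlex

# Command from the original report.
BASE_CMD = (
    "python3 src/main.py --config=offpg_smac --env-config=sc2 "
    "with env_args.map_name=3s_vs_3z"
)


def add_override(cmd, key, value):
    """Append a Sacred-style `key=value` config update to a command line."""
    parts = shlex.split(cmd)
    if "with" not in parts:
        # Sacred expects config updates after the `with` keyword.
        parts.append("with")
    parts.append(f"{key}={value}")
    return " ".join(shlex.quote(p) for p in parts)


print(add_override(BASE_CMD, "use_cuda", False))
```

The printed command is the original one with `use_cuda=False` appended. If the override is not picked up, changing `use_cuda` in the config file directly is the fallback.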

I will fix this bug later, since someone else also reported it recently.

Also, the dataset I uploaded is a toy example. If you want to reproduce the results, I can send you the full dataset, or you can generate the SMAC dataset yourself.

jcformanek commented 2 years ago

Good day,

Many thanks, that seems to have worked. The code is running now. I will reach out when I need the dataset; for now I am just stepping through the code to understand it properly.

While you are fixing the bugs, I would also recommend checking your requirements.txt file. There were some dependency issues I needed to resolve during my install.

Thanks for your rapid response.