oxwhirl / pymarl

Python Multi-Agent Reinforcement Learning framework
Apache License 2.0

running basic scenario in readme #23

Closed (esquires closed this issue 5 years ago)

esquires commented 5 years ago

When I run the training scenario from the README:

python3 src/main.py --config=qmix_smac --env-config=sc2 with env_args.map_name=2s3z

I get the following output:

Traceback (most recent calls WITHOUT Sacred internals):                                                                                                                                          
  File "src/main.py", line 34, in my_main                                                                                                                                                        
    run(_run, _config, _log)                                                                                                                                                                     
  File "/home/esquires/repos/rl/rllib/pymarl/src/run.py", line 48, in run                                                                                                                        
    run_sequential(args=args, logger=logger)                                                                                                                                                     
  File "/home/esquires/repos/rl/rllib/pymarl/src/run.py", line 179, in run_sequential                                                                                                            
    learner.train(episode_sample, runner.t_env, episode)                                                                                                                                         
  File "/home/esquires/repos/rl/rllib/pymarl/src/learners/q_learner.py", line 100, in train                                                                                                      
    loss.backward()                                                                                                                                                                              
  File "/home/esquires/venvs/rllib/lib/python3.6/site-packages/torch/tensor.py", line 107, in backward                                                                                           
    torch.autograd.backward(self, gradient, retain_graph, create_graph)                                                                                                                          
  File "/home/esquires/venvs/rllib/lib/python3.6/site-packages/torch/autograd/__init__.py", line 93, in backward                                                                                 
    allow_unreachable=True)  # allow_unreachable flag                                                                                                                                            
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [32, 94, 5, 11]], which is output 0 of SliceBackward, is at version 1; expected version 0 instead. Hint: enable anomaly detection to find the operation that failed to compute its gradient, with torch.autograd.set_detect_anomaly(True).

After setting torch.autograd.set_detect_anomaly(True) in the initializer of the QLearner object, I get the following:

sys:1: RuntimeWarning: Traceback of forward call that caused the error:                                                                                                                          
  File "src/main.py", line 88, in <module>                                                                                                                                                       
    ex.run_commandline(params)                                                                                                                                                                   
  File "/home/esquires/repos/rl/rllib/sacred/sacred/experiment.py", line 250, in run_commandline                                                                                                 
    return self.run(cmd_name, config_updates, named_configs, {}, args)                                                                                                                           
  File "/home/esquires/repos/rl/rllib/sacred/sacred/experiment.py", line 199, in run                                                                                                             
    run()                                                                                                                                                                                        
  File "/home/esquires/repos/rl/rllib/sacred/sacred/run.py", line 229, in __call__                                                                                                               
    self.result = self.main_function(*args)                                                                                                                                                      
  File "/home/esquires/repos/rl/rllib/sacred/sacred/config/captured_function.py", line 48, in captured_function                                                                                  
    result = wrapped(*args, **kwargs)                                                                                                                                                            
  File "src/main.py", line 34, in my_main                                                                                                                                                        
    run(_run, _config, _log)                                                                                                                                                                     
  File "/home/esquires/repos/rl/rllib/pymarl/src/run.py", line 48, in run                                                                                                                        
    run_sequential(args=args, logger=logger)                                                                                                                                                     
  File "/home/esquires/repos/rl/rllib/pymarl/src/run.py", line 179, in run_sequential                                                                                                            
    learner.train(episode_sample, runner.t_env, episode)                                                                                                                                         
  File "/home/esquires/repos/rl/rllib/pymarl/src/learners/q_learner.py", line 56, in train                                                                                                       
    chosen_action_qvals = th.gather(mac_out[:, :-1], dim=3, index=actions).squeeze(3)  # Remove the last dim                                                                                     

[ERROR 17:35:55] pymarl Failed after 0:00:17!                                                                                                                                                    
Traceback (most recent calls WITHOUT Sacred internals):                                                                                                                                          
  File "src/main.py", line 34, in my_main                                                                                                                                                        
    run(_run, _config, _log)                                                                                                                                                                     
  File "/home/esquires/repos/rl/rllib/pymarl/src/run.py", line 48, in run                                                                                                                        
    run_sequential(args=args, logger=logger)                                                                                                                                                     
  File "/home/esquires/repos/rl/rllib/pymarl/src/run.py", line 179, in run_sequential                                                                                                            
    learner.train(episode_sample, runner.t_env, episode)                                                                                                                                         
  File "/home/esquires/repos/rl/rllib/pymarl/src/learners/q_learner.py", line 101, in train                                                                                                      
    loss.backward()                                                                                                                                                                              
  File "/home/esquires/venvs/rllib/lib/python3.6/site-packages/torch/tensor.py", line 107, in backward                                                                                           
    torch.autograd.backward(self, gradient, retain_graph, create_graph)                                                                                                                          
  File "/home/esquires/venvs/rllib/lib/python3.6/site-packages/torch/autograd/__init__.py", line 93, in backward                                                                                 
    allow_unreachable=True)  # allow_unreachable flag                                                                                                                                            
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [32, 71, 5, 11]], which is output 0 of SliceBackward, is at version 1; expected version 0 instead. Hint: the backtrace further above shows the operation that failed to compute its gradient. The variable in question was changed in there or anywhere later. Good luck!

I then broke up the line so that the gather and squeeze calls are on separate lines, and the traceback points at the gather line.
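For context, this class of autograd error can be reproduced outside pymarl. The sketch below is a generic illustration and not the pymarl code path: the tensors and the function names `inplace_after_save` and `out_of_place` are made up for the example, and it assumes a PyTorch version that performs the tensor version check reported in the traceback above. It only shows that modifying a tensor in place after autograd has saved it for the backward pass triggers the error, while an out-of-place operation does not.

```python
import torch


def inplace_after_save():
    """Return the message autograd raises when a saved tensor is mutated in place."""
    x = torch.ones(3, requires_grad=True)
    y = x.exp()   # exp() saves its output y for use in the backward pass
    y.add_(1)     # in-place edit bumps y's version counter (0 -> 1)
    try:
        y.sum().backward()  # backward of exp() finds y at an unexpected version
    except RuntimeError as err:
        return str(err)
    return None


def out_of_place():
    """Same computation, but the addition creates a new tensor instead."""
    x = torch.ones(3, requires_grad=True)
    y = x.exp()
    z = y + 1     # y itself is left untouched, so its saved version is still valid
    z.sum().backward()
    return x.grad


if __name__ == "__main__":
    print(inplace_after_save())
    print(out_of_place())
```

Whether the in-place write in the real code sits before or after the gather call is not established here; anomaly detection reports the forward operation whose gradient failed, and the mutation may have happened there or anywhere later.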

I am on 33070bc for pymarl, cdfff9b for smac, and version 1.1.0 for pytorch.

Should I be running a different command than what is in the README?

Thanks.

Greg-Farquhar commented 5 years ago

You should be good with pytorch version 0.4.1. You can check requirements.txt for all recommended package versions. If I get some time, I'll look into what's changed and see if pytorch 1.x support isn't too hard.
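A quick way to spot this kind of mismatch is to compare installed packages against the pinned versions. This is only a sketch using the standard library: the helper names `parse_pin`, `installed` and `check_pins` are hypothetical, and it assumes the pins are written as `name==version` lines (for example `torch==0.4.1`), which may not match the exact contents of requirements.txt.

```python
from importlib import metadata


def parse_pin(line):
    """Split a 'name==version' requirement line; return None for any other line."""
    line = line.split("#", 1)[0].strip()  # drop trailing comments
    if "==" not in line:
        return None
    name, version = line.split("==", 1)
    return name.strip(), version.strip()


def installed(name):
    """Return the installed version of a package, or None if it is not installed."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def check_pins(lines, installed_version=installed):
    """Return {package: (pinned, found)} for every pin that does not match."""
    mismatches = {}
    for line in lines:
        pin = parse_pin(line)
        if pin is None:
            continue
        name, wanted = pin
        found = installed_version(name)
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    # Assumed path; run from the repository root.
    with open("requirements.txt") as fh:
        for name, (wanted, found) in check_pins(fh).items():
            print(f"{name}: pinned {wanted}, installed {found}")
```

With the versions from this thread, `check_pins(["torch==0.4.1"], installed_version={"torch": "1.1.0"}.get)` reports `torch` as pinned to 0.4.1 but installed at 1.1.0.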

esquires commented 5 years ago

Thanks! That seems to have fixed it. Feel free to re-open if you want to investigate pytorch 1.x, but this resolves what I was having difficulty with.