hijkzzz / pymarl2

Fine-tuned MARL algorithms on SMAC (100% win rates on most scenarios)
https://iclr-blogposts.github.io/2023/blog/2023/riit/
Apache License 2.0

VMIX algorithm reports NaN #13

Closed xihuai18 closed 2 years ago

xihuai18 commented 2 years ago
Traceback (most recent call last):
  File "/home/xhwang/anaconda3/envs/pymarl/lib/python3.8/site-packages/sacred/experiment.py", line 312, in run_commandline                                    
    return self.run(                                                                                                                                          
  File "/home/xhwang/anaconda3/envs/pymarl/lib/python3.8/site-packages/sacred/experiment.py", line 276, in run                                                
    run()                                                                                                                                                     
  File "/home/xhwang/anaconda3/envs/pymarl/lib/python3.8/site-packages/sacred/run.py", line 238, in __call__                                                  
    self.result = self.main_function(*args)                                                                                                                   
  File "/home/xhwang/anaconda3/envs/pymarl/lib/python3.8/site-packages/sacred/config/captured_function.py", line 42, in captured_function                     
    result = wrapped(*args, **kwargs)                                                                                                                         
  File "src/main.py", line 38, in my_main                                                                                                                     
    run_REGISTRY[_config['run']](_run, config, _log)                                                                                                          
  File "/NAS2020/Workspaces/DRLGroup/xhwang/Lab/SCII/pymarl2/src/run/run.py", line 54, in run                                                                 
    run_sequential(args=args, logger=logger)                                                                                                                  
  File "/NAS2020/Workspaces/DRLGroup/xhwang/Lab/SCII/pymarl2/src/run/run.py", line 195, in run_sequential                                                     
    learner.train(episode_sample, runner.t_env, episode)                                                                                                      
  File "/NAS2020/Workspaces/DRLGroup/xhwang/Lab/SCII/pymarl2/src/learners/policy_gradient_v2.py", line 58, in train                      
    advantages, td_error, targets_taken, log_pi_taken, entropy = self._calculate_advs(batch, rewards, terminated, actions, avail_actions,                     
  File "/NAS2020/Workspaces/DRLGroup/xhwang/Lab/SCII/pymarl2/src/learners/policy_gradient_v2.py", line 115, in _calculate_advs                                
    entropy = categorical_entropy(pi).reshape(-1)  #[bs, t, n_agents, 1]                                                                                      
  File "/NAS2020/Workspaces/DRLGroup/xhwang/Lab/SCII/pymarl2/src/components/action_selectors.py", line 110, in categorical_entropy                            
    return Categorical(probs=probs).entropy()                                                                                                                 
  File "/home/xhwang/anaconda3/envs/pymarl/lib/python3.8/site-packages/torch/distributions/categorical.py", line 64, in __init__                              
    super(Categorical, self).__init__(batch_shape, validate_args=validate_args)                                                  
  File "/home/xhwang/anaconda3/envs/pymarl/lib/python3.8/site-packages/torch/distributions/distribution.py", line 55, in __init__
    raise ValueError(                                                                                                                                         
ValueError: Expected parameter probs (Tensor of shape (8, 54, 10, 18)) of distribution Categorical(probs: torch.Size([8, 54, 10, 18])) to satisfy the constraint Simplex(), but found invalid values:

The rest of the output is the tensor data, which I have not pasted. The problem is that it contains NaN values.
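
To narrow down where the NaN values sit, something like the following might help. It is only a sketch: `locate_nan_distributions` is a hypothetical helper, not part of pymarl2, and the toy tensor stands in for the real policy output of shape (8, 54, 10, 18), which I assume is laid out as [batch, time, agent, action].

```python
import torch


def locate_nan_distributions(probs: torch.Tensor) -> torch.Tensor:
    """Return one index row per action distribution (last dim) that contains a NaN.

    Hypothetical helper for debugging; not part of pymarl2.
    """
    has_nan = torch.isnan(probs).any(dim=-1)  # shape: probs.shape[:-1]
    return has_nan.nonzero(as_tuple=False)    # shape: [num_bad, probs.dim() - 1]


# Toy stand-in for the policy output, assumed layout [batch, time, agent, action].
probs = torch.softmax(torch.zeros(2, 3, 4, 5), dim=-1)
probs[1, 2, 0] = float("nan")  # inject one NaN distribution for illustration

nan_index = locate_nan_distributions(probs)
print(nan_index.tolist())  # each row is [batch, time, agent] of a bad distribution
```

Calling this on `pi` just before `categorical_entropy(pi)` would show whether the NaN is confined to particular timesteps or agents (for example padded steps) or spread across the whole batch.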

xihuai18 commented 2 years ago

The launch command was:

python3 src/main.py --config=vmix --env-config=sc2 with env_args.map_name=MMM2
hijkzzz commented 2 years ago

Fixed. It was most likely caused by a PyTorch upgrade.
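
The thread does not say which PyTorch change was involved. One plausible reading, offered here as an assumption, is that newer releases validate distribution parameters by default, so a NaN that used to pass silently now raises at construction time. A minimal sketch of the difference, using only `torch.distributions.Categorical` and its `validate_args` argument:

```python
import torch
from torch.distributions import Categorical

# Second row is an invalid (NaN) distribution, as in the reported tensor.
probs = torch.tensor([[0.5, 0.5], [float("nan"), float("nan")]])

# With validation off, construction succeeds and the NaN propagates into the entropy.
silent_entropy = Categorical(probs=probs, validate_args=False).entropy()

# With validation on, the same tensor is rejected with a ValueError.
try:
    Categorical(probs=probs, validate_args=True)
    rejected = False
except ValueError:
    rejected = True

print(silent_entropy, rejected)
```

Passing `validate_args=False` only hides the symptom. If this reading is right, the NaN in `probs` would still need to be removed at its source.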