Open shengqie opened 9 months ago
One more thing to add: after I converted the action space to a continuous action space, the algorithm raised the following error after several thousand iterations:
Failure # 1 (occurred at 2024-01-26_02-47-08)
Traceback (most recent call last):
  File "/home/user/miniconda3/envs/marllib/lib/python3.8/site-packages/ray/tune/trial_runner.py", line 890, in _process_trial
    results = self.trial_executor.fetch_result(trial)
  File "/home/user/miniconda3/envs/marllib/lib/python3.8/site-packages/ray/tune/ray_trial_executor.py", line 788, in fetch_result
    result = ray.get(trial_future[0], timeout=DEFAULT_GET_TIMEOUT)
  File "/home/user/miniconda3/envs/marllib/lib/python3.8/site-packages/ray/_private/client_mode_hook.py", line 105, in wrapper
    return func(*args, **kwargs)
  File "/home/user/miniconda3/envs/marllib/lib/python3.8/site-packages/ray/worker.py", line 1625, in get
    raise value.as_instanceof_cause()
ray.exceptions.RayTaskError(ValueError): ray::IPPOTrainer.train_buffered() (pid=1138520, ip=10.31.22.121, repr=IPPOTrainer)
  File "/home/user/miniconda3/envs/marllib/lib/python3.8/site-packages/ray/rllib/agents/ppo/ppo_torch_policy.py", line 46, in ppo_surrogate_loss
    curr_action_dist = dist_class(logits, model)
  File "/home/user/miniconda3/envs/marllib/lib/python3.8/site-packages/ray/rllib/models/torch/torch_action_dist.py", line 186, in __init__
    self.dist = torch.distributions.normal.Normal(mean, torch.exp(log_std))
  File "/home/user/miniconda3/envs/marllib/lib/python3.8/site-packages/torch/distributions/normal.py", line 50, in __init__
    super(Normal, self).__init__(batch_shape, validate_args=validate_args)
  File "/home/user/miniconda3/envs/marllib/lib/python3.8/site-packages/torch/distributions/distribution.py", line 53, in __init__
    raise ValueError("The parameter {} has invalid values".format(param))
ValueError: The parameter loc has invalid values
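
The final `ValueError` is raised when the `loc` (mean) passed to `Normal` fails validation, which typically means the policy network produced NaN or infinite outputs. One way to narrow down where such values first appear is to scan observations, rewards, and actions before they reach the model. The following is only a minimal sketch using the standard library: the helper names `iter_non_finite` and `check_batch` are hypothetical and not part of Ray, RLlib, or MARLlib, and it assumes the data has been converted to nested Python lists. For tensors, `torch.isfinite` serves the same purpose.

```python
import math
from typing import Any, Iterator, Tuple

Index = Tuple[int, ...]


def iter_non_finite(values: Any, path: Index = ()) -> Iterator[Tuple[Index, float]]:
    """Yield (index_path, value) for every NaN or +/-inf in a nested sequence."""
    if isinstance(values, (list, tuple)):
        for i, item in enumerate(values):
            yield from iter_non_finite(item, path + (i,))
    else:
        v = float(values)
        if not math.isfinite(v):
            yield path, v


def check_batch(name: str, values: Any) -> None:
    """Raise a descriptive error if any entry of `values` is NaN or infinite."""
    bad = list(iter_non_finite(values))
    if bad:
        first_index, first_value = bad[0]
        raise ValueError(
            f"{name} contains {len(bad)} non-finite value(s); "
            f"first is {first_value} at index {first_index}"
        )


# Hypothetical usage: call this on each batch inside the environment's step().
check_batch("obs", [[0.1, 0.2], [0.3, 0.4]])  # passes silently
```

If a check like this fires on the environment side, the invalid values originate in the observations or rewards; if it never fires, the NaN more likely comes from the optimization itself (for example, exploding gradients), which is a separate thing to investigate.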