ValueError: Expected parameter probs (Tensor of shape (1, 100)) of distribution Categorical(probs: torch.Size([1, 100])) to satisfy the constraint Simplex(), but found invalid values:

lpj12121 commented 4 months ago

When I tried to train the lower-level agent on a new computer, the following problem occurred:

Traceback (most recent call last):
  File "D:\hrl-acra-main\main.py", line 29, in <module>
    run(config)
  File "D:\hrl-acra-main\main.py", line 16, in run
    scenario.run()
  File "D:\hrl-acra-main\base\scenario.py", line 85, in run
    self.ready()
  File "D:\hrl-acra-main\base\scenario.py", line 73, in ready
    self.solver.learn(self.env, num_epochs=self.config.num_train_epochs)
  File "D:\hrl-acra-main\solver\learning\rl_solver.py", line 201, in learn
    solution = self.learn_with_instance(instance, revenue2cost_list, epoch_logprobs)
  File "D:\hrl-acra-main\solver\learning\rl_solver.py", line 154, in learn_with_instance
    action, action_logprob = self.select_action(tensor_sub_obs, mask=mask, sample=True)
  File "D:\hrl-acra-main\solver\learning\rl_solver.py", line 381, in select_action
    candicate_action_dist = Categorical(probs=candicate_action_probs)
  File "D:\software\anaconda3\envs\fuxian\lib\site-packages\torch\distributions\categorical.py", line 66, in __init__
    super(Categorical, self).__init__(batch_shape, validate_args=validate_args)
  File "D:\software\anaconda3\envs\fuxian\lib\site-packages\torch\distributions\distribution.py", line 56, in __init__
    raise ValueError(
ValueError: Expected parameter probs (Tensor of shape (1, 100)) of distribution Categorical(probs: torch.Size([1, 100])) to satisfy the constraint Simplex(), but found invalid values:
tensor([[nan, nan, nan, nan, nan, nan, nan, nan, nan, nan,
         nan, nan, nan, nan, nan, nan, nan, nan, nan, nan,
         nan, nan, nan, nan, nan, nan, nan, nan, nan, nan,
         nan, nan, nan, nan, nan, nan, nan, nan, nan, nan,
         nan, nan, nan, nan, nan, nan, nan, nan, nan, nan,
         nan, nan, nan, nan, nan, nan, nan, nan, nan, nan,
         nan, nan, nan, nan, nan, nan, nan, nan, nan, nan,
         nan, nan, nan, nan, nan, nan, nan, nan, nan, nan,
         nan, nan, nan, nan, nan, nan, nan, nan, nan, nan,
         nan, nan, nan, nan, nan, nan, nan, nan, nan, nan]], device='cuda:0')

I tried to fix the error by adding the call "torch.distributions.constraints.simplex.check(candicate_actions_probs)" right after "candicate_actions_probs = F.softmax(candicate_actions_logits, dim=-1)" in solver/learning/rl_solver.py.

What puzzles me is that the original code raises this error on most runs, while after adding the check the error still occurs, only much less often. I do not know how to proceed from here.
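For reference, `constraints.simplex.check` only returns a boolean tensor saying whether each row is a valid probability vector. It does not modify its input, so on its own it cannot repair NaN probabilities. The sketch below shows one way to find out whether the NaN is already present before the softmax. The helper `checked_softmax` and the tensor name `candidate_action_logits` are hypothetical and not part of hrl-acra. That the logits are the source of the NaN is an assumption, not something confirmed in this thread.

```python
import torch
import torch.nn.functional as F
from torch.distributions import Categorical, constraints


def checked_softmax(logits: torch.Tensor, name: str = "logits") -> torch.Tensor:
    """Hypothetical helper: softmax that fails early and names the bad tensor."""
    # NaN or +inf in the logits always propagates to NaN probabilities.
    if torch.isnan(logits).any() or torch.isposinf(logits).any():
        raise FloatingPointError(f"{name} contains NaN or +inf before softmax")
    # A row made only of -inf (every action masked out) also yields NaN.
    if torch.isneginf(logits).all(dim=-1).any():
        raise FloatingPointError(f"{name} has a row with every action masked out")
    return F.softmax(logits, dim=-1)


# simplex.check reports validity only; the input tensor is left unchanged.
bad_probs = torch.full((1, 4), float("nan"))
is_valid = constraints.simplex.check(bad_probs)  # boolean tensor, one entry per row

# Finite logits with one masked action (-inf) pass the check and can be sampled.
good_logits = torch.tensor([[0.5, -1.0, 2.0, float("-inf")]])
probs = checked_softmax(good_logits, name="candidate_action_logits")
dist = Categorical(probs=probs)
action = dist.sample()
```

If a helper like this raises on the logits, the problem lies upstream of the softmax (in the observations, the mask, or the network parameters) rather than in `Categorical` itself.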

lpj12121 commented 4 months ago

The same error may also occur when training the upper-level agent. I will try training the upper-level agent once this run finishes.

lpj12121 commented 4 months ago

Sometimes this error occurs instead:

  File "D:\hrl-acra-main\main.py", line 29, in <module>
    run(config)
  File "D:\hrl-acra-main\main.py", line 16, in run
    scenario.run()
  File "D:\hrl-acra-main\base\scenario.py", line 85, in run
    self.ready()
  File "D:\hrl-acra-main\base\scenario.py", line 73, in ready
    self.solver.learn(self.env, num_epochs=self.config.num_train_epochs)
    loss = self.update()
  File "D:\hrl-acra-main\solver\learning\rl_solver.py", line 645, in update
    values, action_logprobs, dist_entropy, other = self.evaluate_actions(observations, actions, masks=masks, return_others=True)
  File "D:\hrl-acra-main\solver\learning\rl_solver.py", line 411, in evaluate_actions
    dist = Categorical(actions_probs)
  File "D:\software\anaconda3\envs\lpj-pytorch-py310\lib\site-packages\torch\distributions\categorical.py", line 66, in __init__
    super(Categorical, self).__init__(batch_shape, validate_args=validate_args)
  File "D:\software\anaconda3\envs\lpj-pytorch-py310\lib\site-packages\torch\distributions\distribution.py", line 56, in __init__
    raise ValueError(
ValueError: Expected parameter probs (Tensor of shape (256, 100)) of distribution Categorical(probs: torch.Size([256, 100])) to satisfy the constraint Simplex(), but found invalid values:
tensor([[nan, nan, nan, ..., nan, nan, nan],
        [nan, nan, nan, ..., nan, nan, nan],
        [nan, nan, nan, ..., nan, nan, nan],
        ...,
        [nan, nan, nan, ..., nan, nan, nan],
        [nan, nan, nan, ..., nan, nan, nan],
        [nan, nan, nan, ..., nan, nan, nan]], device='cuda:0', grad_fn=)

This is the same problem as above, and the same workaround applies. Could it be caused by the versions of the packages I installed?
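Whether package versions play a role is not established in this thread. When every row of a (256, 100) batch is NaN inside `update`, one common cause in general is that the network parameters became non-finite after an earlier optimizer step. The following is a minimal sketch, under that assumption, of guarding the update. The names `safe_step` and `first_nonfinite_parameter` and the default `max_grad_norm=0.5` are hypothetical and do not come from hrl-acra.

```python
import math

import torch
from torch import nn


def safe_step(model: nn.Module, optimizer: torch.optim.Optimizer,
              loss: torch.Tensor, max_grad_norm: float = 0.5) -> bool:
    """Hypothetical helper: update only if loss and gradients are finite.

    Returns True when the optimizer step was applied.
    """
    optimizer.zero_grad()
    if not math.isfinite(float(loss)):
        return False
    loss.backward()
    # clip_grad_norm_ returns the total gradient norm measured before clipping.
    total_norm = nn.utils.clip_grad_norm_(model.parameters(), max_grad_norm)
    if not math.isfinite(float(total_norm)):
        optimizer.zero_grad()  # discard the non-finite gradients
        return False
    optimizer.step()
    return True


def first_nonfinite_parameter(model: nn.Module):
    """Return the name of the first parameter holding NaN or inf, else None."""
    for name, param in model.named_parameters():
        if not torch.isfinite(param).all():
            return name
    return None


# Toy usage on CPU with a stand-in model (not the hrl-acra policy network).
torch.manual_seed(0)
model = nn.Linear(4, 3)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
x = torch.randn(8, 4)

applied = safe_step(model, optimizer, model(x).pow(2).mean())
skipped = safe_step(model, optimizer, model(x).sum() * float("nan"))
bad_name = first_nonfinite_parameter(model)
```

Calling `first_nonfinite_parameter` right after each update would show whether the parameters go bad before the failing `Categorical` call, which helps separate a training instability from an environment or version issue.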

neflibata688 commented 3 months ago

Do you have WeChat? We could add each other there to discuss this.

GeminiLight commented 3 months ago

Sorry for the delayed response. I'm currently busy with an upcoming event deadline. We can communicate more efficiently on WeChat. My WeChat ID is wtfly2018.

GeminiLight / hrl-acra

ValueError: Expected parameter probs (Tensor of shape (1, 100)) of distribution Categorical(probs: torch.Size([1, 100])) to satisfy the constraint Simplex(), but found invalid values: #4