Closed: lpj12121 closed this issue 4 months ago
There may be errors when training the upper-level agent. I will try training the upper-level agent once this run has finished.
Sometimes this error occurs:
File "D:\hrl-acra-main\main.py", line 29, in
tensor([[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan],
...,
[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan]], device='cuda:0',
grad_fn=
This is the same problem as above, and the solution is the same. Could it be caused by the versions of the packages I installed?
Do you have WeChat? We could add each other there to make communication easier.
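An all-NaN tensor that still carries a grad_fn, like the one printed above, is often the result of network weights that have already become NaN in an earlier update, although the thread does not confirm that this is the cause here. A minimal sketch for checking this after each optimizer step follows; the helper name non_finite_parameters is hypothetical and not part of hrl-acra.

```python
import torch
from torch import nn


def non_finite_parameters(model: nn.Module) -> list:
    """Return the names of parameters, or of their gradients, that hold NaN or Inf."""
    bad = []
    for name, param in model.named_parameters():
        if not torch.isfinite(param).all():
            bad.append(name)
        elif param.grad is not None and not torch.isfinite(param.grad).all():
            bad.append(name + ".grad")
    return bad


# Stand-in model for illustration only; in practice pass the policy network.
demo_model = nn.Linear(3, 2)
clean_report = non_finite_parameters(demo_model)
```

If the returned list is non-empty right after an update, lowering the learning rate or clipping gradients with torch.nn.utils.clip_grad_norm_ are the usual things to try, and torch.autograd.set_detect_anomaly(True) can help point at the backward operation that first produced the NaN.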
Sorry for the delayed response. I'm currently busy with an upcoming event deadline. We can communicate more efficiently on WeChat. My WeChat ID is wtfly2018.
When I tried to train the lower-level agent on a new computer, the following problem occurred:
Traceback (most recent call last):
File "D:\hrl-acra-main\main.py", line 29, in
run(config)
File "D:\hrl-acra-main\main.py", line 16, in run
scenario.run()
File "D:\hrl-acra-main\base\scenario.py", line 85, in run
self.ready()
File "D:\hrl-acra-main\base\scenario.py", line 73, in ready
self.solver.learn(self.env, num_epochs=self.config.num_train_epochs)
File "D:\hrl-acra-main\solver\learning\rl_solver.py", line 201, in learn
solution = self.learn_with_instance(instance, revenue2cost_list, epoch_logprobs)
File "D:\hrl-acra-main\solver\learning\rl_solver.py", line 154, in learn_with_instance
action, action_logprob = self.select_action(tensor_sub_obs, mask=mask, sample=True)
File "D:\hrl-acra-main\solver\learning\rl_solver.py", line 381, in select_action
candicate_action_dist = Categorical(probs=candicate_action_probs)
File "D:\software\anaconda3\envs\fuxian\lib\site-packages\torch\distributions\categorical.py", line 66, in __init__
super(Categorical, self).__init__(batch_shape, validate_args=validate_args)
File "D:\software\anaconda3\envs\fuxian\lib\site-packages\torch\distributions\distribution.py", line 56, in __init__
raise ValueError(
ValueError: Expected parameter probs (Tensor of shape (1, 100)) of distribution Categorical(probs: torch.Size([1, 100])) to satisfy the constraint Simplex(), but found invalid values:
tensor([[nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan,
nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan,
nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan,
nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan,
nan, nan, nan, nan]], device='cuda:0')
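One common way for a softmax to return a full row of NaN like the one above is a mask that rules out every action, so that all logits become -inf. Whether that is what happens in hrl-acra is an assumption; the traceback alone does not confirm it. The mechanism can be sketched in plain Python, without PyTorch (the helper names softmax and apply_mask are illustrative and not taken from the repository):

```python
import math


def softmax(logits):
    """Softmax over a list of floats, shifted by the maximum for stability."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def apply_mask(logits, mask):
    """Replace the logit of every disallowed action with -inf."""
    return [x if keep else -math.inf for x, keep in zip(logits, mask)]


logits = [0.3, -1.2, 2.0]

# At least one action allowed: masked entries get probability 0.
partly_masked = softmax(apply_mask(logits, [True, False, True]))

# No action allowed: -inf minus -inf is NaN, so every probability is NaN.
fully_masked = softmax(apply_mask(logits, [False, False, False]))
```

If the logits themselves are already NaN before masking, the result is the same all-NaN row, so it is worth checking both the mask and the raw logits.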
I tried to fix the error by adding a call to "torch.distributions.constraints.simplex.check(candicate_actions_probs)" directly below "candicate_actions_probs = F.softmax(candicate_actions_logits, dim=-1)" in the solver/learning/rl_solver.py file.
What puzzles me is that the original code raises the error most of the time, while with this call added it still raises the error, only far less often. I am at a loss as to how to deal with it.
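For what it is worth, constraints.simplex.check only returns a boolean tensor; it neither raises nor modifies the probabilities, so on its own it should not change whether NaN appears. The lower error rate may therefore be run-to-run variation, though that is a guess. A guard that fails early with a clearer message could look like the sketch below. It assumes the policy produces logits and an optional boolean mask; safe_categorical is a hypothetical helper, not a function from hrl-acra.

```python
import torch
import torch.nn.functional as F
from torch.distributions import Categorical, constraints

# simplex.check reports validity as a boolean tensor; it does not raise.
nan_probs = torch.full((1, 4), float("nan"))
verdict = constraints.simplex.check(nan_probs)


def safe_categorical(logits, mask=None):
    """Build a Categorical from logits, stopping early if they are unusable.

    Hypothetical helper for illustration, not part of hrl-acra.
    """
    if not torch.isfinite(logits).all():
        raise FloatingPointError("policy logits contain NaN or Inf before masking")
    if mask is not None:
        mask = mask.bool()
        if not mask.any(dim=-1).all():
            raise ValueError("at least one row of the mask allows no action")
        logits = logits.masked_fill(~mask, float("-inf"))
    probs = F.softmax(logits, dim=-1)
    return Categorical(probs=probs)


# Usage with made-up values: two of four actions are allowed.
dist = safe_categorical(torch.zeros(1, 4), torch.tensor([[1, 0, 1, 0]]))
action = dist.sample()
```

The first exception separates "the network already outputs NaN" from "the mask left nothing to choose", which are two different problems with different fixes.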