thu-ml / tianshou

An elegant PyTorch deep reinforcement learning library.
https://tianshou.org
MIT License
7.76k stars 1.12k forks

Wrong output of forward for custom policy #1029

Open hazel260802 opened 7 months ago

hazel260802 commented 7 months ago

Traceback (most recent call last):
  File "/home/ad/mec_morl_multipolicy/train.py", line 210, in <module>
    train_collector.collect(n_episode=train_num)
  File "/home/ad/.local/lib/python3.10/site-packages/tianshou/data/collector.py", line 279, in collect
    result = self.policy(self.data, last_state)
  File "/home/ad/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/ad/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/ad/.local/lib/python3.10/site-packages/tianshou/policy/modelfree/pg.py", line 124, in forward
    dist = self.dist_fn(logits)
  File "/home/ad/.local/lib/python3.10/site-packages/torch/distributions/categorical.py", line 57, in __init__
    if probs.dim() < 1:
  File "/home/ad/.local/lib/python3.10/site-packages/tianshou/data/batch.py", line 213, in __getattr__
    return getattr(self.__dict__, key)
AttributeError: 'dict' object has no attribute 'dim'
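
For anyone puzzled by the final message: the last two frames suggest that probs is a Batch-like container rather than a tensor, so the lookup of dim falls through to the container's underlying dict. Below is a minimal pure-Python sketch of that mechanism. FakeBatch is a hypothetical stand-in written only for illustration; it is not tianshou's Batch and copies nothing from it except the delegation visible in the batch.py frame.

```python
class FakeBatch:
    """Hypothetical stand-in for a Batch-like container (not the real class)."""

    def __init__(self, **fields):
        # Store every field directly in the instance dict.
        self.__dict__.update(fields)

    def __getattr__(self, key):
        # Only reached when normal attribute lookup fails. It forwards the
        # lookup to the dict object itself, as in the batch.py frame above.
        return getattr(self.__dict__, key)


# A container holding the values, instead of the values themselves.
logits = FakeBatch(probs=[0.2, 0.8])

try:
    logits.dim()  # the same call Categorical.__init__ makes on probs
except AttributeError as err:
    message = str(err)

print(message)
```

The error names 'dict' because the failed lookup happens on the container's internal dict, not on the container itself.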

MischaPanch commented 7 months ago

The problem is the output of the Actor: in your case it is passed to torch's Categorical distribution as probs, so something is going wrong in the Actor's forward. I can't spot the error immediately. Could you set a breakpoint at the __init__ of Categorical and report what is being passed as probs?

Btw, debugging in such a way could help you solve this :)
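
If stepping through with a debugger is inconvenient, one alternative is to wrap the dist_fn so that it reports what it receives before the distribution is built. This is only a sketch under assumptions: checked and StubTensor are hypothetical names made up for this example, the check is duck-typed so it runs without torch installed, and the lambda stands in for a real distribution class such as torch.distributions.Categorical.

```python
def checked(dist_fn):
    """Wrap dist_fn so that non-tensor input fails early with a readable message."""

    def wrapper(logits):
        # Duck-typed check: tensors expose a callable .dim(), while dicts and
        # Batch-like containers do not. With torch available,
        # isinstance(logits, torch.Tensor) would be the stricter test.
        if not callable(getattr(logits, "dim", None)):
            keys = list(logits.keys()) if hasattr(logits, "keys") else None
            raise TypeError(
                f"dist_fn expected a tensor, got {type(logits).__name__} "
                f"with keys {keys}"
            )
        return dist_fn(logits)

    return wrapper


class StubTensor:
    """Hypothetical placeholder for a tensor, used only in this demo."""

    def dim(self):
        return 1


# The lambda is a placeholder for the real distribution constructor.
safe_dist_fn = checked(lambda probs: ("dist", probs))

tensor_like = StubTensor()
ok = safe_dist_fn(tensor_like)  # passes the check and builds the "dist"

try:
    safe_dist_fn({"logits": [0.1, 0.9]})  # a dict instead of a tensor
except TypeError as err:
    error_text = str(err)

print(error_text)
```

Calling Python's built-in breakpoint() inside wrapper instead of raising would drop into pdb at the same spot, which gives the same information interactively.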