Use RNN in MARL

zhangwenjun1229 commented 10 months ago

[ ] I have marked all applicable categories:
- [ ] exception-raising bug
- [ ] RL algorithm bug
- [ ] documentation request (i.e. "X is missing from the documentation.")
- [ ] new feature request
[ ] I have visited the source website
[x] I have searched through the issue tracker for duplicates
[ ] I have mentioned version numbers, operating system and environment, where applicable:
```
import tianshou, gymnasium as gym, torch, numpy, sys
print(tianshou.__version__, gym.__version__, torch.__version__, numpy.__version__, sys.version, sys.platform)
```
I am trying to implement MARL based on tianshou, and I have it working with a DQN as the policy of each agent. But when I changed the DQN to an RNN, it failed with this traceback:

    Traceback (most recent call last):
      File "main.py", line 363, in <module>
        result, agent1, agent2 = train_agent(args)
      File "main.py", line 279, in train_agent
        train_collector.collect(n_step=args.batch_size * args.training_num)
      File "/lib/python3.8/site-packages/tianshou/data/collector.py", line 279, in collect
        result = self.policy(self.data, last_state)
      File "/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
        return forward_call(*args, **kwargs)
      File "/lib/python3.8/site-packages/tianshou/policy/multiagent/mapolicy.py", line 145, in forward
        out = policy(
      File "/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
        return forward_call(*args, **kwargs)
      File "/lib/python3.8/site-packages/tianshou/policy/modelfree/dqn.py", line 160, in forward
        logits, hidden = model(obs_next, state=state, info=batch.info)
      File "lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
        return forward_call(*args, **kwargs)
      File "lib/python3.8/site-packages/tianshou/utils/net/common.py", line 320, in forward
        state["hidden"].transpose(0, 1).contiguous(),
      File "/lib/python3.8/site-packages/tianshou/data/batch.py", line 242, in __getitem__
        return self.__dict__[index]
    KeyError: 'hidden'

I found it may be caused by the input "state", which I have not defined. But I checked the given test_drqn.py and I can't find how "state" is used. Actually, I just want to stack obs in each step; I thought the default state was obs. Could you please give me some instructions on how to fix this error or achieve my goal? Thanks.
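For readers following the thread: stacking observations and carrying a recurrent state are two different things. A stacked observation is just the last few frames joined together, whereas the "state" in the traceback is the RNN's hidden state. The sketch below illustrates only the stacking idea in plain Python. `ObsStacker` and its methods are hypothetical names made up for this illustration, and this is not how tianshou implements its `stack_num` option.

```python
from collections import deque
from typing import Deque, List, Sequence


class ObsStacker:
    """Hypothetical helper that keeps the last `stack_num` observations."""

    def __init__(self, stack_num: int) -> None:
        if stack_num < 1:
            raise ValueError("stack_num must be at least 1")
        self.stack_num = stack_num
        # A bounded deque drops the oldest frame automatically.
        self._frames: Deque[List[float]] = deque(maxlen=stack_num)

    def reset(self, first_obs: Sequence[float]) -> List[List[float]]:
        # At episode start, pad the history by repeating the first frame.
        self._frames.clear()
        for _ in range(self.stack_num):
            self._frames.append(list(first_obs))
        return self.stacked()

    def step(self, obs: Sequence[float]) -> List[List[float]]:
        # Append the newest frame; the oldest one falls out of the window.
        self._frames.append(list(obs))
        return self.stacked()

    def stacked(self) -> List[List[float]]:
        # Oldest frame first, newest frame last.
        return [list(frame) for frame in self._frames]
```

A stacked window like this can be fed to a feed-forward network, or to a recurrent one that processes the frames as a short sequence. In neither case does it replace the hidden state that the recurrent network carries between calls.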

Trinkle23897 commented 10 months ago

Could you refer to the DRQN example? https://github.com/thu-ml/tianshou/blob/master/test/discrete/test_drqn.py

https://github.com/thu-ml/tianshou/blob/66b7fc542b496090e83d2df4a846fc02f3f3167b/test/discrete/test_drqn.py#L67-L69 This is the major change -- you need a different network

zhangwenjun1229 commented 10 months ago

Yes, I have referred to the DRQN example and I already used the Recurrent() net. That is when I got the error I mentioned above.

zhangwenjun1229 commented 10 months ago

I think it may be caused by MARL. When I checked the variables, I found that Recurrent() had defined the "hidden" and "cell" attributes for the first agent but not for the second agent. The error happens when the model refers to the "hidden" attribute of the second agent's state, which has no such attribute (in fact, it is Batch()). Then I checked the code for this part. It seems the algorithm defines these attributes only when the state is None.

zhangwenjun1229 commented 10 months ago

Following this, I hacked the "state" part of the Recurrent() class. In particular, I changed line 313 from "if state is None:" to "if state is None or isinstance(state, Batch) and state.is_empty():". I'm not sure whether this is the root cause of my error, but it works now!
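The difference between the two checks can be sketched without tianshou installed. In the snippet below, `EmptyableState` is a hypothetical stand-in for tianshou's `Batch` (a dict-like container with an `is_empty()` method), and the helper functions are illustrative names, not library code. It only mimics the branch described above: an empty container is not `None`, so the original check falls through to the `"hidden"` lookup and raises `KeyError`.

```python
from typing import Any, Callable, Optional


class EmptyableState(dict):
    """Hypothetical stand-in for tianshou's Batch: dict-like, with is_empty()."""

    def is_empty(self) -> bool:
        return len(self) == 0


def needs_init_original(state: Optional[Any]) -> bool:
    # Original check quoted in the thread: only None means "no state yet".
    return state is None


def needs_init_patched(state: Optional[Any]) -> bool:
    # Patched check from the thread: an empty container also means "no state yet".
    return state is None or (isinstance(state, EmptyableState) and state.is_empty())


def read_hidden(state: Optional[Any], needs_init: Callable[[Optional[Any]], bool]) -> str:
    # Mimics the branch in the recurrent forward pass: start fresh when there
    # is no usable state, otherwise read the stored "hidden" entry.
    if needs_init(state):
        return "fresh"
    return state["hidden"]
```

With the original check, `read_hidden(EmptyableState(), needs_init_original)` raises `KeyError: 'hidden'`, matching the traceback. With the patched check, the same call takes the fresh-state branch. Whether an empty state for one agent is intended behavior or a bug in the multi-agent manager is not settled in this thread.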

lsylusiyao commented 10 months ago

I have a similar problem. With stack_num set, my action_mask for MARL breaks because I don't know which action_mask I should choose. For example, I use DRQN and the debug info looks like this:

thu-ml / tianshou

Use RNN in MARL #965