Metro1998 closed this issue 1 year ago
Hiya, thanks so much for your attention!
The one-hot encoding of a discrete action has one extra dimension because we need an arbitrary starting signal $a^{i_0}$ (see Figure 2 of our paper) to mark the beginning of the action sequence, and it has to be distinguishable from every real action. For example, for an action space of size 3, [1, 0, 0, 0] is the starting signal, and [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1] are the three actions, respectively.
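The encoding described above can be sketched in plain Python. This is only an illustrative sketch, not the code in `transformer_act.py`: the helper names `start_token`, `encode_action`, and `shifted_inputs` are hypothetical, and `shifted_inputs` assumes the usual autoregressive shift, where each agent's decoder input is the previous agent's action and the first agent receives the starting signal.

```python
def start_token(action_dim):
    """Starting signal: a vector of length action_dim + 1 with index 0 set."""
    vec = [0] * (action_dim + 1)
    vec[0] = 1
    return vec


def encode_action(action, action_dim):
    """One-hot for a real action in [0, action_dim).

    The action index is shifted by one so that index 0 stays reserved
    for the starting signal.
    """
    if not 0 <= action < action_dim:
        raise ValueError(f"action {action} outside [0, {action_dim})")
    vec = [0] * (action_dim + 1)
    vec[action + 1] = 1
    return vec


def shifted_inputs(actions, action_dim):
    """Hypothetical decoder inputs for a sequence of agents (assumed shift):
    the starting signal first, then the actions of all agents but the last."""
    return [start_token(action_dim)] + [
        encode_action(a, action_dim) for a in actions[:-1]
    ]


if __name__ == "__main__":
    print(start_token(3))                # [1, 0, 0, 0]
    print(encode_action(0, 3))           # [0, 1, 0, 0]
    print(shifted_inputs([2, 0, 1], 3))  # [[1, 0, 0, 0], [0, 0, 0, 1], [0, 1, 0, 0]]
```

With `action_dim = 3`, every vector has length 4, which is where the `action_dim + 1` in the third dimension comes from: without the reserved slot, the starting signal would collide with one of the real actions.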
Hoping it might serve you, Muning
I understand now, thanks for your reply.
Hi, first of all, thanks for this excellent work on MARL. I don't understand why the third dimension in https://github.com/PKU-MARL/Multi-Agent-Transformer/blob/879dd89da7129dc8b4de24d4ac047d48d551a51a/mat/algorithms/utils/transformer_act.py#L8 is set to (action_dim + 1) rather than action_dim, which is the dimension of a discrete action's one-hot encoding. Is there a specific reason for this?