Closed by ybh4798 3 years ago
Hi,
Thank you for your interest. Because of the inherent differences among the baseline approaches and their training procedures, it is hard to compare them directly. So we followed the training setting from LIIR and aligned the batch size, the number of batch updates, and the number of environment steps across all approaches. Please see "Training setting" in Section 4.2 for more details.
In the LICA experiments, env steps are set to 32 or 64 million. In the QMIX experiments, env steps are set to 1 to 3 million. Why are the env steps so different? I don't know how to interpret this difference. Could you please explain?