Thank you for your attention to our paper.
As described in our paper, for SMAC we follow the official evaluation protocol in [1]: we run 32 test episodes without exploration to measure the test win rate, and report the median and the 25th-75th percentiles across 5 seeds. For GRF, we similarly run 32 test episodes to obtain a win rate, and report the mean win rate and its variance across 5 seeds.
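To make the aggregation concrete, here is a minimal sketch of how such statistics could be computed from per-seed win rates. It is not our evaluation code: the function names and the example win counts are hypothetical, the percentiles use linear interpolation, and the sample variance (dividing by n - 1) is an assumption, since the text above does not say which variance estimator is used.

```python
import statistics


def win_rate(outcomes):
    """Fraction of won episodes; `outcomes` holds one boolean per test episode."""
    return sum(outcomes) / len(outcomes)


def smac_summary(seed_win_rates):
    """Return (median, 25th percentile, 75th percentile) across seeds.

    Uses the 'inclusive' method, i.e. linear interpolation between data points.
    """
    q25, q50, q75 = statistics.quantiles(seed_win_rates, n=4, method="inclusive")
    return q50, q25, q75


def grf_summary(seed_win_rates):
    """Return (mean, variance) across seeds.

    Assumption: sample variance (n - 1 in the denominator).
    """
    return statistics.mean(seed_win_rates), statistics.variance(seed_win_rates)


# Hypothetical data: wins out of 32 test episodes for each of 5 seeds.
wins_per_seed = [16, 20, 24, 28, 32]
rates = [wins / 32 for wins in wins_per_seed]

median, p25, p75 = smac_summary(rates)
mean, variance = grf_summary(rates)
print(f"SMAC-style: median={median:.3f}, 25th={p25:.3f}, 75th={p75:.3f}")
print(f"GRF-style:  mean={mean:.3f}, variance={variance:.4f}")
```

In this sketch each seed contributes one win rate per evaluation point, so the median at that point is taken over 5 values.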
If you have any more questions, please feel free to follow up.
[1] Samvelyan M, Rashid T, De Witt C S, et al. The StarCraft Multi-Agent Challenge[J]. arXiv preprint arXiv:1902.04043, 2019.
I have a question about the results: how many samples is the median taken over? Also, how many points does the moving average cover? If you could let me know, that would be great.