Open ByunghyunYoo opened 1 year ago
I'm trying to reproduce your results, but the win rate does not increase in the Corridor scenario. (I think "dfop" means MACPF.) I haven't changed your code at all, and in the version you uploaded, Alpha and Alpha_{i} are fixed to 0.001, so it appears to match the Corridor setting described in the paper. Do I need to set an additional parameter for the Corridor scenario? The attached figures are the Corridor results that I reproduced and the config file for MACPF (dfop in the code).
Hi, sorry for the late reply; I don't check this repo very often. Can you provide some details about the software environment you used for reproduction, such as the torch and CUDA versions? For the results in the paper, I used torch 1.7.1+cu110.
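For anyone else reporting a reproduction problem, a small script along these lines may be a convenient way to gather those details. It is only a sketch: `collect_env_report` is a hypothetical helper name, not something in this repo, and it assumes torch may or may not be installed.

```python
import platform
import sys


def collect_env_report():
    """Return a dict describing the software environment (hypothetical helper)."""
    report = {
        "python": sys.version.split()[0],
        "platform": platform.platform(),
    }
    try:
        import torch  # optional: only reported if it is installed
    except ImportError:
        report["torch"] = "not installed"
        return report
    # torch.__version__ includes the build tag, e.g. "1.7.1+cu110"
    report["torch"] = torch.__version__
    # CUDA version torch was built against (None for CPU-only builds)
    report["cuda_build"] = torch.version.cuda
    # Whether a CUDA device is actually usable on this machine
    report["cuda_available"] = torch.cuda.is_available()
    return report


if __name__ == "__main__":
    for key, value in collect_env_report().items():
        print(f"{key}: {value}")
```

Pasting the printed lines into the issue, together with the SC2 version, should cover the details asked for above.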
I tried this code again on another machine with torch 1.11.0+cu113, and it works fine there (at least it achieves a non-zero win rate for most seeds). So I don't think it is that picky about the software environment, and I will need more details to figure out what the problem is.
Another potential problem is the version of SC2. I used SC2 4.10 for the paper, so if you are using 4.6, the performance may vary a lot.
Hi @RetiaAdolf
I'm facing a similar issue replicating the results in the paper. I ran the experiment on 8m_vs_9m using the default configuration, but the performance lags well behind QMIX. After 500k steps, the win rate is only around 10%, whereas the paper reports over 80% at this point. Could you share the complete configuration files used in the paper?
Additionally, I'm curious as to why the code does not support parallel running, given that the number of parallel threads is also an important hyperparameter which can significantly impact performance (see the qmix_high_sample_efficiency in pymarl2, where the training thread is set to 4, lower than normal QMIX). It also runs much faster with parallel threads.