Open shuyuandeqipa opened 3 years ago
Are you annealing epsilon over 50k or over 1mil timesteps? For the results in Figure 2 of the paper, epsilon is annealed over 50k timesteps.
CUDA_VISIBLE_DEVICES=3 nohup python3 -u src/main.py --config=vdn_smac --env-config=pred_prey_punish with epsilon_anneal_time=1000000 use_tensorboard=True > ./wjx_logs_1211/vdn_smac_pred_prey_punish_tensorboard_V2.log 2>&1 &
Yes, I set epsilon_anneal_time=1000000. With that setting I cannot reproduce the VDN, QMIX, and QPLEX results from Figure 2.
Is my parameter setting wrong?
Yeah, for Figure 2 in the paper set epsilon_anneal_time=50000 (or remove it altogether since 50k is the default).
It seems that setting it to 1mil helps performance (I think Appendix K.2 of https://openreview.net/forum?id=Rcmk0xxIQV shows results similar to yours).
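To make the difference between the two settings concrete, here is a minimal sketch of a linear epsilon schedule. It is not the repository's actual schedule code: the function name `linear_epsilon` and the start/finish values (1.0 and 0.05) are assumptions for illustration, so check them against your own config.

```python
def linear_epsilon(t, anneal_time, eps_start=1.0, eps_finish=0.05):
    """Linearly anneal epsilon from eps_start to eps_finish over anneal_time steps.

    eps_start and eps_finish are assumed values, not read from any config file.
    """
    # Fraction of the annealing period that has elapsed, clipped to [0, 1].
    frac = min(max(t, 0) / anneal_time, 1.0)
    return eps_start + frac * (eps_finish - eps_start)


if __name__ == "__main__":
    # Compare the two settings discussed in this thread at a few timesteps.
    for t in (0, 50_000, 500_000, 1_000_000):
        short = linear_epsilon(t, anneal_time=50_000)
        long = linear_epsilon(t, anneal_time=1_000_000)
        print(f"t={t:>9,}  eps(50k)={short:.4f}  eps(1mil)={long:.4f}")
```

Under these assumptions, the 50k schedule is already at its final epsilon after 50k timesteps, while the 1mil schedule is still exploring almost uniformly at that point, so the two runs are not directly comparable.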
Thanks for your help!
-----------------------------------------------
Method            pred_prey_punish (test_return_mean from the log files)
OW-QMIX (w=0.1)   36.8333
OW-QMIX (w=0.5)   36.5000
CW-QMIX (w=0.1)   37.5417 (37.6667)
CW-QMIX (w=0.5)   37.5417 (36.9583)
QTRAN             38.0833
QPLEX             36.1667
QMIX              33.6250
COMA              0.0000
VDN               36.7083 (35.7500)
-----------------------------------------------
VDN 37.0833 run 1
VDN 36.1250 run 2
VDN 36.4167 run 3
QMIX 38.0417 run 1
QMIX 30.2083 run 2
QMIX 36.0000 run 3
QPLEX 36.7083 run 1
QPLEX 30.3333 run 2
QPLEX 24.5417 run 3
MADDPG 0
MASAC 0
-----------------------------------------------
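For the per-run numbers listed above, a quick way to compare methods is the mean and sample standard deviation across the three runs. This is a minimal sketch using only the Python standard library; the values are copied from the run list, and the `summarize` helper name is just illustrative.

```python
from statistics import mean, stdev

# test_return_mean per run, copied from the run list in this thread.
RUNS = {
    "VDN": [37.0833, 36.1250, 36.4167],
    "QMIX": [38.0417, 30.2083, 36.0000],
    "QPLEX": [36.7083, 30.3333, 24.5417],
}


def summarize(runs):
    """Return {method: (mean, sample standard deviation)} over the runs."""
    return {name: (mean(vals), stdev(vals)) for name, vals in runs.items()}


if __name__ == "__main__":
    for name, (avg, sd) in summarize(RUNS).items():
        print(f"{name:<6} mean={avg:.4f}  std={sd:.4f}  (n={len(RUNS[name])})")
```

On these values QPLEX shows the largest spread across runs, so three runs may be too few to rank the methods reliably.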