mlii / mfrl

Mean Field Multi-Agent Reinforcement Learning
MIT License
374 stars 100 forks

How do you compare the performance of algorithms in the battle game? #8

Open woaipichuli opened 6 years ago

woaipichuli commented 6 years ago

Recently, I have run train_battle.py and battle.py. However, I found that the performance of a given algorithm can vary quite noticeably in the battle game. I consider this quite normal, since the model saved at the 2000th generation may not be the best one, and with different random seeds the final performance of the models obtained through self-play will differ. In this case, I wonder how you evaluated their performance and obtained the results in Fig. 8. How many independent runs did you make for each algorithm in the comparative battles?
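For what it's worth, the way I would measure this is to average the head-to-head win rate over several independent evaluation runs with different random seeds rather than relying on a single battle. Below is only a minimal sketch of that idea, not the repo's actual evaluation code: `run_battle` is a hypothetical stand-in for the head-to-head logic in battle.py, and the model names are placeholders.

```python
import numpy as np

def run_battle(model_a, model_b, seed):
    """Placeholder: swap in the head-to-head logic from battle.py here.
    Should return 1 if model_a's side wins the round, else 0."""
    rng = np.random.default_rng(seed)
    return int(rng.random() < 0.5)  # dummy coin flip so the sketch runs

def evaluate(model_a, model_b, n_runs=10, n_rounds=50):
    """Mean and std of model_a's win rate over independent evaluation runs."""
    win_rates = []
    for run in range(n_runs):
        wins = sum(run_battle(model_a, model_b, seed=1000 * run + r)
                   for r in range(n_rounds))
        win_rates.append(wins / n_rounds)
    return float(np.mean(win_rates)), float(np.std(win_rates))

mean_wr, std_wr = evaluate("mfq-2000", "mfac-2000")
print(f"win rate: {mean_wr:.2f} +/- {std_wr:.2f}")
```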

woaipichuli commented 6 years ago

I wonder why the simulation results produced by this program differ from those in the paper. For example, MFAC clearly outperforms MFQ in this program, which is the opposite of the results reported in the paper.

lzh-awesome commented 2 years ago

> I wonder why the simulation results produced by this program differ from those in the paper. For example, MFAC clearly outperforms MFQ in this program, which is the opposite of the results reported in the paper.

@woaipichuli Hi, I have the same situation as you. I find that even AC performs better than MFAC, and MFAC gets more reward than MFQ. That's completely opposite to the results in the paper.

QianZhao-xd commented 1 year ago

> I wonder why the simulation results produced by this program differ from those in the paper. For example, MFAC clearly outperforms MFQ in this program, which is the opposite of the results reported in the paper.
>
> @woaipichuli Hi, I have the same situation as you. I find that even AC performs better than MFAC, and MFAC gets more reward than MFQ. That's completely opposite to the results in the paper.

Have you found a solution to the problem? I have the same problem.

lzh-awesome commented 1 year ago

> I wonder why the simulation results produced by this program differ from those in the paper. For example, MFAC clearly outperforms MFQ in this program, which is the opposite of the results reported in the paper.
>
> @woaipichuli Hi, I have the same situation as you. I find that even AC performs better than MFAC, and MFAC gets more reward than MFQ. That's completely opposite to the results in the paper.
>
> Have you found a solution to the problem? I have the same problem.

I don't use this code at present. The gap between the MFAC/MFQ results here and those reported in the paper can be explained by differences in the algorithm implementation, the number of training episodes, the random seeds used for training, the hardware, and so on. The fact that AC performs better than MFAC can also be explained: when the interactions between agents become complex, agents must act cooperatively based on the global state, so the local approximation assumption underlying the mean field method no longer holds well, leading to a significant drop in performance.
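For context (my reading of the original mean field MARL paper, not this repo's code): MFQ/MFAC replace the joint action in the Q-function with the empirical mean of the neighbors' actions, i.e. Q^j(s, a) is approximated by Q^j(s, a^j, ā^j), where ā^j is the average one-hot action of agent j's neighbors. The sketch below only shows how that averaged quantity is formed; when coordination requires more than this local average, the approximation can degrade.

```python
import numpy as np

def mean_neighbor_action(neighbor_actions, n_actions):
    """Empirical mean of the neighbors' one-hot actions: the a-bar that the
    mean field Q-function conditions on instead of the full joint action."""
    one_hot = np.eye(n_actions)[neighbor_actions]  # shape: (n_neighbors, n_actions)
    return one_hot.mean(axis=0)

# Example: 4 neighbors choosing among 8 discrete actions
print(mean_neighbor_action([0, 3, 3, 7], n_actions=8))
# -> [0.25 0.   0.   0.5  0.   0.   0.   0.25]
```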