facebookresearch / torchbeast

A PyTorch Platform for Distributed RL
Apache License 2.0

Cannot reproduce the performance of "SpaceInvaders" game? #25

Closed: jmkim0309 closed this issue 3 years ago

jmkim0309 commented 3 years ago

Hi,

Thank you for this great codebase. I ran the code to reproduce the performance on the following 6 Atari tasks: {AirRaid, Carnival, DemonAttack, NameThisGame, Pong, SpaceInvaders}. However, compared to the mean_episode_returns shown in the README curves, my experiments show a huge performance drop ONLY on SpaceInvaders (about 10x lower), while the other 5 tasks are reproduced reasonably well. This problem is also reported here, in Appendix C. Why is that?

In my experiments, I used the same hyperparameters for all tasks, e.g.:

python -m torchbeast.monobeast --env SpaceInvadersNoFrameskip-v4 --num_actors 56 --total_steps 50000000 --learning_rate 0.0006 --epsilon 0.01 --entropy_cost 0.01 --batch_size 32 --unroll_length 20 --num_threads 1 --xpid SpaceInvaders
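For anyone trying to repeat this setup, here is a minimal sketch of how the same command could be generated for all six tasks. It reuses only the flags and values from the command above. The environment ids for the other five games are an assumption (the same "<Game>NoFrameskip-v4" pattern as SpaceInvaders), and the helper name monobeast_command is hypothetical, not part of TorchBeast.

```python
import shlex

# The six Atari tasks named in the report above.
GAMES = ["AirRaid", "Carnival", "DemonAttack", "NameThisGame", "Pong", "SpaceInvaders"]

# Hyperparameters copied from the command above; shared by all tasks.
HPARAMS = {
    "num_actors": 56,
    "total_steps": 50000000,
    "learning_rate": 0.0006,
    "epsilon": 0.01,
    "entropy_cost": 0.01,
    "batch_size": 32,
    "unroll_length": 20,
    "num_threads": 1,
}


def monobeast_command(game):
    """Return the MonoBeast argument list for one game (hypothetical helper)."""
    # Assumption: every game follows the "<Game>NoFrameskip-v4" id pattern
    # seen for SpaceInvaders in the original command.
    args = ["python", "-m", "torchbeast.monobeast", "--env", f"{game}NoFrameskip-v4"]
    for flag, value in HPARAMS.items():
        args += [f"--{flag}", str(value)]
    # As in the original command, the experiment id is the game name.
    args += ["--xpid", game]
    return args


if __name__ == "__main__":
    # Print one shell-ready command per game; nothing is launched here.
    for game in GAMES:
        print(shlex.join(monobeast_command(game)))
```

The script only prints the commands, so they can be checked before being launched by hand or passed to a job scheduler.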

heiner commented 3 years ago

Hey Kim,

My apologies for only responding now, and thanks for your kind words.

The results reported in the README and in the TorchBeast paper were produced with PolyBeast. The MonoBeast version you are using has the upside of being simpler to install and run, but it uses a different design that affects RL performance in hard-to-understand ways. So if you want performance comparable to that reported in the README, you'll have to use PolyBeast.