Breakout performance - timesteps/time?

pavitrakumar78 commented 7 years ago

Hi,

I have been trying to train your A3C model on the Breakout game.

I am running it on a p2.xlarge instance and the params I am using are:

tmax: 40m, num_concurrent: 16; everything else is default

I have changed the network from the NIPS network to the NATURE network (which should give better performance).

I have a few questions that I wanted to ask! :)

So far, the model has run for 4m global timesteps (self.T), or 300k actor_learner_thread timesteps (counter). In your results, you posted your Breakout score after "10m" timesteps. Which timestep count does that refer to: the local one or the global one?
I am training on an NVIDIA K80 (p2.xlarge instance) and can only run about 330k global timesteps (self.T) in 1 hour. Isn't that a bit slow? Any comments on this?
Do you have any saved plots or TensorBoard data from the training/testing you have done?
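
As a rough sanity check on the numbers above, the wall-clock time implied by 330k global steps per hour can be estimated as follows. This is only a back-of-the-envelope sketch: it assumes throughput stays constant over the whole run, and `hours_to_reach` is a hypothetical helper, not something from the repository.

```python
def hours_to_reach(target_steps, steps_per_hour, steps_done=0):
    """Estimate the hours needed to reach target_steps at a constant rate."""
    remaining = max(target_steps - steps_done, 0)
    return remaining / steps_per_hour


RATE = 330_000  # observed global steps per hour in this run

# Hours to reach 10M global steps from scratch.
print(round(hours_to_reach(10_000_000, RATE), 1))

# Hours left to reach tmax = 40M, given 4M global steps already done.
print(round(hours_to_reach(40_000_000, RATE, steps_done=4_000_000), 1))
```

At this rate, 10M global steps would take a little over 30 hours, and finishing the remaining 36M steps would take about 109 hours.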

Thank you!

papoudakis commented 7 years ago

Hi, the 10M steps I am referring to are the global steps.
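
To illustrate the difference between the two counts, here is a minimal sketch of a shared global counter next to per-thread local counters. It is not the repository's actual code: `GlobalCounter`, `actor_learner`, and `run` are hypothetical names, and the "environment step" is reduced to a counter increment.

```python
import threading


class GlobalCounter:
    """Step counter shared by all actor-learner threads (like self.T)."""

    def __init__(self):
        self.T = 0
        self._lock = threading.Lock()

    def increment(self):
        # The lock keeps concurrent increments from being lost.
        with self._lock:
            self.T += 1


def actor_learner(shared, n_steps, local_counts, idx):
    counter = 0  # local, per-thread step count
    for _ in range(n_steps):
        counter += 1        # one (simulated) environment step in this thread
        shared.increment()  # the same step also counts toward the global total
    local_counts[idx] = counter


def run(num_threads=4, steps_per_thread=1000):
    shared = GlobalCounter()
    local_counts = [0] * num_threads
    threads = [
        threading.Thread(
            target=actor_learner,
            args=(shared, steps_per_thread, local_counts, i),
        )
        for i in range(num_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return shared.T, local_counts


if __name__ == "__main__":
    total, per_thread = run()
    print("global steps:", total)
    print("local steps per thread:", per_thread)
```

The global count is the sum of the local counts. With 16 threads, 4M global steps would correspond to about 250k steps per thread on average, although individual threads may run ahead of or behind that figure.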

Yes, 330k/hour is slow, but using a GPU won't speed up training much. Try using fewer actor-learner threads.

Unfortunately, I do not have any saved results or parameters.

Thanks for your comment. Since I am new to RL, all feedback is welcome.

pavitrakumar78 commented 7 years ago

Thank you for your reply!

I am new to RL too and thus I have lots of questions! :)

I will try to train using a c4x instance.

Also, for the Breakout model you uploaded to the OpenAI Gym leaderboard, do you have any information about the parameters you used, such as the number of actors? I see that the default number of actors for a3c and a3c_lstm is 8. Did you use the default parameters to achieve that score?

papoudakis commented 7 years ago

Yes, I used the default parameters.

papoudakis commented 7 years ago

Hi, to make this faster, just change the checkpoint_interval flag to 1 or 2 million. Training will become a lot faster.
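
As a rough illustration of why a larger interval helps, the sketch below counts how often the run would pause for a checkpoint. It assumes one checkpoint is written each time the global step count passes a multiple of checkpoint_interval; that trigger rule and the helper `count_checkpoints` are assumptions for illustration, not taken from the repository.

```python
def count_checkpoints(tmax, checkpoint_interval):
    """Number of checkpoints written over tmax global steps, assuming one
    per completed interval (an assumed trigger rule)."""
    if checkpoint_interval <= 0:
        raise ValueError("checkpoint_interval must be positive")
    return tmax // checkpoint_interval


TMAX = 40_000_000  # the tmax used earlier in this thread

for interval in (1_000_000, 2_000_000):
    print(interval, count_checkpoints(TMAX, interval))
```

Under this assumption, a 40M-step run would stop for a checkpoint 40 times with an interval of 1 million and 20 times with an interval of 2 million.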
