google-research / batch_rl

Offline Reinforcement Learning (aka Batch Reinforcement Learning) on Atari 2600 games
https://offline-rl.github.io/
Apache License 2.0

Data generation is very slow #15

Closed by muzerobot 3 years ago

muzerobot commented 3 years ago

I am using the following command to try and generate data for one run of Freeway:

python -um batch_rl.baselines.train \
  --base_dir=/tmp/batch_rl_data \
  --gin_files='batch_rl/baselines/configs/dqn.gin'

I made no changes to the code, except for a workaround to the flags issue in #14. Each iteration takes ~40 min, and a single run has 200 iterations, so at this rate one run (out of five) will take over 5 days to generate. This is significantly longer than expected: according to the response in #13, generating the entire dataset should take 3-5 days.
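The estimate above can be checked with a few lines of arithmetic. This is only a sketch built from the two figures quoted in this comment (roughly 40 minutes per iteration and 200 iterations per run); the constant names are made up for illustration.

```python
# Back-of-the-envelope estimate using only the figures quoted above.
MINUTES_PER_ITERATION = 40   # observed, approximate
ITERATIONS_PER_RUN = 200     # iterations in a single run

total_minutes = MINUTES_PER_ITERATION * ITERATIONS_PER_RUN
days_per_run = total_minutes / (60 * 24)  # minutes in a day

print(f"{total_minutes} minutes per run, about {days_per_run:.1f} days")
# prints "8000 minutes per run, about 5.6 days"
```

So a single run already sits slightly above the upper end of the quoted 3-5 day range.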

Was this 3-5 day figure based on the exact config provided? I have verified that I am running on GPU, although with the default settings, I am only using 1 GB of memory. Is there anything else I can look into to see why my runs are so much slower? Thanks!

agarwl commented 3 years ago

Yes, the 3-5 day estimate was for this config; it is the amount of time it takes to run an online Dopamine DQN agent. The exact speed depends on the type of GPU (P100, V100) as well as the amount of RAM on your machine (I typically used 20-50 GB of RAM). I think the Dopamine JAX agents run faster than the TF agents, so you might want to give them a try (the JAX agents used the numpy buffers directly).

May I ask why you are trying to regenerate the data? The dataset is available as raw numpy buffers here, as well as a tf reverb dataset in RL Unplugged.

muzerobot commented 3 years ago

Thanks for your response!

I suppose the 3-5 day estimate assumed that all five runs per game were done in parallel; if so, the speed I am experiencing is expected.
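To make that reasoning concrete, here is a rough sketch comparing sequential and parallel execution of the five runs. It assumes each parallel run gets its own GPU and does not slow the others down, which may not hold in practice; the helper `wall_clock_days` is hypothetical and only there for illustration.

```python
import math

def wall_clock_days(days_per_run, n_runs, n_parallel):
    """Wall-clock days when n_runs are executed n_parallel at a time."""
    batches = math.ceil(n_runs / n_parallel)
    return batches * days_per_run

# 200 iterations at ~40 min each, converted to days (about 5.6).
DAYS_PER_RUN = 200 * 40 / (60 * 24)

sequential = wall_clock_days(DAYS_PER_RUN, n_runs=5, n_parallel=1)
parallel = wall_clock_days(DAYS_PER_RUN, n_runs=5, n_parallel=5)
print(f"sequential: {sequential:.1f} days, parallel: {parallel:.1f} days")
# prints "sequential: 27.8 days, parallel: 5.6 days"
```

Under these assumptions, only the fully parallel case lands near the quoted 3-5 day figure.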

I would like the exact same Dopamine dataset, but with sticky_actions=False. You wouldn't happen to have this variant lying around somewhere, would you?
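For anyone regenerating the data themselves, the sketch below shows one way the command might be adapted. It rests on two assumptions that should be checked against your checkout: that the train script accepts a Dopamine-style `--gin_bindings` flag, and that sticky actions are controlled by the `atari_lib.create_atari_environment.sticky_actions` binding, as in Dopamine's Atari configs. The output directory name is made up.

```python
import shlex

# Assumption: this binding name follows the convention used in Dopamine's
# Atari configs; verify it against the dqn.gin file you actually run.
STICKY_BINDING = "atari_lib.create_atari_environment.sticky_actions = False"

command = [
    "python", "-um", "batch_rl.baselines.train",
    "--base_dir=/tmp/batch_rl_data_no_sticky",  # hypothetical output directory
    "--gin_files=batch_rl/baselines/configs/dqn.gin",
    f"--gin_bindings={STICKY_BINDING}",  # assumed flag, see note above
]

# shlex.join quotes the binding so it survives the spaces around "=".
print(shlex.join(command))
```

The printed line can be pasted into a shell, or the `command` list can be passed to `subprocess.run` directly.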

agarwl commented 3 years ago

Unfortunately, I don't have this data. But if you are at DeepMind (based on your GitHub profile affiliation), some of the folks involved in RL Unplugged did generate the Atari dataset without using sticky actions (ping @caglar and @sergomezcol for more info!).

muzerobot commented 3 years ago

Thanks!