google-research / batch_rl

Offline Reinforcement Learning (aka Batch Reinforcement Learning) on Atari 2600 games
https://offline-rl.github.io/
Apache License 2.0
520 stars, 73 forks

How to get expert data? #31

Closed by prajjwal1 1 year ago

prajjwal1 commented 1 year ago

I have read all the issues. Thanks for responding to them. I have a question about extracting expert data from the replay buffer. As you specified in one of the issues, because of the size of the data, the 50M datapoints from each game have been split into 50 files of 1M datapoints each. Does this mean that the last 10 files in [GAME_NAME]/1/replay_logs/ (the buffer files with suffixes 41, 42, ..., 50) represent expert behavior, while the first buffer files (with suffixes 1, 2, ..., 10) represent beginner-level performance?

I searched for an answer and found this in the AI blog: "For example, the first k million frames from the DQN replay dataset emulate exploration data with suboptimal returns while the last k million frames are analogous to near-expert data with stochasticity." Going by that statement, if I take the buffer files with suffixes 40-50, is it safe to assume that I'm extracting expert-level behavior?

agarwl commented 1 year ago

Yes, the last 10 files correspond to the last 40M frames seen by a DQN agent trained for 200M frames. For expert data, I'd simply load the last buffer. For example, see this paper which did expert data selection on the Atari datasets: https://openreview.net/forum?id=AP1MKT37rJ

Also, checkpoint_49 can be extracted easily from the TFDS version of this dataset (easy to plug and play): https://colab.research.google.com/github/google-research/rlds/blob/main/rlds/examples/tfds_rlu_atari.ipynb
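As a rough illustration of picking the latest buffer files by suffix, here is a minimal sketch. It assumes the files in replay_logs end in ".<integer>.gz"; that pattern, the example file names, and the helper names `checkpoint_suffixes` and `last_k_suffixes` are assumptions for illustration, not part of the batch_rl code, so check them against the files you actually downloaded.

```python
import re

# Assumed naming pattern: "<anything>.<integer>.gz". Adjust to the real files.
_SUFFIX = re.compile(r"\.(\d+)\.gz$")


def checkpoint_suffixes(filenames):
    """Return the sorted, de-duplicated integer suffixes found in the names."""
    found = set()
    for name in filenames:
        match = _SUFFIX.search(name)
        if match:
            found.add(int(match.group(1)))
    return sorted(found)


def last_k_suffixes(filenames, k):
    """Return the k largest suffixes, i.e. the data logged latest in training."""
    if k <= 0:
        return []
    return checkpoint_suffixes(filenames)[-k:]


if __name__ == "__main__":
    # Hypothetical directory listing with 50 files per stored array.
    names = [f"observation_ckpt.{i}.gz" for i in range(50)]
    names += [f"action_ckpt.{i}.gz" for i in range(50)]
    print(last_k_suffixes(names, 1))   # the single latest buffer
    print(last_k_suffixes(names, 10))  # the ten latest buffers
```

Reading the suffixes from the directory listing, rather than hard-coding a range, avoids off-by-one mistakes about whether numbering starts at 0 or 1.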

prajjwal1 commented 1 year ago

Thank you.

prajjwal1 commented 1 year ago

I had a follow-up question. Does each file represent 1M frames or 4M frames? In the other issue, you said, "So, the way the dataset is stored is in 50 files of size 1M each", and from the comment above, I assume each file represents 4M frames. Could you please clarify which one it is?

agarwl commented 1 year ago

1M transitions = 4M frames in Atari because of the action repeat of 4. Each file contains 1M transitions, which correspond to 4M frames.
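The arithmetic can be checked with a small sketch. The constants come from this thread (50 files, 1M transitions per file, action repeat of 4); the function names and the 0-based "position in training order" are assumptions made for illustration.

```python
ACTION_REPEAT = 4                 # each transition spans 4 emulator frames
TRANSITIONS_PER_FILE = 1_000_000  # 1M transitions per buffer file
NUM_FILES = 50                    # 50 files per run


def frames_per_file():
    """Frames covered by one buffer file."""
    return TRANSITIONS_PER_FILE * ACTION_REPEAT


def total_frames():
    """Frames covered by all buffer files of one run."""
    return NUM_FILES * frames_per_file()


def frame_range(position):
    """Half-open frame interval [start, end) for the file at a 0-based
    position in training order (position 0 is the earliest file)."""
    if not 0 <= position < NUM_FILES:
        raise ValueError("position must be in [0, NUM_FILES)")
    start = position * frames_per_file()
    return start, start + frames_per_file()


if __name__ == "__main__":
    print(frames_per_file())            # frames in one file
    print(total_frames())               # frames in the whole run
    print(10 * frames_per_file())       # frames in the last 10 files
    print(frame_range(NUM_FILES - 1))   # interval covered by the last file
```

Under these numbers, one file covers 4M frames, the last 10 files cover 40M frames, and all 50 files cover the 200M frames of training.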

prajjwal1 commented 1 year ago

Thanks again for clearing things up.