Sequential Data - Githubissues

arjung128 commented 3 years ago

Hi!

I was wondering if there is a way to get sequential data, i.e. o_t in iteration 1 corresponds to o_tm1 in iteration 2?

This doesn't seem to be the case in the RL Unplugged codebase, even when all instances of .shuffle() are commented out. I also tried reading the data directly using tfrecord with the following code:

import tensorflow as tf

# _tf_example_to_reverb_sample_np is a helper defined elsewhere (not shown here).
dataset = tf.data.TFRecordDataset(
    ['rl_unplugged/tmp/atari/Gravitar/run_1-00000-of-00100'],
    compression_type='GZIP')
iterator = iter(dataset)

# First record in the shard.
inputs_1 = next(iterator)
inputs_1 = _tf_example_to_reverb_sample_np(inputs_1)
o_t, a_t, r_t, d_t, o_tp1, a_tp1, extras = inputs_1.data

# Second record in the shard.
inputs_2 = next(iterator)
inputs_2 = _tf_example_to_reverb_sample_np(inputs_2)
o_t2, a_t2, r_t2, d_t2, o_tp12, a_tp12, extras2 = inputs_2.data

However, here as well, o_tp1 != o_t2, whereas o_tp1 == o_t2 is what I'm looking for. Is this possible?

Thanks!
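For readers who want to test a data source for the property asked about above, here is a minimal sketch of such a check. It is not part of RL Unplugged or batch_rl: the function name count_consecutive_pairs and the tuple layout (o_t, a_t, r_t, d_t, o_tp1) are assumptions made for illustration, and integers stand in for image observations.

```python
from typing import Any, Callable, Sequence, Tuple

# Hypothetical record layout for illustration: (o_t, a_t, r_t, d_t, o_tp1).
Transition = Tuple[Any, Any, float, float, Any]


def count_consecutive_pairs(
    transitions: Sequence[Transition],
    same_obs: Callable[[Any, Any], bool] = lambda a, b: a == b,
) -> int:
    """Counts adjacent records where o_tp1 of one equals o_t of the next."""
    matches = 0
    for current, following in zip(transitions, transitions[1:]):
        # current[4] is o_tp1 of this record; following[0] is o_t of the next.
        if same_obs(current[4], following[0]):
            matches += 1
    return matches


# Toy data: integers stand in for image observations.
ordered = [(0, 1, 0.0, 1.0, 1), (1, 0, 0.0, 1.0, 2), (2, 1, 1.0, 0.0, 3)]
shuffled = [ordered[1], ordered[0], ordered[2]]

print(count_consecutive_pairs(ordered))   # 2
print(count_consecutive_pairs(shuffled))  # 0
```

With array observations, a comparison such as numpy.array_equal could be passed as same_obs. Even in a correctly ordered dataset, a pair that straddles an episode boundary is not expected to match.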

agarwl commented 3 years ago

Yes, it should be possible to load sequential data. For the Dopamine version of the dataset, the data (which is simply gzipped numpy files) was stored in the order it was collected by the agent, so entire trajectories can be read directly using the numpy replay buffer provided in Dopamine. You just need to create a circular_replay_buffer.WrappedReplayBuffer and load an appropriate checkpoint using its load function.

For RL Unplugged, some processing was done on the datasets and the ordering wasn't preserved. CC'ing @caglar and @sergomezcol, who worked on releasing the Atari dataset as tfrecord files for RL Unplugged, in case they have plans to release the sequential data for Atari or know whether it can already be done.

agarwl commented 3 years ago

Just an FYI: the sequential Atari datasets can now be loaded directly with tensorflow_datasets; see https://www.tensorflow.org/datasets/catalog/rlu_atari.

A simple snippet to get started:

import matplotlib.pyplot as plt
import tensorflow_datasets as tfds

ds = tfds.load('rlu_atari/Alien_run_1', split='train')
data = next(iter(ds))             # first episode in the split
step = next(iter(data['steps']))  # first step of that episode

plt.imshow(step['observation'])   # reuse the step fetched above
plt.show()
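Since the steps of an episode arrive in order, the (o_t, o_tp1) pairing asked about at the top of the thread can be rebuilt by walking adjacent steps. The sketch below is one hedged way to do that. The helper pair_consecutive_steps is hypothetical, and the step keys 'action', 'reward' and 'discount' are assumed here (only 'observation' appears in the snippet above), so check them against the catalog page before relying on them.

```python
from typing import Any, Dict, Iterable, Iterator, Tuple


def pair_consecutive_steps(
    steps: Iterable[Dict[str, Any]],
) -> Iterator[Tuple[Any, Any, Any, Any, Any]]:
    """Yields (o_t, a_t, r_t, d_t, o_tp1) from one episode's steps, in order."""
    previous = None
    for step in steps:
        if previous is not None:
            # The next observation is simply the following step's observation.
            yield (previous['observation'], previous['action'],
                   previous['reward'], previous['discount'],
                   step['observation'])
        previous = step


# Toy episode standing in for data['steps']; the key names are assumptions.
toy_steps = [
    {'observation': 'o0', 'action': 0, 'reward': 0.0, 'discount': 1.0},
    {'observation': 'o1', 'action': 1, 'reward': 1.0, 'discount': 1.0},
    {'observation': 'o2', 'action': 0, 'reward': 0.0, 'discount': 0.0},
]
transitions = list(pair_consecutive_steps(toy_steps))
print(len(transitions))  # 2
```

With the real dataset, something like pair_consecutive_steps(data['steps'].as_numpy_iterator()) should work in eager mode, though that has not been verified against rlu_atari here. An episode with N steps yields N - 1 transitions, and no pair ever crosses an episode boundary.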

google-research / batch_rl

Sequential Data #12