[RL Unplugged] Is the data stored chronologically? Or how to retrieve sequential data?

google-deepmind / deepmind-research

This repository contains implementations and illustrative code to accompany DeepMind publications

Apache License 2.0

Hi,

Thanks for your nice work. I want to train an RNN agent on sequential data, and I tested this with dm_control_suite. The paper says: "For sequence data, we also provide future states, actions, and rewards, which allows for training recurrent models for tasks requiring memory." However, I can only get data in the (s, a, r, s', a') format, even when I set sarsa=False when calling the dataset() function, as you do in dm_control_suite_crr.py, which also uses an RNN agent. I also printed the data that is fed into the CRRLearner: it is in SARSA format as well, with shape [batch_size, 2, state_size]. Finally, I tried the data loader without calling shuffle(), and the retrieved data is still shuffled.

So I want to know: 1. Is the data stored chronologically? 2. Can we get sequential data to train an RNN agent?

Thanks a lot!
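One way to test the first question empirically is to check whether adjacent records chain together, i.e. whether the next observation of record i equals the observation of record i+1. Below is a minimal sketch of that check on toy data. The field names `obs` and `next_obs` and the helper `chained_fraction` are hypothetical, not part of the RL Unplugged API, and with real float arrays an element-wise comparison such as `numpy.array_equal` would be needed instead of `==`.

```python
def chained_fraction(transitions):
    """Fraction of adjacent record pairs where record i's next observation
    equals record i+1's observation.

    `transitions` is a list of dicts with hypothetical keys "obs" and
    "next_obs". A value near 1.0 suggests chronological storage; a value
    near 0.0 suggests the records are shuffled.
    """
    if len(transitions) < 2:
        return 0.0
    hits = sum(
        1
        for prev, nxt in zip(transitions, transitions[1:])
        if prev["next_obs"] == nxt["obs"]
    )
    return hits / (len(transitions) - 1)


# Toy episode: the observation at step t is [t], so next_obs is [t + 1].
ordered = [{"obs": [t], "next_obs": [t + 1]} for t in range(5)]
# The same records in a scrambled order.
shuffled = [ordered[i] for i in (2, 0, 3, 1, 4)]

print(chained_fraction(ordered))   # 1.0
print(chained_fraction(shuffled))  # 0.0
```

Episode boundaries also break the chain, so even a chronologically stored dataset would score slightly below 1.0.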

RL Unplugged does not have sequential data.

https://github.com/google-research/batch_rl/issues/12

Also, I'm not sure it even has terminal states. If so, they don't appear to be randomly distributed.

I would recommend looking at the batch_rl repo (https://github.com/google-research/batch_rl/); I think that one has trajectory info.
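If a dataset does keep each episode in chronological order, fixed-length sequences for an RNN can be cut from it with a sliding window. The sketch below shows the idea on a toy episode. The step layout (`t`, `obs`, `action`, `reward`) and the helper `sequence_windows` are assumptions made for illustration and do not describe the actual storage format of batch_rl or RL Unplugged.

```python
def sequence_windows(episode, length, stride=1):
    """Yield runs of `length` consecutive steps from one episode.

    `episode` must already be in chronological order. Windows never cross
    the end of the episode, so a trailing partial window is dropped.
    """
    if length <= 0 or stride <= 0:
        raise ValueError("length and stride must be positive")
    for start in range(0, len(episode) - length + 1, stride):
        yield episode[start:start + length]


# Toy episode of 6 steps; the field names are hypothetical.
episode = [
    {"t": t, "obs": [float(t)], "action": t % 2, "reward": 0.0}
    for t in range(6)
]

windows = list(sequence_windows(episode, length=4, stride=2))
for window in windows:
    print([step["t"] for step in window])
# [0, 1, 2, 3]
# [2, 3, 4, 5]
```

Windowing per episode, rather than over the concatenated dataset, keeps a sequence from mixing steps of two different trajectories.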

Personally, I'd prefer a benchmark built on a raw dataset of sequential observations, without pre-processing, frame stacking, etc. Data augmentation turned out to be incredibly important in supervised learning, and I don't see why high-dimensional batch RL would be any different. Also, model-based RL usually exploits causality, so it won't be possible to compare model-based and model-free RL on RL Unplugged.

[RL Unplugged] Is the data stored chronologically? Or how to retrieve sequential data? #187