google-deepmind / deepmind-research

This repository contains implementations and illustrative code to accompany DeepMind publications
Apache License 2.0

[RL Unplugged] Is the data stored chronologically? Or how to retrieve sequential data? #187

Open zhaoyi11 opened 3 years ago

zhaoyi11 commented 3 years ago

Hi,

Thanks for your nice work. I want to train an RNN agent on sequential data, and I tested this with dm_control_suite. The paper mentions: "For sequence data, we also provide future states, actions, and rewards, which allows for training recurrent models for tasks requiring memory." However, I can only get data in the format (s, a, r, s', a'), even when I set sarsa=False when calling the dataset() function, as you do in dm_control_suite_crr.py, which also uses an RNN agent. I also printed the data fed into the CRRLearner: it is in SARSA format too, with shape [batch_size, 2, state_size]. Finally, I tried the data loader without calling shuffle(), and the retrieved data is still shuffled.
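To make the shape concrete: a batch of shape [batch_size, 2, state_size] holds one (s, s') pair per sample, so there is no longer history to feed an RNN. Below is a minimal sketch in plain Python with made-up values. The names `batch` and `split_transition_pairs` are hypothetical and are not part of the RL Unplugged loader.

```python
from typing import List, Tuple

State = List[float]


def split_transition_pairs(
    batch: List[List[State]],
) -> Tuple[List[State], List[State]]:
    """Split a [batch_size, 2, state_size] nested list into (s, s')."""
    states = [pair[0] for pair in batch]       # index 0 along the length-2 axis
    next_states = [pair[1] for pair in batch]  # index 1 along the length-2 axis
    return states, next_states


# Toy batch with batch_size=2 and state_size=3 (values are made up).
batch = [
    [[0.0, 0.1, 0.2], [0.3, 0.4, 0.5]],
    [[1.0, 1.1, 1.2], [1.3, 1.4, 1.5]],
]
states, next_states = split_transition_pairs(batch)
```

Each sample therefore carries only two consecutive time steps, which is why I am asking about longer sequences.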

So I want to know: 1. Is the data stored chronologically? 2. Can we get sequential data to train an RNN agent?

Thanks a lot!

DuaneNielsen commented 3 years ago

RL Unplugged does not have sequential data.

https://github.com/google-research/batch_rl/issues/12

Also, I'm not sure it even has terminal states. If so, they don't appear to be randomly distributed.

I would recommend looking at the batch_rl repo (https://github.com/google-research/batch_rl/); I think that one has trajectory info.

Personally, I'd prefer a raw dataset of sequential observations without pre-processing, frame stacking, etc. for a benchmark. Data augmentation turned out to be incredibly important in supervised learning, and I don't see why high-dimensional batch RL would be any different. Also, model-based RL usually exploits causality, so it won't be possible to compare model-based and model-free RL on RL Unplugged.
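To illustrate what a raw sequential dataset would allow: if each episode were available as a chronologically ordered list of steps, fixed-length training sequences for an RNN could be cut with a sliding window. This is only a sketch under that assumption. `sliding_windows` and the toy `episode` are hypothetical and do not correspond to any RL Unplugged or batch_rl API.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def sliding_windows(
    episode: Sequence[T], length: int, stride: int = 1
) -> List[List[T]]:
    """Cut one chronologically ordered episode into fixed-length windows."""
    if length <= 0 or stride <= 0:
        raise ValueError("length and stride must be positive")
    # Episodes shorter than `length` yield no windows.
    return [
        list(episode[start:start + length])
        for start in range(0, len(episode) - length + 1, stride)
    ]


# Hypothetical episode of (observation, action, reward) steps in time order.
episode = [(f"o{t}", f"a{t}", float(t)) for t in range(5)]
windows = sliding_windows(episode, length=3, stride=1)
```

Windows must be cut within a single episode, so this only works when episode boundaries (terminal states) are stored alongside the steps.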