HumanCompatibleAI / eirli

An Empirical Investigation of Representation Learning for Imitation (EIRLI), NeurIPS'21
https://arxiv.org/pdf/2205.07886.pdf

Imitation Learning Baseline Code #78

Open nileshop22 opened 2 years ago

nileshop22 commented 2 years ago

Hi @qxcv, first of all, thanks a lot for open-sourcing the codebase for your amazing work. The codebase is indeed huge, so I was wondering whether this repository contains code for end-to-end Imitation Learning without any Representation Learning (i.e. w/o Pre-training/Joint Training).

Also, do you have a small dataset that I could use to check that it works on my end at small scale?

I also see that you have provided two datasets. Could you please explain which tasks each one covers?

Thanks!

RPC2 commented 2 years ago

Hi,

Thanks a lot for your interest in our work! For running imitation learning without any pretraining / joint training, here is one example:

CUDA_VISIBLE_DEVICES=0 xvfb-run -a python src/il_representations/scripts/pretrain_n_adapt.py with \
 cfg_repl_none \
 cfg_il_bc_nofreeze \
 tune_run_kwargs.num_samples=5 \
 il_train.bc.n_batches=300000 \
 il_train.bc.batch_size=512 \
 env_cfg.benchmark_name=procgen \
 env_cfg.task_name=coinrun

What matters for your question is cfg_repl_none, which specifies that no representation learning will be used. You can also optionally configure the number of experiments you want to run (num_samples), the imitation-specific parameters (n_batches, batch_size), and the environment configs (benchmark_name, task_name).
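For a quick smoke test, the same command can be assembled with smaller settings. The following is only a minimal sketch: the helper `build_il_command` is hypothetical (it is not part of the repository), and the reduced values for num_samples, n_batches, and batch_size are illustrative assumptions rather than recommended settings. It only prints the command and does not launch training.

```python
import shlex


def build_il_command(named_configs, overrides,
                     script="src/il_representations/scripts/pretrain_n_adapt.py"):
    """Assemble `python <script> with <named configs> <key=value overrides>`.

    Hypothetical helper for illustration; it mirrors the shape of the
    command shown above and nothing more.
    """
    args = ["python", script, "with"]
    args.extend(named_configs)
    args.extend(f"{key}={value}" for key, value in overrides.items())
    return args


small_run = build_il_command(
    named_configs=["cfg_repl_none", "cfg_il_bc_nofreeze"],
    overrides={
        # The three values below are assumed small settings for a smoke test.
        "tune_run_kwargs.num_samples": 1,
        "il_train.bc.n_batches": 1000,
        "il_train.bc.batch_size": 32,
        "env_cfg.benchmark_name": "procgen",
        "env_cfg.task_name": "coinrun",
    },
)

# Print a copy-pasteable command line; prefix it with
# `CUDA_VISIBLE_DEVICES=0 xvfb-run -a` as in the example above if needed.
print(shlex.join(small_run))
```

Keeping the overrides in a dictionary makes it easy to swap the benchmark and task names without retyping the rest of the command.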

Regarding the small dataset, you can find some under eirli/tests/data/processed/demos/.

@qxcv Would you mind checking on the task content within those two datasets?

Hope this helps!

qxcv commented 2 years ago

The task names should be in the directory names. This is what the directory looks like for me:

tests/data/processed/demos/
├── atari
│   └── PongNoFrameskip-v4
│       └── demos.tgz
├── dm_control
│   └── reacher-easy
│       └── demos.tgz
├── magical
│   └── MoveToRegion-Demo-v0
│       └── demos.tgz
├── minecraft
│   └── NavigateVectorObf
│       └── demos.tgz
└── procgen
    └── coinrun
        └── demos.tgz

(task names are PongNoFrameskip-v4, reacher-easy, MoveToRegion-Demo-v0, etc.)
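The layout above (benchmark directory, then task directory, then demos.tgz) can also be read programmatically. This is a minimal sketch that assumes the directory structure shown in the listing; the function name `list_demo_tasks` is hypothetical and not part of the repository.

```python
from pathlib import Path


def list_demo_tasks(demos_root):
    """Map each benchmark name to the task names that have a demos.tgz.

    Assumes the layout <demos_root>/<benchmark>/<task>/demos.tgz.
    """
    tasks = {}
    for archive in sorted(Path(demos_root).glob("*/*/demos.tgz")):
        benchmark = archive.parent.parent.name
        task = archive.parent.name
        tasks.setdefault(benchmark, []).append(task)
    return tasks


if __name__ == "__main__":
    # Path taken from the listing above; adjust it to your checkout.
    for benchmark, task_names in list_demo_tasks("tests/data/processed/demos").items():
        print(benchmark, task_names)
```

A benchmark that is missing from the result, or that has an empty task directory, has no demonstration archive at the expected location.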

bchen0 commented 2 years ago

Hi @qxcv, I had a few questions on the data I was hoping you could answer.

First, under data/processed/demos I see only two folders (dm_control, magical) and a 1 KB file for procgen. Do you know why this might be the case? I don't seem to have Minecraft, Atari, or Procgen, although I mostly care about Procgen.

And also, could you explain what the difference between data/processed/demos and data/processed/random is?

Thanks!

qxcv commented 2 years ago

Hi @bchen0, where did you download the data from? I might have accidentally made the archiver skip symlinks, or something like that, in which case I should re-upload it!

The /demos folder is for expert (either human or RL) demonstrations. The /random folder is for random rollouts. We use a saved random rollouts file instead of generating new random rollouts each time, which makes the training process a little less stochastic.

bchen0 commented 2 years ago

I downloaded it from here: https://berkeley.app.box.com/s/8yo3yyyh0h2e1ay5iehbnyg4g0cm0lpe.

Thanks for the explanation on /demos and /random - makes sense!

qxcv commented 2 years ago

Okay, I checked those files and it looks like I accidentally uploaded a symlink instead of the real procgen demos. I'll make a separate archive for the missing procgen files tomorrow and upload that as well (it's late evening for me now).
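A downloaded data directory can be checked for this kind of problem before training. The sketch below is a heuristic under stated assumptions: the function name `find_suspect_entries` and the 4096-byte threshold are hypothetical choices, and a legitimately small file would also be flagged. It reports symlinks (including broken ones) and regular files smaller than the threshold.

```python
import os
from pathlib import Path


def find_suspect_entries(data_root, min_bytes=4096):
    """Return (path, reason) pairs for symlinks and very small files.

    min_bytes is an assumed threshold; real demonstration archives are
    expected to be much larger than a stub or a stored symlink.
    """
    suspects = []
    # os.walk does not follow symlinked directories by default, so a
    # symlink shows up as an entry here instead of being descended into.
    for dirpath, dirnames, filenames in os.walk(data_root):
        for name in dirnames + filenames:
            path = Path(dirpath) / name
            if path.is_symlink():
                status = "symlink" if path.exists() else "broken symlink"
                suspects.append((str(path), f"{status} -> {os.readlink(path)}"))
            elif path.is_file() and path.stat().st_size < min_bytes:
                suspects.append((str(path), f"only {path.stat().st_size} bytes"))
    return suspects


if __name__ == "__main__":
    # Directory name taken from the thread; adjust it to your download.
    for path, reason in find_suspect_entries("data/processed/demos"):
        print(f"{path}: {reason}")
```

An entry reported as a broken symlink, or a benchmark path that turns out to be a tiny file, suggests the archive does not contain the real data for that benchmark.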

qxcv commented 2 years ago

I uploaded the missing Procgen demos to Box: https://berkeley.app.box.com/s/8yo3yyyh0h2e1ay5iehbnyg4g0cm0lpe

bchen0 commented 2 years ago

Thanks - appreciate the quick response!