epic-kitchens / epic-kitchens-55-lib

:coffee: EPIC-KITCHENS-55 dataset python library
https://epic-kitchens.readthedocs.io/en/latest/index.html

Example code to gulp dataset with manual downloading #57

Closed tqvinhcs closed 5 years ago

tqvinhcs commented 5 years ago

Is there any example code for gulping the dataset from manually downloaded files? For example, I have downloaded all the tar files for RGB and flow, and I have extracted them into frames, but I don't know how to gulp them.

willprice commented 5 years ago

Hi, There are examples of how to use the CLI tools here: https://epic-kitchens.readthedocs.io/en/latest/cli_tools.html#gulp-data-ingestor

willprice commented 5 years ago

You could also take a look at the action-recognition-starter-kit which has a Snakemake file that will download, split and gulp the frames for you.

tqvinhcs commented 5 years ago

Hi, thank you so much for your reply. I know this code exists, but the problem is that there is no guide for the input/output format. For example:

$ python -m epic_kitchens.preprocessing.split_segments \
    P03 \
    path/to/frames \
    path/to/frame-segments \
    path/to/labels.pkl \
    RGB \
    --fps 60 \
    --frame-format 'frame_%010d.jpg' \
    --of-stride 2 \
    --of-dilation 3

1) Will this code split the segments for the whole dataset, or only for the videos in the P03 subdirectory?
2) Depending on (1), what is the appropriate path to use for path/to/frames?
3) path/to/frame-segments: is this the output directory?
4) path/to/labels.pkl: which file should I put here? Is it "EPIC_train_action_labels.pkl"?
5) Can we gulp the test sets of the seen and unseen kitchens ('EPIC_test_s1_timestamps.pkl', 'EPIC_test_s2_timestamps.pkl')?

It seems that Snakemake will download and process one video at a time. However, the download speed is pretty slow, around 700 kbps per tar file, which is why I downloaded the dataset manually.

I am sorry if I have misread something in the instructions.

willprice commented 5 years ago

Hi @tqvinhcs,

The idea is to run Snakemake with a number of jobs; it also has cluster support. It builds a dependency graph, so it can parallelize as much as the tasks allow (which is a lot: it basically fans out to 432 jobs, one per video, and then fans back in to gulp the data).

To answer your questions:

  1. Will this code split the segments for the whole dataset or only for videos in the P03 subdirectory?

This will split all segments for P03.

  2. Depending on (1), what is the appropriate path to use for path/to/frames?

You can see how split_segments is called in https://github.com/epic-kitchens/action-recognition-starter-kit-private/blob/master/Snakefile#L123. Basically, you want to call it once per participant directory. If you have the layout (for RGB frames):

root_dir
|------- P01
|            |------- frame_0000000001.jpg

Then you want to call it like so:

$ python -m epic_kitchens.preprocessing.split_segments \
    P01 \
    root_dir/P01 \
    output_dir/P01 \
    EPIC_train_action_labels.pkl \
    RGB \
    --fps 60 \
    --frame-format 'frame_%010d.jpg'
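If all of your participant directories live under one root, the per-participant invocations can be scripted. A minimal sketch: the root_dir/output_dir paths, the helper function, and the loop are my assumptions; the CLI arguments mirror the invocation above.

```python
import subprocess
from pathlib import Path

ROOT = Path("root_dir")      # hypothetical: where P01/, P02/, ... frame dirs live
OUT = Path("output_dir")     # hypothetical: where per-segment frame dirs go
LABELS = "EPIC_train_action_labels.pkl"

def build_split_cmd(participant: str) -> list:
    """Assemble the split_segments invocation for one participant (e.g. 'P01')."""
    return [
        "python", "-m", "epic_kitchens.preprocessing.split_segments",
        participant,
        str(ROOT / participant),
        str(OUT / participant),
        LABELS,
        "RGB",
        "--fps", "60",
        "--frame-format", "frame_%010d.jpg",
    ]

if __name__ == "__main__":
    # Run the splitter once per participant directory found under ROOT.
    for participant_dir in sorted(ROOT.glob("P*")):
        subprocess.run(build_split_cmd(participant_dir.name), check=True)
```

This just wraps the same CLI call; Snakemake does the equivalent (plus downloading and gulping) with proper parallelism.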
  3. path/to/frame-segments: is this the output directory?

Correct.

  4. path/to/labels.pkl: which file should I put here? Is it "EPIC_train_action_labels.pkl"?

If you wish to split the training examples, yes; if you want to split the test set, use EPIC_test_s1_timestamps.pkl (or s2 if you're splitting the unseen test set).

  5. Can we gulp the test sets of the seen and unseen kitchens ('EPIC_test_s1_timestamps.pkl', 'EPIC_test_s2_timestamps.pkl')?

Yes. You simply have to split the frames into segments and then gulp them in the same fashion as for the training dataset, replacing the EPIC_train_action_labels.pkl file with one of the EPIC_test_s1/2_timestamps.pkl files.

In addition, you can run

$ python -m epic_kitchens.preprocessing.split_segments --help

to get information on how to use the program (likewise for the gulping program).

willprice commented 5 years ago

I would recommend using the Snakemake file, as it automates the entire process end to end, from downloading frames to splitting them to gulping them. You can invoke Snakemake with a number of jobs: -j <n_jobs>. If you have already downloaded the data, you can use it without re-downloading by placing the tar files where Snakemake would otherwise download them: "data/raw/{modality}/{participant_id}/{train_video_id}.tar", where modality is either rgb or flow, participant_id is of the form P01, and train_video_id is of the form P01_01.
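Relocating already-downloaded archives into that layout can be scripted. A sketch, assuming your manual downloads sit in downloads/rgb and downloads/flow (those source paths and the helper function are hypothetical; the data/raw/... target layout is the one quoted above):

```python
import re
import shutil
from pathlib import Path

DOWNLOADS = Path("downloads")   # hypothetical: where your manual tar files sit
RAW = Path("data/raw")          # layout Snakemake checks before downloading

def target_path(tar_name: str, modality: str) -> Path:
    """Map a tar name like 'P01_01.tar' to data/raw/{modality}/P01/P01_01.tar."""
    match = re.fullmatch(r"(P\d{2})_\d{2,3}\.tar", tar_name)
    if match is None:
        raise ValueError(f"unexpected tar name: {tar_name}")
    return RAW / modality / match.group(1) / tar_name

if __name__ == "__main__":
    # Move every downloaded archive into the path Snakemake expects,
    # so its download rules are satisfied and skipped.
    for modality in ("rgb", "flow"):
        for tar in (DOWNLOADS / modality).glob("*.tar"):
            dest = target_path(tar.name, modality)
            dest.parent.mkdir(parents=True, exist_ok=True)
            shutil.move(str(tar), dest)
```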

tqvinhcs commented 5 years ago

Hi Will, thank you for your reply; it helps a lot. I have already gulped the dataset.

1) I guess this lib will not work directly for the action-anticipation task? Do I have to modify it?
2) Can the EpicVideoDataset class work with the dataloader and transformations from GulpIO, or do we have to load frame segments as in the examples here: https://github.com/epic-kitchens/starter-kit-action-recognition/blob/master/notebooks/2.0-gulp.ipynb

willprice commented 5 years ago

If you've gulped the dataset already, then you're free to use GulpIO's GulpVideoDataset class, or you can use ours. You'll have to do a bit more label manipulation if you use theirs, as we wrap up the example-to-label mapping so you don't have to do much.

You can pass a sample_transform when constructing EpicVideoDataset; this can be anything that operates on a list of PIL Images. If you want to use the transforms from GulpIO, you'll have to convert the frames back to np.ndarrays by calling np.array on each of the PIL Images (note: I don't know how costly this is; I've had a look at some of the internals of PIL and I don't think it involves any data copying, but I'm not certain).
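For instance, a conversion step like the following could be passed as the sample_transform. The function name is mine; only the sample_transform parameter and the np.array conversion come from the discussion above, and the constructor arguments in the commented usage are assumptions, not the library's documented signature.

```python
import numpy as np

def pil_frames_to_ndarrays(frames):
    # EpicVideoDataset yields a list of PIL Images, while GulpIO transforms
    # expect np.ndarrays, so convert each frame before chaining GulpIO transforms.
    return [np.array(frame) for frame in frames]

# Hypothetical usage (argument names/values other than sample_transform are guesses):
# dataset = EpicVideoDataset(gulp_dir, "verb", sample_transform=pil_frames_to_ndarrays)
```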

willprice commented 5 years ago

For action anticipation you're on your own ;) I work in action recognition. Maybe @antoninofurnari might be able to give you a few tips, although he didn't use this library in his code when we wrote the paper.

antoninofurnari commented 5 years ago

As Will said, I'm not familiar with this library. I simply downloaded all the frames and flows, subsampled them to 30fps, and dumped everything into two big LMDB files (50GB) without splitting the videos into segments (splitting is not helpful for anticipation).