fpv-iplab / rulstm

Code for the Paper: Antonino Furnari and Giovanni Maria Farinella. What Would You Expect? Anticipating Egocentric Actions with Rolling-Unrolling LSTMs and Modality Attention. International Conference on Computer Vision, 2019.
http://iplab.dmi.unict.it/rulstm

Difference between data and data_full #7

Closed: yxgz closed this issue 4 years ago

yxgz commented 4 years ago

Hi, thank you for providing pre-extracted features. I’m confused about the difference between data and data_full. As I understand it, the features in data_full are extracted from every frame (at 30fps), while the features in data are extracted from frames sampled at 4fps. Is that right? And how did you sample frames at 4fps? Was the sampling performed after all the videos were converted to 30fps? Could you please provide more details? Thank you!

antoninofurnari commented 4 years ago

Hello, thank you for your interest in our work!

All features have been extracted at 30fps. To do so, we first converted all videos to this fixed framerate using the following command:

ffmpeg -i input.mp4 -c:v libx264 -crf 22 -r 30 -vsync cfr -an output.mp4
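For converting a whole folder of videos, the command above could be wrapped in a small script. This is only a sketch: the helper names `build_ffmpeg_cmd` and `convert_all` and the directory layout are assumptions made for illustration, not part of the released code. The flags are exactly the ones in the command above.

```python
import subprocess
from pathlib import Path


def build_ffmpeg_cmd(src, dst, fps=30, crf=22):
    """Hypothetical helper: build the conversion command from the thread as an argument list."""
    return [
        "ffmpeg",
        "-i", str(src),      # input video
        "-c:v", "libx264",   # re-encode the video stream with x264
        "-crf", str(crf),    # quality setting used in the thread
        "-r", str(fps),      # target frame rate
        "-vsync", "cfr",     # constant frame rate output
        "-an",               # drop the audio stream
        str(dst),
    ]


def convert_all(in_dir, out_dir):
    """Hypothetical helper: convert every .mp4 in in_dir (assumed layout) to 30fps."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    for src in sorted(Path(in_dir).glob("*.mp4")):
        # check=True raises if ffmpeg exits with an error
        subprocess.run(build_ffmpeg_cmd(src, out / src.name), check=True)


# Print the command for a single file without running it
print(" ".join(build_ffmpeg_cmd("input.mp4", "output.mp4")))
```

Calling `convert_all("videos", "videos_30fps")` would then run the conversion for each file, assuming ffmpeg is installed and on the PATH.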

We extracted features from all frames of the converted videos and stored them into data_full.

To obtain data, we just discarded all frames that were not sampled by our method during training, validation or testing. In practice, we sampled 16 frames at 4fps (one frame every 0.25 seconds) before the beginning of each action. Please note that this does not correspond to sampling the whole video at a fixed framerate of 4fps, because the sampled frames are aligned to the starting timestamp of each action.
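To make the alignment concrete, here is a rough Python sketch of this kind of sampling. It is not the code from the repository: the function names, the 1-based frame indexing, the rounding rule, the clamping at the start of the video, and the choice to place the last sample one step before the action start are all assumptions. Only the 30fps, 4fps and 16-frame values come from the description above.

```python
import math

FPS = 30          # frame rate of the converted videos (stated in the thread)
STEP_S = 0.25     # 4fps sampling, one sample every 0.25 s (stated in the thread)
NUM_SAMPLES = 16  # frames sampled before each action (stated in the thread)


def sample_frame_indices(action_start_s, fps=FPS, step_s=STEP_S, num_samples=NUM_SAMPLES):
    """Hypothetical helper: 1-based frame indices sampled before an action start."""
    # Assumption: the last sample sits one step before the action start.
    times = [action_start_s - k * step_s for k in range(num_samples, 0, -1)]
    # Assumption: floor each timestamp to a frame, use 1-based indices,
    # and clamp samples that would fall before the video start to frame 1.
    return [max(1, math.floor(t * fps) + 1) for t in times]


def frames_to_keep(action_starts_s):
    """Hypothetical helper: union of sampled frames over all actions (a reduced set like data)."""
    keep = set()
    for start in action_starts_s:
        keep.update(sample_frame_indices(start))
    return sorted(keep)


# Two actions starting 0.1 s apart are sampled on different frame grids
a = sample_frame_indices(10.0)
b = sample_frame_indices(10.1)
print(a[:4])
print(b[:4])
```

Since 0.25 seconds at 30fps is 7.5 frames, the gap between consecutive sampled frames alternates between 7 and 8 in this sketch, and the grids of two actions generally do not coincide. That is why the kept frames are not the same as a uniform 4fps subsampling of the video.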

We provided data to reduce the download size, but I suggest using data_full if you are implementing your own sampling scheme.
