fpv-iplab / rulstm

Code for the Paper: Antonino Furnari and Giovanni Maria Farinella. What Would You Expect? Anticipating Egocentric Actions with Rolling-Unrolling LSTMs and Modality Attention. International Conference on Computer Vision, 2019.
http://iplab.dmi.unict.it/rulstm

Availability of Features for the EGTEA Gaze+ Dataset #4

Closed: hellocv closed this issue 4 years ago

hellocv commented 5 years ago

Hello, thanks for your hard work! It really makes reimplementation much more convenient.

I would also like to know whether the pre-extracted features for EGTEA Gaze+ will be made available for download, as those for EPIC-Kitchens are. Thanks a lot.

antoninofurnari commented 5 years ago

Hello, I'm not sure I still have the pre-computed features on EGTEA Gaze+ (have to check), but I think I can share the models. I'll try to address this soon, but this is a very busy time at the moment.

Thanks, Antonino

hellocv commented 4 years ago

Hello Antonino, thank you for your latest update on the feature extraction scripts.

There are still a couple of questions I would like to ask. Did you use the same model that extracts the EPIC-Kitchens features to extract the features on EGTEA Gaze+? In addition, when you converted the EGTEA Gaze+ videos with ffmpeg, was the framerate 30 fps?

Thank you.

antoninofurnari commented 4 years ago

Hello, we used separate models to extract features for EPIC-Kitchens and EGTEA Gaze+. Each model is a TSN trained for action recognition on the corresponding dataset.

In the case of EGTEA Gaze+, we extracted image frames from the videos at 30 fps with the following ffmpeg command: `ffmpeg -i $f -vf "scale=-1:256,fps=30" -qscale:v 2 frames/${f}_frame_%010d.jpg`, where `$f` is the video filename. We extracted optical flow with TVL1.
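
For reference, here is a hedged sketch of that preprocessing in Python: it shells out to ffmpeg for the frame dump and computes TVL1 flow with OpenCV. The directory layout, the flow-image naming, and the [-20, 20] quantization range are assumptions (the latter follows a common TSN-style convention), not necessarily the exact pipeline used for the paper; `cv2.optflow` requires opencv-contrib-python.

```python
# Sketch of the preprocessing described above: RGB frames at 30 fps via
# ffmpeg, then TVL1 optical flow between consecutive frames with OpenCV.
# Paths, naming, and the flow quantization range are assumptions.
import glob
import os
import subprocess

import cv2
import numpy as np

def extract_frames(video_path, out_dir):
    """Dump 256px-high JPEG frames at 30 fps (mirrors the command above)."""
    os.makedirs(out_dir, exist_ok=True)
    name = os.path.splitext(os.path.basename(video_path))[0]
    subprocess.run([
        "ffmpeg", "-i", video_path,
        "-vf", "scale=-1:256,fps=30",
        "-qscale:v", "2",
        os.path.join(out_dir, f"{name}_frame_%010d.jpg"),
    ], check=True)

def extract_tvl1_flow(frame_dir, out_dir, bound=20.0):
    """Compute TVL1 flow and save u/v channels as JPEGs.

    Flow is clipped to [-bound, bound] and mapped to [0, 255], a common
    TSN-style convention (an assumption here, not confirmed by the authors).
    """
    os.makedirs(out_dir, exist_ok=True)
    frames = sorted(glob.glob(os.path.join(frame_dir, "*.jpg")))
    tvl1 = cv2.optflow.createOptFlow_DualTVL1()
    prev = cv2.imread(frames[0], cv2.IMREAD_GRAYSCALE)
    for i, path in enumerate(frames[1:], start=1):
        nxt = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
        flow = tvl1.calc(prev, nxt, None)  # HxWx2 float32 (u, v)
        flow = np.clip(flow, -bound, bound)
        flow = ((flow + bound) / (2 * bound) * 255).astype(np.uint8)
        cv2.imwrite(os.path.join(out_dir, f"flow_u_{i:010d}.jpg"), flow[..., 0])
        cv2.imwrite(os.path.join(out_dir, f"flow_v_{i:010d}.jpg"), flow[..., 1])
        prev = nxt
```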

We then trained a TSN for egocentric action recognition on EGTEA Gaze+ to recognize its 106 action classes. After that, we extracted the features (you can refer to our examples).
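
As an illustration of that feature-extraction step (not the authors' exact code), the sketch below runs a 2D CNN backbone over the dumped RGB frames and saves one feature vector per frame. The torchvision ResNet-50 is a stand-in for the trained TSN, the classifier head is dropped to expose penultimate-layer features, and all paths are hypothetical.

```python
# Minimal per-frame feature extraction with a trained 2D CNN backbone.
# The ResNet-50 is only a stand-in for the TSN trained on EGTEA Gaze+;
# load your own trained weights in practice.
import glob
import os

import numpy as np
import torch
import torchvision.transforms as T
from PIL import Image
from torchvision.models import resnet50

transform = T.Compose([
    T.Resize(256), T.CenterCrop(224), T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

model = resnet50(weights=None)   # replace with your trained TSN weights
model.fc = torch.nn.Identity()   # keep 2048-d features, drop the classifier
model.eval()

@torch.no_grad()
def video_features(frame_dir):
    feats = []
    for path in sorted(glob.glob(os.path.join(frame_dir, "*.jpg"))):
        x = transform(Image.open(path).convert("RGB")).unsqueeze(0)
        feats.append(model(x).squeeze(0).numpy())
    return np.stack(feats)  # (num_frames, 2048)

# Hypothetical directory produced by the frame-extraction step above.
np.save("features_rgb.npy", video_features("frames/video1"))
```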

For object-based features, we used the Faster R-CNN trained on EPIC-Kitchens. However, we noted only a minor improvement in performance, as the detected objects are not very well aligned with the action labels.
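
A hedged sketch of how detections can be turned into a fixed-length object-based feature: run a Faster R-CNN on a frame and aggregate per-class confidences into one vector. The COCO-pretrained torchvision detector stands in for the EPIC-Kitchens-trained model mentioned above, and max-pooling scores per class is one reasonable aggregation, not necessarily the one used in the paper.

```python
# Build a fixed-length object feature from detector output: one slot per
# object class, holding the maximum detection confidence in the frame.
# COCO-pretrained weights are a stand-in for the EPIC-Kitchens detector.
import torch
from torchvision.io import read_image
from torchvision.models.detection import fasterrcnn_resnet50_fpn

model = fasterrcnn_resnet50_fpn(weights="DEFAULT").eval()
NUM_CLASSES = 91  # COCO label space here; an EPIC detector would differ

@torch.no_grad()
def object_feature(image_path):
    img = read_image(image_path).float() / 255.0
    det = model([img])[0]  # dict with 'boxes', 'labels', 'scores'
    feat = torch.zeros(NUM_CLASSES)
    for label, score in zip(det["labels"], det["scores"]):
        feat[label] = torch.maximum(feat[label], score)
    return feat  # one fixed-length vector per frame
```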

Unfortunately, for the moment I am not able to share the features or the models used for EGTEA Gaze+, as they are buried in a backup I cannot access right now. However, the training/feature extraction process should be straightforward.

Antonino

hellocv commented 4 years ago

Hello Antonino,

Thank you for your answer. Your help is greatly appreciated.

antoninofurnari commented 4 years ago

Hello, it took me some time, but I managed to recover the features extracted from EGTEA Gaze+. You can find them here: https://iplab.dmi.unict.it/sharing/rulstm/features/egtea.zip.

Please also see the README for more information about these features.
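
For convenience, a minimal download-and-unpack snippet (an assumed workflow, not an official script):

```python
# Download the EGTEA Gaze+ feature archive linked above and extract it.
# The destination directory is an arbitrary choice.
import urllib.request
import zipfile

url = "https://iplab.dmi.unict.it/sharing/rulstm/features/egtea.zip"
urllib.request.urlretrieve(url, "egtea.zip")
with zipfile.ZipFile("egtea.zip") as zf:
    zf.extractall("features/egtea")
    print("extracted", len(zf.namelist()), "files")
```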