amazon-science / long-short-term-transformer

[NeurIPS 2021 Spotlight] Official implementation of Long Short-Term Transformer for Online Action Detection
Apache License 2.0

About the activitynet features. #14

Closed sqiangcao99 closed 2 years ago

sqiangcao99 commented 2 years ago

Hi,

[screenshot of the configuration files]

For ActivityNet pre-trained model, which two configuration files are used for rgb and flow?

xumingze0308 commented 2 years ago

We used the "clip_rgb" ones. You can also directly use the feature from TeSTra.

sqiangcao99 commented 2 years ago

@xumingze0308 Hi, thanks for your help. Currently, I have some problems reproducing the results (6% lower) on TVSeries using the ActivityNet-pretrained features. Specifically:

  1. I extract the RGB frames at a short-side resolution of 320 and the flows at a resolution of 340×256;
  2. The RGB frames and flow frames are preprocessed according to the mmaction2 config files, which are:

    # RGB
    data_pipeline = [
        dict(type='RawFrameDecode'), 
        dict(type='CenterCrop', crop_size=256),
        dict(type='Normalize', **args.img_norm_cfg),
        dict(type='FormatShape', input_format='NCHW'),
        dict(type='Collect', keys=['imgs'], meta_keys=[]),
        dict(type='ToTensor', keys=['imgs']), 
    ]
    
    # Flow
    data_pipeline = [
        dict(type='RawFrameDecode'),
        dict(type='Resize', scale=(-1, 256)),
        dict(type='TenCrop', crop_size=224),
        dict(type='Normalize', **args.img_norm_cfg),
        dict(type='FormatShape', input_format='NCHW_Flow'),
        dict(type='Collect', keys=['imgs'], meta_keys=[]),
        dict(type='ToTensor', keys=['imgs'])
    ]
  3. The extraction code is modified from https://github.com/open-mmlab/mmaction2/blob/master/tools/data/activitynet/tsn_feature_extraction.py.

Is the whole process correct?

xumingze0308 commented 2 years ago

For both RGB and flow, we don't apply CenterCrop; we directly use the frames at their default resolution.
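Based on that reply, a minimal sketch of adjusted pipelines with the crop steps removed might look like the following. This is an assumption, not the authors' exact config: the `img_norm_cfg` values are illustrative placeholders, and whether the flow `Resize` step should also be dropped isn't stated in the reply, so the sketch keeps it as in the original config.

```python
# Sketch: feature-extraction pipelines without CenterCrop/TenCrop,
# following the maintainer's reply. Normalization values below are
# illustrative placeholders, not the values used by the authors.
img_norm_cfg = dict(
    mean=[123.675, 116.28, 103.53],
    std=[58.395, 57.12, 57.375],
    to_bgr=False,
)

# RGB: no crop; frames are used at their decoded (default) resolution.
rgb_pipeline = [
    dict(type='RawFrameDecode'),
    dict(type='Normalize', **img_norm_cfg),
    dict(type='FormatShape', input_format='NCHW'),
    dict(type='Collect', keys=['imgs'], meta_keys=[]),
    dict(type='ToTensor', keys=['imgs']),
]

# Flow: TenCrop dropped; Resize to short side 256 kept (an assumption —
# the reply only rules out cropping).
flow_pipeline = [
    dict(type='RawFrameDecode'),
    dict(type='Resize', scale=(-1, 256)),
    dict(type='Normalize', **img_norm_cfg),
    dict(type='FormatShape', input_format='NCHW_Flow'),
    dict(type='Collect', keys=['imgs'], meta_keys=[]),
    dict(type='ToTensor', keys=['imgs']),
]

# Sanity check: neither pipeline contains a crop step.
assert not any('Crop' in step['type'] for step in rgb_pipeline + flow_pipeline)
```

If the 6% gap on TVSeries comes from the cropping, removing these steps and re-extracting the features should close most of it.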