gulvarol / bsl1k

BSL-1K: Scaling up co-articulated sign language recognition using mouthing cues, ECCV 2020
https://www.robots.ox.ac.uk/~vgg/research/bsl1k/

Problem facing while extracting i3d feature for each individual video in Phoenix14T dataset #15

Open rabeya-akter opened 11 months ago

rabeya-akter commented 11 months ago

I am trying to extract I3D features for the Phoenix14T dataset. I have downloaded the dataset and ran this script to build the videos: python misc/phoenix2014/gather_frames.py

Then, to extract I3D features for the videos, I ran this command: python main.py --featurize_mode 1 --num-classes 1233 --include_embds 1 --datasetname phoenix2014 --checkpoint checkpoint/phoenix2014t_i3d_pbsl1k --pretrained checkpoint/phoenix2014/T_c1233_ctc_blank/checkpoint_006.pth.tar --test_set test --evaluate_video 1 --word_data_pkl misc/bsl1k/bsl1k_vocab.pkl --phoenix_path data/PHOENIX-2014-T-release-v3/PHOENIX-2014-T --num_in_frames 16 --feature_dim 1024 --inp_res 224 --resize_res 256 --workers 0 --save_features 1

As instructed, --include_embds 1 is set so that the features are saved. A file named features.mat is created at: bsl1k-master/checkpoint/phoenix2014/T_c1233_ctc_blank/test_006/features.mat

The features.mat file has these keys: dict_keys(['__header__', '__version__', '__globals__', 'preds', 'clip_gt', 'clip_ix', 'video_names'])

When I checked preds, its shape was: features['preds'].shape = (7723, 1024)

However, I want the I3D features for each individual video in the train, test, and val sets, with shape (number of frames, 1024). What should I change? Please guide me.
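
One possible starting point is a minimal sketch that regroups the rows of preds per video. It assumes, without confirmation from the repository, that clip_ix[i] is the index of the video that row i of preds was extracted from. The helper names group_clips_by_video and load_and_group are hypothetical and not part of bsl1k. Since each row appears to correspond to a 16-frame clip rather than a single frame, the result would have shape (number of clips, 1024) per video, not (number of frames, 1024).

```python
from collections import OrderedDict


def group_clips_by_video(preds, clip_ix):
    """Group clip-level feature rows by video index.

    ASSUMPTION: clip_ix[i] identifies the video that preds[i] belongs to.
    This is a guess based on the key names, not confirmed behaviour.
    """
    if len(preds) != len(clip_ix):
        raise ValueError("preds and clip_ix must have the same length")
    grouped = OrderedDict()
    for row, vid in zip(preds, clip_ix):
        # Keep rows in their original (temporal) order within each video.
        grouped.setdefault(int(vid), []).append(row)
    return grouped


def load_and_group(mat_path):
    """Hypothetical helper: load features.mat and return one array per video."""
    # Imported lazily so the grouping helper above has no dependencies.
    import numpy as np
    from scipy.io import loadmat

    mat = loadmat(mat_path)
    preds = mat["preds"]              # (7723, 1024) in the case reported above
    clip_ix = mat["clip_ix"].ravel()  # loadmat returns 1-D data as a 2-D array
    grouped = group_clips_by_video(preds, clip_ix)
    # Each value has shape (number of clips in that video, 1024).
    return {vid: np.stack(rows) for vid, rows in grouped.items()}
```

Calling load_and_group on the features.mat path above would give a dictionary keyed by video index. Whether those indices line up with the entries of video_names would still need to be checked against the repository code.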

Safaeid48 commented 11 months ago

I am facing the same issue.