open-mmlab / mmaction2

OpenMMLab's Next Generation Video Understanding Toolbox and Benchmark
https://mmaction2.readthedocs.io
Apache License 2.0

[Bug] How can I use clip_feature_extraction.py to take rawframes as input and output the RGB features of each frame? #2777

Open Chenhongchang opened 6 months ago

Chenhongchang commented 6 months ago

Branch

main branch (1.x version, such as v1.0.0, or dev-1.x branch)

Prerequisite

Environment

sys.platform: linux
Python: 3.7.7 (default, May 7 2020, 21:25:33) [GCC 7.3.0]
CUDA available: True
numpy_random_seed: 2147483648
GPU 0: NVIDIA GeForce RTX 3090
CUDA_HOME: /usr/local/cuda
NVCC: Cuda compilation tools, release 11.4, V11.4.152
GCC: gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
PyTorch: 1.7.1+cu110
PyTorch compiling details: PyTorch built with:

TorchVision: 0.8.2+cu110
OpenCV: 4.8.1
MMEngine: 0.10.1
MMAction2: 1.2.0+4d6c934
MMCV: 2.1.0
MMDetection: 3.2.0
MMPose: 1.2.0

Describe the bug

I followed the documentation and chose Option 2 to build my custom dataset. To learn how feature extraction works, I built a dataset of 135 RGB frames extracted from a single THUMOS14 video. At step 6, I used tools/misc/clip_feature_extraction.py to extract features from my rawframes dataset. However, it produces a .pkl file of shape 2048 (feature dimension) x 1 (number of features) rather than 2048 (feature dimension) x 135 (number of features, equal to the number of frames). How can I use clip_feature_extraction.py to take rawframes as input and output the RGB features of each frame?
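To make the two shapes concrete, here is a minimal NumPy sketch. It does not call MMAction2 at all: the random array stands in for hypothetical per-frame features, and the averaging step is only an assumption about how 135 per-frame vectors could end up as a single 2048-dimensional vector, not a confirmed description of what the script does.

```python
import numpy as np

rng = np.random.default_rng(0)
num_frames, feat_dim = 135, 2048

# Hypothetical per-frame features: one 2048-d vector for each of the 135 frames.
per_frame = rng.standard_normal((num_frames, feat_dim)).astype(np.float32)

# Assumption for illustration: averaging over the frame axis collapses
# all frames into a single video-level vector.
pooled = per_frame.mean(axis=0, keepdims=True)

print(per_frame.T.shape)  # (2048, 135): the shape I expected
print(pooled.T.shape)     # (2048, 1): the shape I actually got
```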

Reproduces the problem - code sample

No response

Reproduces the problem - command or script

This is the command I executed:

python /home/bit/mmaction2/tools/misc/clip_feature_extraction.py \
    /home/bit/mmaction2/configs/recognition/tsn/testa.py \
    /home/bit/mmaction2/tsn_r50_320p_1x1x8_50e_activitynet_clip_rgb_20210301-c0f04a7e.pth \
    /home/bit/mmaction2/data/rgb_feat \
    --video-list /home/bit/mmaction2/data/test/test.txt \
    --video-root /home/bit/mmaction2/data/test/rawframes

The contents of test.txt are:

/home/bit/mmaction2/data/test/rawframes/video_test_0000004 135
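For reference, the line above can be read as a frame directory followed by a frame count. The sketch below parses that two-field layout; `FrameEntry` and `parse_rawframe_list` are hypothetical helper names used only for illustration, and the parser assumes whitespace-separated fields with no spaces inside the path.

```python
from typing import List, NamedTuple


class FrameEntry(NamedTuple):
    frame_dir: str      # directory holding the extracted frames
    total_frames: int   # number of frames in that directory


def parse_rawframe_list(text: str) -> List[FrameEntry]:
    """Parse lines of the form '<frame_dir> <total_frames>'."""
    entries = []
    for line in text.splitlines():
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        entries.append(FrameEntry(parts[0], int(parts[1])))
    return entries


sample = "/home/bit/mmaction2/data/test/rawframes/video_test_0000004 135\n"
entries = parse_rawframe_list(sample)
print(entries[0].frame_dir, entries[0].total_frames)
```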

Reproduces the problem - error message

There is no error message, but the output is not what I expected: the script produces a .pkl file of shape 2048 (feature dimension) x 1 (number of features) rather than 2048 (feature dimension) x 135 (number of features, equal to the number of frames). How can I use clip_feature_extraction.py to take rawframes as input and output the RGB features of each frame?
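The shape of the saved features can be checked with a short script along these lines. This is a sketch under assumptions: `describe_pickle` is a hypothetical helper, the file name is made up, and the demo writes a stand-in array to a temporary directory instead of reading the real output, whose internal layout (array, list, or dict) may differ.

```python
import pickle
import tempfile
from pathlib import Path

import numpy as np


def describe_pickle(path):
    """Load a pickle file and return the stored object's type name and array shape."""
    with open(path, "rb") as f:
        obj = pickle.load(f)
    return type(obj).__name__, np.asarray(obj).shape


# Demo with a stand-in file; replace demo_path with the real .pkl to inspect it.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "video_test_0000004.pkl"  # hypothetical file name
    with open(demo_path, "wb") as f:
        pickle.dump(np.zeros((1, 2048), dtype=np.float32), f)
    kind, shape = describe_pickle(demo_path)

print(kind, shape)
```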

Additional information

No response