[Bug] Pre-Trained MultiSport Spatio-Temporal Model -> KeyError: 'fps'

Branch

main branch (1.x version, such as v1.0.0, or dev-1.x branch)

Prerequisite

[X] I have searched Issues and Discussions but cannot get the expected help.
[X] I have read the documentation but cannot get the expected help.
[X] The bug has not been fixed in the latest version.

Environment

(openmmlab) D:\Work\projects-external\mmaction2>python ./mmaction/utils/collect_env.py
sys.platform: win32
Python: 3.8.19 (default, Mar 20 2024, 19:55:45) [MSC v.1916 64 bit (AMD64)]
CUDA available: True
MUSA available: False
numpy_random_seed: 2147483648
GPU 0: NVIDIA GeForce RTX 4070 Ti
CUDA_HOME: C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.6
NVCC: Cuda compilation tools, release 12.6, V12.6.20
MSVC: Microsoft (R) C/C++ Optimizing Compiler Version 19.41.34120 for x64
GCC: n/a
PyTorch: 2.4.1
PyTorch compiling details: PyTorch built with:
  - C++ Version: 201703
  - MSVC 192930154
  - Intel(R) oneAPI Math Kernel Library Version 2023.1-Product Build 20230303 for Intel(R) 64 architecture applications
  - Intel(R) MKL-DNN v3.4.2 (Git Hash 1137e04ec0b5251ca2b4400a4fd3c667ce843d67)
  - OpenMP 2019
  - LAPACK is enabled (usually provided by MKL)
  - CPU capability usage: AVX2
  - CUDA Runtime 12.1
  - NVCC architecture flags: -gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86;-gencode;arch=compute_90,code=sm_90
  - CuDNN 90.1  (built against CUDA 12.4)
  - Magma 2.5.4
  - Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=12.1, CUDNN_VERSION=9.1.0, CXX_COMPILER=C:/cb/pytorch_1000000000000/work/tmp_bin/sccache-cl.exe, CXX_FLAGS=/DWIN32 /D_WINDOWS /GR /EHsc /Zc:__cplusplus /bigobj /FS /utf-8 -DUSE_PTHREADPOOL -DNDEBUG -DUSE_KINETO -DLIBKINETO_NOCUPTI -DLIBKINETO_NOROCTRACER -DUSE_FBGEMM -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE /wd4624 /wd4068 /wd4067 /wd4267 /wd4661 /wd4717 /wd4244 /wd4804 /wd4273, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=2.4.1, USE_CUDA=ON, USE_CUDNN=ON, USE_CUSPARSELT=OFF, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_GLOO=ON, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=OFF, USE_NNPACK=OFF, USE_OPENMP=ON, USE_ROCM=OFF, USE_ROCM_KERNEL_ASSERT=OFF,

TorchVision: 0.19.1
OpenCV: 4.10.0
MMEngine: 0.10.5
MMAction2: 1.2.0+4d6c934
MMCV: 2.1.0
MMDetection: 3.3.0
MMPose: 1.3.2

Describe the bug

According to the documentation at https://mmaction2.readthedocs.io/en/latest/user_guides/inference.html, all I need to do is set up a high-level class with a configuration and get an inference result. With recognition/tsn (suggested in the install guide for testing) it works as expected. When I tried the pre-trained MultiSport spatio-temporal model, it fails with an error I don't understand. The code sample is another way I tried it, but I mainly used the shell script.

Reproduces the problem - code sample

from operator import itemgetter
from mmaction.apis import init_recognizer, inference_recognizer

config_file = './configs/detection/slowfast/slowfast_kinetics400-pretrained-r50_8xb16-4x16x1-8e_multisports-rgb.py'
checkpoint_file = 'slowfast_kinetics400-pretrained-r50_8xb16-4x16x1-8e_multisports-rgb_20230320-af666368.pth'
video_file = 'demo.mp4'
label_file = './../tools/data/multisports/label_map.txt'
model = init_recognizer(config_file, checkpoint_file, device='cuda:0')  # or device='cpu'
pred_result = inference_recognizer(model, video_file)

pred_scores = pred_result.pred_score.tolist()
score_tuples = tuple(zip(range(len(pred_scores)), pred_scores))
score_sorted = sorted(score_tuples, key=itemgetter(1), reverse=True)
top5_label = score_sorted[:5]

with open(label_file) as f:
    labels = [x.strip() for x in f]
results = [(labels[k[0]], k[1]) for k in top5_label]

print('The top-5 labels with corresponding scores are:')
for result in results:
    print(f'{result[0]}: ', result[1])

Reproduces the problem - command or script

python ./../demo/demo_inferencer.py ./demo.mp4 --vid-out-dir ./output/ --rec ./../configs/detection/slowfast/slowfast_kinetics400-pretrained-r50_8xb16-4x16x1-8e_multisports-rgb.py --rec-weights ./slowfast_kinetics400-pretrained-r50_8xb16-4x16x1-8e_multisports-rgb_20230320-af666368.pth --label-file ./../tools/data/multisports/label_map.txt --device cuda:0  --show --print-result --pred-out-file ./output/pred.pkl

Reproduces the problem - error message

(openmmlab) D:\Work\projects-external\mmaction2\_work>test_multisport.bat
C:\Users\DrakkLord\.conda\envs\openmmlab\lib\site-packages\mmengine\optim\optimizer\zero_optimizer.py:11: DeprecationWarning: `TorchScript` support for functional optimizers is deprecated and will be removed in a future PyTorch release. Consider using the `torch.compile` optimizer instead.
  from torch.distributed.optim import \
Loads checkpoint by local backend from path: ./slowfast_kinetics400-pretrained-r50_8xb16-4x16x1-8e_multisports-rgb_20230320-af666368.pth
C:\Users\DrakkLord\.conda\envs\openmmlab\lib\site-packages\mmengine\runner\checkpoint.py:347: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
  checkpoint = torch.load(filename, map_location=map_location)
C:\Users\DrakkLord\.conda\envs\openmmlab\lib\site-packages\mmengine\visualization\visualizer.py:196: UserWarning: Failed to add <class 'mmaction.visualization.video_backend.LocalVisBackend'>, please provide the `save_dir` argument.
  warnings.warn(f'Failed to add {vis_backend.__class__}, '
Inference ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━   10/04 11:53:31 - mmengine - WARNING - "FileClient" will be deprecated in future. Please use io functions in https://mmengine.readthedocs.io/en/latest/api/fileio.html#file-io
10/04 11:53:31 - mmengine - WARNING - "HardDiskBackend" is the alias of "LocalBackend" and the former will be deprecated in future.
Inference ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Traceback (most recent call last):
  File "./../demo/demo_inferencer.py", line 70, in <module>
    main()
  File "./../demo/demo_inferencer.py", line 66, in main
    mmaction2(**call_args)
  File "d:\work\projects-external\mmaction2\mmaction\apis\inferencers\mmaction2_inferencer.py", line 161, in __call__
    preds = self.forward(ori_inputs, batch_size, **forward_kwargs)
  File "d:\work\projects-external\mmaction2\mmaction\apis\inferencers\mmaction2_inferencer.py", line 93, in forward
    predictions = self.actionrecog_inferencer(
  File "d:\work\projects-external\mmaction2\mmaction\apis\inferencers\actionrecog_inferencer.py", line 126, in __call__
    return super().__call__(
  File "C:\Users\DrakkLord\.conda\envs\openmmlab\lib\site-packages\mmengine\infer\infer.py", line 221, in __call__
    for data in (track(inputs, description='Inference')
  File "C:\Users\DrakkLord\.conda\envs\openmmlab\lib\site-packages\rich\progress.py", line 168, in track
    yield from progress.track(
  File "C:\Users\DrakkLord\.conda\envs\openmmlab\lib\site-packages\rich\progress.py", line 1210, in track
    for value in sequence:
  File "C:\Users\DrakkLord\.conda\envs\openmmlab\lib\site-packages\mmengine\infer\infer.py", line 291, in preprocess
    yield from map(self.collate_fn, chunked_data)
  File "C:\Users\DrakkLord\.conda\envs\openmmlab\lib\site-packages\mmengine\infer\infer.py", line 588, in _get_chunk_data
    processed_data = next(inputs_iter)
  File "C:\Users\DrakkLord\.conda\envs\openmmlab\lib\site-packages\mmengine\dataset\base_dataset.py", line 60, in __call__
    data = t(data)
  File "d:\work\projects-external\mmcv\mmcv\transforms\base.py", line 12, in __call__
    return self.transform(results)
  File "d:\work\projects-external\mmaction2\mmaction\datasets\transforms\loading.py", line 753, in transform
    fps = results['fps']
KeyError: 'fps'

Additional information

I've built and installed mmcv and mmaction2 locally, both at the latest version (for mmcv, the latest that mmaction2 supports, v2.1.0).
I got the MultiSport pre-trained model checkpoint from the model zoo here: https://mmaction2.readthedocs.io/en/latest/model_zoo/detection.html
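
The traceback ends in a pipeline transform that reads `results['fps']`. One possible reading, which is an assumption and not confirmed against the library source, is that the recognition inferencer only passes the video path into the pipeline, while this detection config expects extra per-sample metadata. The toy sketch below reproduces that failure mode in plain Python. `FpsDependentSampler`, `run_pipeline`, `try_pipeline`, and the `timestamp` key are hypothetical names used only for illustration; they are not mmaction2 APIs.

```python
class FpsDependentSampler:
    """Hypothetical stand-in for a transform that needs 'fps' (not mmaction2 code)."""

    def __call__(self, results):
        fps = results['fps']  # raises KeyError when the input dict lacks 'fps'
        results['frame_ind'] = int(results['timestamp'] * fps)
        return results


def run_pipeline(transforms, results):
    """Apply each transform in order, the way a composed pipeline would."""
    for transform in transforms:
        results = transform(results)
    return results


def try_pipeline(results):
    """Run the toy pipeline on a copy of the input and report a missing key."""
    try:
        return run_pipeline([FpsDependentSampler()], dict(results))
    except KeyError as err:
        return f"KeyError: {err}"


# Assumed shape of what a recognition-style inferencer might pass in.
video_only = {'filename': 'demo.mp4'}
# Assumed shape of what a detection-style pipeline might expect.
with_meta = {'filename': 'demo.mp4', 'fps': 30, 'timestamp': 2}

print(try_pipeline(video_only))
print(try_pipeline(with_meta)['frame_ind'])
```

If this reading is right, the model would need an entry point that supplies the metadata the detection pipeline expects, which the recognition-oriented `--rec` path may not do.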
