[Bug] case_analyzer.py fails with TypeError: 'siqaDataset_V2' object is not subscriptable

Prerequisite

[X] I have searched Issues and Discussions but cannot get the expected help.
[X] The bug has not been fixed in the latest version.

Type

I'm evaluating with the officially supported tasks/models/datasets.

Environment

{'CUDA available': True, 'CUDA_HOME': None, 'GCC': 'gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0', 'GPU 0': 'NVIDIA GeForce GTX 1050 Ti', 'MMEngine': '0.10.4', 'MUSA available': False, 'OpenCV': '4.9.0', 'PyTorch': '2.3.0', 'PyTorch compiling details': 'PyTorch built with:\n' ' - GCC 9.3\n' ' - C++ Version: 201703\n' ' - Intel(R) oneAPI Math Kernel Library Version ' '2023.1-Product Build 20230303 for Intel(R) 64 ' 'architecture applications\n' ' - Intel(R) MKL-DNN v3.3.6 (Git Hash ' '86e6af5974177e513fd3fee58425e1063e7f1361)\n' ' - OpenMP 201511 (a.k.a. OpenMP 4.5)\n' ' - LAPACK is enabled (usually provided by ' 'MKL)\n' ' - NNPACK is enabled\n' ' - CPU capability usage: AVX2\n' ' - CUDA Runtime 12.1\n' ' - NVCC architecture flags: ' '-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86;-gencode;arch=compute_90,code=sm_90\n' ' - CuDNN 8.9.2\n' ' - Magma 2.6.1\n' ' - Build settings: BLAS_INFO=mkl, ' 'BUILD_TYPE=Release, CUDA_VERSION=12.1, ' 'CUDNN_VERSION=8.9.2, ' 'CXX_COMPILER=/opt/rh/devtoolset-9/root/usr/bin/c++, ' 'CXX_FLAGS= -D_GLIBCXX_USE_CXX11_ABI=0 ' '-fabi-version=11 -fvisibility-inlines-hidden ' '-DUSE_PTHREADPOOL -DNDEBUG -DUSE_KINETO ' '-DLIBKINETO_NOROCTRACER -DUSE_FBGEMM ' '-DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK ' '-DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE ' '-O2 -fPIC -Wall -Wextra -Werror=return-type ' '-Werror=non-virtual-dtor -Werror=bool-operation ' '-Wnarrowing -Wno-missing-field-initializers ' '-Wno-type-limits -Wno-array-bounds ' '-Wno-unknown-pragmas -Wno-unused-parameter ' '-Wno-unused-function -Wno-unused-result ' '-Wno-strict-overflow -Wno-strict-aliasing ' '-Wno-stringop-overflow -Wsuggest-override ' '-Wno-psabi -Wno-error=pedantic ' '-Wno-error=old-style-cast -Wno-missing-braces ' '-fdiagnostics-color=always -faligned-new ' 
'-Wno-unused-but-set-variable ' '-Wno-maybe-uninitialized -fno-math-errno ' '-fno-trapping-math -Werror=format ' '-Wno-stringop-overflow, LAPACK_INFO=mkl, ' 'PERF_WITH_AVX=1, PERF_WITH_AVX2=1, ' 'PERF_WITH_AVX512=1, TORCH_VERSION=2.3.0, ' 'USE_CUDA=ON, USE_CUDNN=ON, USE_CUSPARSELT=1, ' 'USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, ' 'USE_GLOG=OFF, USE_GLOO=ON, USE_MKL=ON, ' 'USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, ' 'USE_NNPACK=ON, USE_OPENMP=ON, USE_ROCM=OFF, ' 'USE_ROCM_KERNEL_ASSERT=OFF, \n', 'Python': '3.10.14 (main, Mar 21 2024, 16:24:04) [GCC 11.2.0]', 'TorchVision': '0.18.0', 'numpy_random_seed': 2147483648, 'opencompass': '0.2.4+81d0e4d', 'sys.platform': 'linux'}

Reproduces the problem - code/configuration sample

https://opencompass-zh-cn.readthedocs.io/zh-cn/latest/tools.html

Case Analyzer: Based on existing evaluation results, this tool produces the samples with inference errors as well as the full set of samples with annotation information. Usage:

python tools/case_analyzer.py CONFIG_PATH [-w WORK_DIR]

-w: the work path, defaults to './outputs/default'.

The help text for this option reads:

'-w', '--work-dir', help='Work path, all the outputs will be ' 'saved in this path, including the slurm logs, ' 'the evaluation results, the summary results, etc.' 'If not specified, the work_dir will be set to ' './outputs/default.',
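For reference, the documented usage corresponds roughly to an argument parser like the one below. This is only a minimal sketch, not the actual code of tools/case_analyzer.py: the function name build_parser and the positional argument name config are hypothetical, and passing the default directly to add_argument is a simplifying assumption.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical reconstruction of the documented CLI; the real script may differ.
    parser = argparse.ArgumentParser(description='Case analyzer (sketch)')
    # CONFIG_PATH from the usage line; the name "config" is an assumption.
    parser.add_argument('config', help='Path to the evaluation config file')
    # The -w / --work-dir option, with the default stated in the documentation.
    parser.add_argument(
        '-w', '--work-dir',
        default='./outputs/default',
        type=str,
        help='Work path where the outputs are saved')
    return parser


# Parse the same command line that is used in this report.
args = build_parser().parse_args(['configs/eval_demo.py'])
print(args.config, args.work_dir)
```

With no -w given, as in the command in this report, the work path falls back to './outputs/default', so the tool presumably looks for the existing evaluation results there.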

Reproduces the problem - command or script

python tools/case_analyzer.py configs/eval_demo.py

Reproduces the problem - error message

Traceback (most recent call last):
  File "/home/qwq/opencompass/tools/case_analyzer.py", line 201, in <module>
    main()  # call the main function
  File "/home/qwq/opencompass/tools/case_analyzer.py", line 198, in main
    dispatch_tasks(cfg, force=args.force)  # dispatch tasks
  File "/home/qwq/opencompass/tools/case_analyzer.py", line 188, in dispatch_tasks
    }).run()
  File "/home/qwq/opencompass/tools/case_analyzer.py", line 101, in run
    references = dataset[self.ds_column]
TypeError: 'siqaDataset_V2' object is not subscriptable
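The same kind of error can be reproduced without OpenCompass. Python raises this TypeError whenever obj[key] is used on an object whose class defines no __getitem__ method. The sketch below assumes, without confirmation from the OpenCompass source, that the dataset object is a wrapper that keeps the real data in an attribute; WrapperDataset and its data attribute are hypothetical names used only for illustration.

```python
class WrapperDataset:
    """Hypothetical stand-in for a dataset wrapper; not an OpenCompass class."""

    def __init__(self, rows):
        # Assumed layout: column name -> list of values.
        self.data = rows


ds = WrapperDataset({'label': ['A', 'B', 'C']})

# Subscripting the wrapper itself fails because it has no __getitem__.
try:
    ds['label']
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(error_message)

# Subscripting the wrapped data works.
references = ds.data['label']
print(references)
```

If siqaDataset_V2 is structured in a similar way, line 101 of case_analyzer.py would need to index the underlying data rather than the wrapper object, but that is a guess and would need to be checked against the dataset class.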

Other information

1. What's your expected result? I want to know how to use case_analyzer.py to list the bad cases or good cases.
2. What dataset did you use? siqa_gen and winograd_ppl. The evaluation results come from running "python run.py --models hf_opt_125m --datasets siqa_gen winograd_ppl --debug".

I'm really confused and hope to get help from the developers or others!
