open-mmlab / mmyolo

OpenMMLab YOLO series toolbox and benchmark. Implemented RTMDet, RTMDet-Rotated, YOLOv5, YOLOv6, YOLOv7, YOLOv8, YOLOX, PPYOLOE, etc.
https://mmyolo.readthedocs.io/zh_CN/dev/
GNU General Public License v3.0

for loop can not break out when enumerating data_list in dataset_analysis.py #435

Closed VoyagerXvoyagerx closed 1 year ago

VoyagerXvoyagerx commented 1 year ago

🐞 Describe the bug

This for loop does not terminate after all elements have been iterated. I modified the code here as follows:

    progress_bar = ProgressBar(len(dataset))
    cnt = 0
    print(len(data_list))
    for img in data_list:
        for instance in img['instances']:
            if instance['bbox_label'] in classes_idx and args.class_name is None:
                class_num[instance['bbox_label']] += 1
                class_bbox[instance['bbox_label']].append(instance['bbox'])
            elif instance['bbox_label'] in classes_idx and args.class_name:
                class_num[0] += 1
                class_bbox[0].append(instance['bbox'])
        progress_bar.update()
        cnt += 1
        if cnt == len(data_list):
            print('enumerate over!', '\n'*10)

Run the command

python tools/analysis_tools/dataset_analysis.py configs/custom_dataset/yolov5_s-v61_syncbn_fast_1xb32-50e_ionogram.py --output-dir output

I got output on the terminal:

loading annotations into memory...
Done (t=0.12s)
creating index...
index created!

Print current running information:
+--------------------------------------------------------------------+
|                        Dataset information                         |
+---------------+-------------+--------------+-----------------------+
|  Dataset type |  Class name |   Function   |       Area rule       |
+---------------+-------------+--------------+-----------------------+
| train_dataset | All classes | All function | [0, 32, 96, 100000.0] |
+---------------+-------------+--------------+-----------------------+

Read the information of each picture in the dataset:
[                                                  ] 0/3019, elapsed: 0s, ETA:3019
[>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>] 3019/3019, 11762.0 task/s, elapsed: 0s, ETA:     0s
enumerate over!

[>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>] 7955/3019, 9119.1 task/s, elapsed: 1s, ETA:     0s
Traceback (most recent call last):
  File "tools/analysis_tools/dataset_analysis.py", line 508, in <module>
    main()
  File "tools/analysis_tools/dataset_analysis.py", line 454, in main
    progress_bar.update()
  File "/home/ubuntu/.conda/envs/openmmlab/lib/python3.8/site-packages/mmengine/utils/progressbar.py", line 56, in update
    self.file.write(msg.format(bar_chars))
KeyboardInterrupt

I don't think my dataset is the problem, because if I add a break:

        if cnt == len(data_list):
            print('enumerate over!', '\n'*10)
            break

the rest of the code runs successfully. I can also train on my custom dataset.
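For reference, the same early-exit workaround can be written with Python's built-in `enumerate` instead of a manual counter. This is a pure-Python sketch with a placeholder `data_list`, not the real annotation list loaded by `dataset_analysis.py`:

```python
# Pure-Python sketch of the early-exit workaround; data_list is a placeholder
# standing in for the real annotation list loaded by dataset_analysis.py.
data_list = [{'instances': []} for _ in range(3)]

for cnt, img in enumerate(data_list, start=1):
    # per-image statistics would be collected here
    if cnt == len(data_list):
        print('enumerate over!')
        break  # guard against the loop running past the end of data_list
```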

Environment

sys.platform: linux
Python: 3.8.15 (default, Nov 24 2022, 15:19:38) [GCC 11.2.0]
CUDA available: True
numpy_random_seed: 2147483648
GPU 0: Tesla V100-SXM2-32GB
CUDA_HOME: :/usr/local/cuda-11.4:/usr/local/cuda-11.4
GCC: gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
PyTorch: 1.12.1
PyTorch compiling details: PyTorch built with:

TorchVision: 0.13.1
OpenCV: 4.6.0
MMEngine: 0.3.2
MMCV: 2.0.0rc3
MMDetection: 3.0.0rc4
MMYOLO: 0.2.0+27487fd

Additional information

my config file mmyolo/configs/custom_dataset/yolov5_s-v61_syncbn_fast_1xb32-50e_ionogram.py

_base_ = '../yolov5/yolov5_s-v61_syncbn_fast_8xb16-300e_coco.py'

max_epochs = 50  # maximum number of training epochs
data_root = './Iono4311/'  # absolute path to the dataset directory

work_dir = './work_dirs/yolov5_s_50e'

# Since this tutorial fine-tunes on the cat dataset, we use `load_from` to load the pretrained MMYOLO model, which speeds up convergence while preserving accuracy
load_from = './work_dirs/yolov5_s-v61_syncbn_fast_8xb16-300e_coco_20220918_084700-86e02187.pth'  # noqa

# Adjust the batch size according to your GPU; the YOLOv5-s default is 8 GPUs x 16 images per GPU
train_batch_size_per_gpu = 32
train_num_workers = 4  # recommended: train_num_workers = nGPU x 4

save_epoch_intervals = 5  # save a checkpoint every `interval` epochs

# Adjust base_lr according to your GPU setup, scaled as base_lr_default * (your_bs 32 / default_bs (8x16))
base_lr = _base_.base_lr / 4

# anchors = [  # these anchors were updated for the dataset's characteristics; anchor generation is covered in a later section
#     [(68, 69), (154, 91), (143, 162)],  # P3/8
#     [(242, 160), (189, 287), (391, 207)],  # P4/16
#     [(353, 337), (539, 341), (443, 432)]  # P5/32
# ]

anchors = [
    [[8, 6], [24, 4], [19, 9]],
    [[22, 19], [17, 49], [29, 45]],
    [[44, 66], [96, 76], [126, 59]]
]

class_name = ('E', 'Es-l', 'Es-c', 'F1', 'F2', 'Spread-F')  # set class_name according to the class information in class_with_id.txt
num_classes = len(class_name)

metainfo = dict(
    CLASSES = class_name,
    PALETTE = [(250, 165, 30), (120, 69, 125), (53, 125, 34), (0, 11, 123), (130, 20, 12), (120, 121, 80)]  # colors used when plotting; arbitrary values are fine
)

train_cfg = dict(
    max_epochs=max_epochs,
    val_begin=10,  # epoch after which to start validation; the tutorial sets 20 because accuracy is low in the first 20 epochs, so evaluating them is not very meaningful and they are skipped
    val_interval=save_epoch_intervals  # run evaluation every val_interval epochs
)

model = dict(
    bbox_head=dict(
        head_module=dict(num_classes=num_classes),
        prior_generator=dict(base_sizes=anchors),

        # loss_cls is scaled dynamically with num_classes, but when num_classes = 1, loss_cls is always 0
        loss_cls=dict(loss_weight=0.5 *
                      (num_classes / 80 * 3 / _base_.num_det_layers))))

train_dataloader = dict(
    batch_size=train_batch_size_per_gpu,
    num_workers=train_num_workers,
    dataset=dict(
        _delete_=True,
        type='RepeatDataset',
        # If the dataset is small, RepeatDataset can repeat the current dataset n times per epoch; setting 5 would repeat it 5 times
        times=1,
        dataset=dict(
            type=_base_.dataset_type,
            data_root=data_root,
            metainfo=metainfo,
            ann_file='annotations/train.json',
            data_prefix=dict(img='images/'),
            filter_cfg=dict(filter_empty_gt=False, min_size=32),
            pipeline=_base_.train_pipeline)))

val_dataloader = dict(
    dataset=dict(
        metainfo=metainfo,
        data_root=data_root,
        ann_file='annotations/val.json',
        data_prefix=dict(img='images/')))

test_dataloader = val_dataloader

val_evaluator = dict(ann_file=data_root + 'annotations/val.json')
test_evaluator = val_evaluator

optim_wrapper = dict(optimizer=dict(lr=base_lr))

default_hooks = dict(
    # how often (in epochs) to save the model and how many checkpoints to keep at most; `save_best` additionally saves the best model (recommended)
    checkpoint=dict(
        type='CheckpointHook',
        interval=save_epoch_intervals,
        max_keep_ckpts=5,
        save_best='auto'),
    param_scheduler=dict(max_epochs=max_epochs),
    # logger output interval (in batches)
    logger=dict(type='LoggerHook', interval=50))

visualizer = dict(vis_backends=[dict(type='LocalVisBackend'), dict(type='WandbVisBackend')])
# visualizer = dict(vis_backends=[dict(type='LocalVisBackend'),dict(type='TensorboardVisBackend')])
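As a sanity check on the linear LR scaling comment in the config above, the arithmetic can be sketched as follows (0.01 as the default YOLOv5 base_lr in MMYOLO is an assumption here, standing in for `_base_.base_lr`):

```python
# Sketch of the linear LR scaling rule: base_lr_default * (your total bs / default total bs).
base_lr_default = 0.01      # assumed MMYOLO YOLOv5 default base_lr (stand-in for _base_.base_lr)
default_total_bs = 8 * 16   # default: 8 GPUs x 16 images per GPU
actual_total_bs = 1 * 32    # this config: 1 GPU x 32 images per GPU

base_lr = base_lr_default * actual_total_bs / default_total_bs
print(base_lr)  # 0.0025, i.e. base_lr_default / 4, matching `base_lr = _base_.base_lr / 4`
```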
hhaAndroid commented 1 year ago

@VoyagerXvoyagerx Sorry. This bug has been fixed in the dev branch, and we will release v0.3.0 soon

https://github.com/open-mmlab/mmyolo/blob/dev/tools/analysis_tools/dataset_analysis.py#L436

VoyagerXvoyagerx commented 1 year ago

Looking forward : D