open-mmlab / mmrotate

OpenMMLab Rotated Object Detection Toolbox and Benchmark
https://mmrotate.readthedocs.io/en/latest/
Apache License 2.0

IndexError: index 1 is out of bounds for axis 0 with size 0 #1072

Open lixinru77 opened 6 days ago

lixinru77 commented 6 days ago

Prerequisite

Task

I have modified the scripts/configs, or I'm working on my own tasks/models/datasets.

Branch

master branch https://github.com/open-mmlab/mmrotate

Environment

_libgcc_mutex 0.1 main
_openmp_mutex 5.1 1_gnu
addict 2.4.0 pypi_0 pypi
aliyun-python-sdk-core 2.15.2 pypi_0 pypi
aliyun-python-sdk-kms 2.16.5 pypi_0 pypi
ca-certificates 2024.7.2 h06a4308_0
certifi 2024.8.30 pypi_0 pypi
cffi 1.17.1 pypi_0 pypi
charset-normalizer 3.3.2 pypi_0 pypi
click 8.1.7 pypi_0 pypi
colorama 0.4.6 pypi_0 pypi
crcmod 1.7 pypi_0 pypi
cryptography 43.0.1 pypi_0 pypi
cycler 0.12.1 pypi_0 pypi
e2cnn 0.2.3 pypi_0 pypi
filelock 3.14.0 pypi_0 pypi
fonttools 4.54.1 pypi_0 pypi
idna 3.10 pypi_0 pypi
importlib-metadata 8.5.0 pypi_0 pypi
jmespath 0.10.0 pypi_0 pypi
kiwisolver 1.4.7 pypi_0 pypi
ld_impl_linux-64 2.38 h1181459_1
libffi 3.4.4 h6a678d5_1
libgcc-ng 11.2.0 h1234567_1
libgomp 11.2.0 h1234567_1
libstdcxx-ng 11.2.0 h1234567_1
markdown 3.7 pypi_0 pypi
markdown-it-py 3.0.0 pypi_0 pypi
matplotlib 3.5.3 pypi_0 pypi
mdurl 0.1.2 pypi_0 pypi
mmcv-full 1.7.2 pypi_0 pypi
mmdet 2.28.2 pypi_0 pypi
mmrotate 0.3.4 dev_0
model-index 0.1.11 pypi_0 pypi
mpmath 1.3.0 pypi_0 pypi
ncurses 6.4 h6a678d5_0
numpy 1.24.4 pypi_0 pypi
opencv-python 4.10.0.84 pypi_0 pypi
opendatalab 0.0.10 pypi_0 pypi
openmim 0.3.9 pypi_0 pypi
openssl 3.0.15 h5eee18b_0
openxlab 0.1.1 pypi_0 pypi
ordered-set 4.1.0 pypi_0 pypi
oss2 2.17.0 pypi_0 pypi
packaging 24.1 pypi_0 pypi
pandas 2.0.3 pypi_0 pypi
pillow 10.4.0 pypi_0 pypi
pip 24.2 py38h06a4308_0
platformdirs 4.3.6 pypi_0 pypi
pycocotools 2.0.7 pypi_0 pypi
pycparser 2.22 pypi_0 pypi
pycryptodome 3.20.0 pypi_0 pypi
pygments 2.18.0 pypi_0 pypi
pyparsing 3.1.4 pypi_0 pypi
python 3.8.19 h955ad1f_0
python-dateutil 2.9.0.post0 pypi_0 pypi
pytz 2023.4 pypi_0 pypi
pyyaml 6.0.2 pypi_0 pypi
readline 8.2 h5eee18b_0
requests 2.28.2 pypi_0 pypi
rich 13.4.2 pypi_0 pypi
scipy 1.10.1 pypi_0 pypi
setuptools 60.2.0 pypi_0 pypi
six 1.16.0 pypi_0 pypi
sqlite 3.45.3 h5eee18b_0
sympy 1.13.3 pypi_0 pypi
tabulate 0.9.0 pypi_0 pypi
terminaltables 3.1.10 pypi_0 pypi
tk 8.6.14 h39e8969_0
tomli 2.0.1 pypi_0 pypi
torch 1.8.0+cu111 pypi_0 pypi
torchaudio 0.8.0 pypi_0 pypi
torchvision 0.9.0+cu111 pypi_0 pypi
tqdm 4.65.2 pypi_0 pypi
typing-extensions 4.12.2 pypi_0 pypi
tzdata 2024.2 pypi_0 pypi
urllib3 1.26.20 pypi_0 pypi
wheel 0.44.0 py38h06a4308_0
xz 5.4.6 h5eee18b_1
yapf 0.40.2 pypi_0 pypi
zipp 3.20.2 pypi_0 pypi
zlib 1.2.13 h5eee18b_1

Reproduces the problem - code sample

I run this command: python tools/train.py /media/newmy/code_env/mmrotate-main/configs/rotated_retinanet/rotated_retinanet_hbb_r50_fpn_1x_dota_le90.py

1. dota.py:

    class DOTADataset(CustomDataset):
        """DOTA dataset for detection.

        Args:
            ann_file (str): Annotation file path.
            pipeline (list[dict]): Processing pipeline.
            version (str, optional): Angle representations. Defaults to 'oc'.
            difficulty (bool, optional): The difficulty threshold of GT.
        """

        CLASSES = ('Car', 'Bus', 'Truck', 'Van')

        PALETTE = [(165, 42, 42), (189, 183, 107), (0, 255, 0), (255, 0, 0)]

2. dotav1.py:

    # dataset settings
    dataset_type = 'DOTADataset'
    # data_root = '/media/newmy/code_env/mmrotate-main/data/split_1024_dota1_0/'
    data_root = '/media/newmy/code_env/mmrotate-main/data/split_1024_dota1_0/'
    img_norm_cfg = dict(
        mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
    train_pipeline = [
        dict(type='LoadImageFromFile'),
        dict(type='LoadAnnotations', with_bbox=True),
        dict(type='RResize', img_scale=(1024, 1024)),
        dict(type='RRandomFlip', flip_ratio=0.5),
        dict(type='Normalize', **img_norm_cfg),
        dict(type='Pad', size_divisor=32),
        dict(type='DefaultFormatBundle'),
        dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels'])
    ]
    test_pipeline = [
        dict(type='LoadImageFromFile'),
        dict(
            type='MultiScaleFlipAug',
            img_scale=(1024, 1024),
            flip=False,
            transforms=[
                dict(type='RResize'),
                dict(type='Normalize', **img_norm_cfg),
                dict(type='Pad', size_divisor=32),
                dict(type='DefaultFormatBundle'),
                dict(type='Collect', keys=['img'])
            ])
    ]
    data = dict(
        samples_per_gpu=1,
        workers_per_gpu=0,
        train=dict(
            type=dataset_type,
            ann_file=data_root + 'train/annfiles',
            img_prefix=data_root + 'train/images',
            # ann_file=data_root + 'train/labelTxt-v1.0/labelTxt/',
            # img_prefix=data_root + 'train/images/',
            pipeline=train_pipeline),
        val=dict(
            type=dataset_type,
            ann_file=data_root + 'val/annfiles',
            img_prefix=data_root + 'val/images',
            # ann_file=data_root + 'val/labelTxt-v1.0/labelTxt/',
            # img_prefix=data_root + 'val/images/',
            pipeline=test_pipeline),
        test=dict(
            type=dataset_type,
            ann_file=data_root + 'test/annfiles',
            img_prefix=data_root + 'test/images',
            # ann_file=data_root + 'test/part1/images/',
            # img_prefix=data_root + 'test/part2/images',
            pipeline=test_pipeline))
3. rotated_retinanet_obb_r50_fpn_1x_dota_le90.py: I changed num_classes to 4; nothing else was changed.

Reproduces the problem - command or script

python tools/train.py /media/newmy/code_env/mmrotate-main/configs/rotated_retinanet/rotated_retinanet_hbb_r50_fpn_1x_dota_le90.py

Reproduces the problem - error message

1. Error output:

    Traceback (most recent call last):
      File "tools/train.py", line 194, in <module>
        main()
      File "tools/train.py", line 183, in main
        train_detector(
      File "/media/newmy/code_env/mmrotate-main/mmrotate/apis/train.py", line 144, in train_detector
        runner.run(data_loaders, cfg.workflow)
      File "/home/lixinru/anaconda3/envs/mmrotate/lib/python3.8/site-packages/mmcv/runner/epoch_based_runner.py", line 136, in run
        epoch_runner(data_loaders[i], **kwargs)
      File "/home/lixinru/anaconda3/envs/mmrotate/lib/python3.8/site-packages/mmcv/runner/epoch_based_runner.py", line 49, in train
        for i, data_batch in enumerate(self.data_loader):
      File "/home/lixinru/anaconda3/envs/mmrotate/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 517, in __next__
        data = self._next_data()
      File "/home/lixinru/anaconda3/envs/mmrotate/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 557, in _next_data
        data = self._dataset_fetcher.fetch(index)  # may raise StopIteration
      File "/home/lixinru/anaconda3/envs/mmrotate/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
        data = [self.dataset[idx] for idx in possibly_batched_index]
      File "/home/lixinru/anaconda3/envs/mmrotate/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 44, in <listcomp>
        data = [self.dataset[idx] for idx in possibly_batched_index]
      File "/home/lixinru/anaconda3/envs/mmrotate/lib/python3.8/site-packages/mmdet/datasets/custom.py", line 220, in __getitem__
        data = self.prepare_train_img(idx)
      File "/home/lixinru/anaconda3/envs/mmrotate/lib/python3.8/site-packages/mmdet/datasets/custom.py", line 243, in prepare_train_img
        return self.pipeline(results)
      File "/home/lixinru/anaconda3/envs/mmrotate/lib/python3.8/site-packages/mmdet/datasets/pipelines/compose.py", line 41, in __call__
        data = t(data)
      File "/home/lixinru/anaconda3/envs/mmrotate/lib/python3.8/site-packages/mmdet/datasets/pipelines/transforms.py", line 472, in __call__
        results[key] = self.bbox_flip(results[key],
      File "/media/newmy/code_env/mmrotate-main/mmrotate/datasets/pipelines/transforms.py", line 89, in bbox_flip
        flipped[:4] = flipped[[1, 0, 3, 2]].copy()
    IndexError: index 1 is out of bounds for axis 0 with size 0

2. I debugged this code and found that bboxes is empty, and I don't know why. Printing bboxes gives [ ]. This is the bbox_flip function I printed it from:

def bbox_flip(self, bboxes, img_shape, direction):
    """Flip bboxes horizontally or vertically.

    Args:
        bboxes(ndarray): shape (..., 5*k)
        img_shape(tuple): (height, width)
    
    Returns:
        numpy.ndarray: Flipped bounding boxes.
    """
    assert bboxes.shape[-1] % 5 == 0
    orig_shape = bboxes.shape
    bboxes = bboxes.reshape((-1, 5))
    # print("###################################")
    # print("flipped is:", bboxes )
    # print("###################################")
    flipped = bboxes.copy()
    if direction == 'horizontal':
        flipped[:, 0] = img_shape[1] - bboxes[:, 0] - 1
        # print("###################################")
        # print("flipped is:",flipped)
        # print("###################################")
        # flipped[:4] = flipped[[1, 0, 3, 2]].copy()
    elif direction == 'vertical':
        flipped[:, 1] = img_shape[0] - bboxes[:, 1] - 1
        flipped[:4] = flipped[[1, 0, 3, 2]].copy()
    elif direction == 'diagonal':
        flipped[:, 0] = img_shape[1] - bboxes[:, 0] - 1
        flipped[:, 1] = img_shape[0] - bboxes[:, 1] - 1
        return flipped.reshape(orig_shape)
    else:
        raise ValueError(f'Invalid flipping direction "{direction}"')
    if self.version == 'oc':
        rotated_flag = (bboxes[:, 4] != np.pi / 2)
        flipped[rotated_flag, 4] = np.pi / 2 - bboxes[rotated_flag, 4]
        flipped[rotated_flag, 2] = bboxes[rotated_flag, 3]
        flipped[rotated_flag, 3] = bboxes[rotated_flag, 2]
    else:
        flipped[:, 4] = norm_angle(np.pi - bboxes[:, 4], self.version)
    return flipped.reshape(orig_shape)
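
For reference, the IndexError from the traceback can be reproduced with NumPy alone. This is a minimal sketch, assuming the array reaching bbox_flip is empty with shape (0, 5), which is what the `[ ]` print suggests and what an empty ignore-box field would look like. `flip_rows` is a hypothetical helper used only for illustration. It shows that `flipped[[1, 0, 3, 2]]` selects rows (axis 0) of the box array, not columns of each box.

```python
import numpy as np


def flip_rows(boxes):
    """Apply the statement shown at the failing line of the traceback."""
    flipped = boxes.copy()
    # A list index on a 2-D array selects ROWS 1, 0, 3, 2 along axis 0,
    # so it only works when at least four boxes are present.
    flipped[:4] = flipped[[1, 0, 3, 2]].copy()
    return flipped


# Assumption: an empty box array such as an empty ignore-box field.
empty = np.zeros((0, 5), dtype=np.float32)
try:
    flip_rows(empty)
    message = None
except IndexError as exc:
    message = str(exc)
print(message)

# With four boxes the statement runs, but it reorders whole boxes.
four = np.arange(20, dtype=np.float32).reshape(4, 5)
reordered = flip_rows(four)
print(reordered[:, 0])
```

If this is what happens in the pipeline, one thing to try is removing that statement or guarding it against empty arrays; comparing the local transforms.py with the upstream file would show whether the line is meant to be there.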

3. But when I print inside dota.py, the annotations are printed correctly:

def load_annotations(self, ann_folder):
    """
        Args:
            ann_folder: folder that contains DOTA v1 annotations txt files
    """
    cls_map = {c: i
               for i, c in enumerate(self.CLASSES)
               }  # in mmdet v2.0 label is 0-based
    ann_files = glob.glob(ann_folder + '/*.txt')

    data_infos = []
    if not ann_files:  # test phase
        ann_files = glob.glob(ann_folder + '/*.png')
        for ann_file in ann_files:
            data_info = {}
            img_id = osp.split(ann_file)[1][:-4]
            img_name = img_id + '.png'
            data_info['filename'] = img_name
            data_info['ann'] = {}
            data_info['ann']['bboxes'] = []
            data_info['ann']['labels'] = []
            data_infos.append(data_info)
    else:
        for ann_file in ann_files:
            data_info = {}
            img_id = osp.split(ann_file)[1][:-4]
            img_name = img_id + '.png'
            data_info['filename'] = img_name
            data_info['ann'] = {}
            gt_bboxes = []
            gt_labels = []
            gt_polygons = []
            gt_bboxes_ignore = []
            gt_labels_ignore = []
            gt_polygons_ignore = []

            if os.path.getsize(ann_file) == 0 and self.filter_empty_gt:
                continue

            with open(ann_file) as f:
                s = f.readlines()
                print(s)  # I printed here
                for si in s:
                    bbox_info = si.split()
                    # print(bbox_info)
                    poly = np.array(bbox_info[:8], dtype=np.float32)
                    # print(poly)
                    try:
                        x, y, w, h, a = poly2obb_np(poly, self.version)
                    except:  # noqa: E722
                        continue
                    cls_name = bbox_info[8]
                    # print(cls_name)
                    difficulty = int(bbox_info[9])
                    label = cls_map[cls_name]
                    if difficulty > self.difficulty:
                        pass
                    else:
                        gt_bboxes.append([x, y, w, h, a])
                        gt_labels.append(label)
                        gt_polygons.append(poly)

Additional information

The dataset annotations I use look like this:

242 231 253 208 312 236 301 258 Car 0
186 209 195 187 250 208 242 231 Car 0
468 417 477 396 536 419 527 440 Car 0
68 545 75 522 124 539 116 561 Truck 0
63 487 75 463 138 494 126 518 Truck 0
453 226 460 204 513 224 505 245 Car 0

I checked the annotations and they are correct. Under mmrotate-main/data/split_1024_dota1_0, each of train, val, and test contains annfiles (a folder of .txt files) and images (a folder of .png images).
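
The annotation format can also be checked line by line outside of mmrotate. This is a minimal sketch, assuming each line holds eight polygon coordinates, a class name, and an integer difficulty, as in the sample above and as load_annotations reads them. `check_dota_line` is a hypothetical helper, and `CLASSES` is copied from the modified dota.py.

```python
CLASSES = ('Car', 'Bus', 'Truck', 'Van')  # copied from the modified dota.py


def check_dota_line(line, classes=CLASSES):
    """Return None if the line is well formed, else a short reason string."""
    fields = line.split()
    if len(fields) < 10:
        return 'expected 10 fields, got %d' % len(fields)
    try:
        # The first eight fields must parse as polygon coordinates.
        [float(v) for v in fields[:8]]
    except ValueError:
        return 'non-numeric coordinate'
    if fields[8] not in classes:
        # Class names are case sensitive in the cls_map lookup.
        return 'unknown class %r' % fields[8]
    if not fields[9].lstrip('-').isdigit():
        return 'difficulty is not an integer'
    return None


sample = """242 231 253 208 312 236 301 258 Car 0
68 545 75 522 124 539 116 561 Truck 0"""
problems = [check_dota_line(line) for line in sample.splitlines()]
print(problems)
```

A class name that is missing from CLASSES, or spelled with different capitalization, would be reported here instead of surfacing later as a KeyError in cls_map.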

lixinru77 commented 4 days ago

I printed the results dict for one image. The printed image array is all zeros, but everything else looks correct. In transforms.py:

def __call__(self, results):
    print("print: result")
    print(results)
    """Call function to flip bounding boxes, masks, semantic segmentation
    maps.

    Args:
        results (dict): Result dict from loading pipeline.

    Returns:
        dict: Flipped results, 'flip', 'flip_direction' keys are added \
            into result dict.
    """

    if 'flip' not in results:
        if isinstance(self.direction, list):
            # None means non-flip
            direction_list = self.direction + [None]
        else:
            # None means non-flip
            direction_list = [self.direction, None]

        if isinstance(self.flip_ratio, list):
            non_flip_ratio = 1 - sum(self.flip_ratio)
            flip_ratio_list = self.flip_ratio + [non_flip_ratio]
        else:
            non_flip_ratio = 1 - self.flip_ratio
            # exclude non-flip
            single_ratio = self.flip_ratio / (len(direction_list) - 1)
            flip_ratio_list = [single_ratio] * (len(direction_list) -
                                                1) + [non_flip_ratio]

        cur_dir = np.random.choice(direction_list, p=flip_ratio_list)

        results['flip'] = cur_dir is not None
        print("print: results['flip']")
        print(results['flip'])
        print("#######################################")
    if 'flip_direction' not in results:
        results['flip_direction'] = cur_dir
    if results['flip']:
        # flip image
        for key in results.get('img_fields', ['img']):
            results[key] = mmcv.imflip(
                results[key], direction=results['flip_direction'])
            print("print here: results[key]")
            print(results[key])
            print(results['flip_direction'])
            print("#######################################")
        # flip bboxes
        for key in results.get('bbox_fields', []):
            results[key] = self.bbox_flip(results[key],
                                          results['img_shape'],
                                          results['flip_direction'])
        # flip masks
        for key in results.get('mask_fields', []):
            results[key] = results[key].flip(results['flip_direction'])

        # flip segs
        for key in results.get('seg_fields', []):
            results[key] = mmcv.imflip(
                results[key], direction=results['flip_direction'])
    return results

This is the result (my picture file is not corrupted):

print: result
{'img_info': {'filename': 'Atrain17.png',
              'ann': {'bboxes': array([[ 3.0100e+02, 4.6850e+02, 5.8938e+01, 2.2074e+01, -5.1678e-02],
                                       [ 3.6356e+02, 4.4974e+02, 5.6511e+01, 2.5951e+01, -2.4498e-01],
                                       [ 3.4250e+02, 5.2650e+02, 6.0597e+01, 2.3195e+01, 1.4382e+00]], dtype=float32),
                      'labels': array([0, 0, 0]),
                      'polygons': array([[271., 459., 329., 456., 331., 478., 273., 481.],
                                         [333., 444., 388., 431., 394., 455., 340., 469.],
                                         [327., 498., 350., 495., 358., 555., 335., 558.]], dtype=float32),
                      'bboxes_ignore': array([], shape=(0, 5), dtype=float32),
                      'labels_ignore': array([], dtype=int64),
                      'polygons_ignore': array([], shape=(0, 8), dtype=float32)}},
 'ann_info': {'bboxes': array([[ 3.0100e+02, 4.6850e+02, 5.8938e+01, 2.2074e+01, -5.1678e-02],
                               [ 3.6356e+02, 4.4974e+02, 5.6511e+01, 2.5951e+01, -2.4498e-01],
                               [ 3.4250e+02, 5.2650e+02, 6.0597e+01, 2.3195e+01, 1.4382e+00]], dtype=float32),
              'labels': array([0, 0, 0]),
              'polygons': array([[271., 459., 329., 456., 331., 478., 273., 481.],
                                 [333., 444., 388., 431., 394., 455., 340., 469.],
                                 [327., 498., 350., 495., 358., 555., 335., 558.]], dtype=float32),
              'bboxes_ignore': array([], shape=(0, 5), dtype=float32),
              'labels_ignore': array([], dtype=int64),
              'polygons_ignore': array([], shape=(0, 8), dtype=float32)},
 'img_prefix': '/media/newmy/code_env/mmrotate-main/data/example_data/train/images',
 'seg_prefix': None,
 'proposal_file': None,
 'bbox_fields': ['gt_bboxes_ignore', 'gt_bboxes'],
 'mask_fields': [],
 'seg_fields': [],
 'filename': '/media/newmy/code_env/mmrotate-main/data/example_data/train/images/Atrain17.png',
 'ori_filename': 'Atrain17.png',
 'img': array([[[0, 0, 0], [0, 0, 0], [0, 0, 0], ..., [0, 0, 0], [0, 0, 0], [0, 0, 0]],
               [[0, 0, 0], [0, 0, 0], [0, 0, 0], ..., [0, 0, 0], [0, 0, 0], [0, 0, 0]],
               [[0, 0, 0], [0, 0, 0], [0, 0, 0], ..., [0, 0, 0], [0, 0, 0], [0, 0, 0]],
               ...,
               [[0, 0, 0], [0, 0, 0], [0, 0, 0], ..., [0, 0, 0], [0, 0, 0], [0, 0, 0]],
               [[0, 0, 0], [0, 0, 0], [0, 0, 0], ..., [0, 0, 0], [0, 0, 0], [0, 0, 0]],
               [[0, 0, 0], [0, 0, 0], [0, 0, 0], ..., [0, 0, 0], [0, 0, 0], [0, 0, 0]]], dtype=uint8),
 'img_shape': (1024, 1024, 3),
 'ori_shape': (640, 640, 3),
 'img_fields': ['img'],
 'gt_bboxes': array([[ 4.8160e+02, 7.4960e+02, 9.4302e+01, 3.5318e+01, -5.1678e-02],
                     [ 5.8169e+02, 7.1958e+02, 9.0417e+01, 4.1522e+01, -2.4498e-01],
                     [ 5.4800e+02, 8.4240e+02, 9.6955e+01, 3.7112e+01, 1.4382e+00]], dtype=float32),
 'gt_bboxes_ignore': array([], shape=(0, 5), dtype=float32),
 'gt_labels': array([0, 0, 0]),
 'scale': (1024, 1024),
 'scale_idx': 0,
 'pad_shape': (1024, 1024, 3),
 'scale_factor': array([1.6, 1.6, 1.6, 1.6], dtype=float32),
 'keep_ratio': True}
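
As a sanity check on the printed dict, the resized boxes can be compared with the loaded ones. This is a small sketch that copies the first box from the output and assumes the resize scales x, y, w, and h by the same factor while leaving the angle unchanged (reasonable here because the scale factor is 1.6 in both directions). It only confirms that the values before the flip are consistent; it does not call any mmrotate code.

```python
import numpy as np

# First box before and after resizing, copied from the printed dict.
ann_box = np.array([3.0100e+02, 4.6850e+02, 5.8938e+01, 2.2074e+01, -5.1678e-02])
gt_box = np.array([4.8160e+02, 7.4960e+02, 9.4302e+01, 3.5318e+01, -5.1678e-02])

ori_size, new_size = 640, 1024   # from 'ori_shape' and 'img_shape'
scale = new_size / ori_size      # should match 'scale_factor' in the dict

expected = ann_box.copy()
expected[:4] *= scale            # x, y, w, h are scaled; the angle is not

# The printed values are rounded, so compare with a loose tolerance.
consistent = np.allclose(expected, gt_box, rtol=1e-3)
print(scale, consistent)
```

The same dict also shows that 'bbox_fields' lists 'gt_bboxes_ignore' first and that this array has shape (0, 5), so the first array handed to bbox_flip for this image is empty even though gt_bboxes holds three boxes.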