open-mmlab / mmocr

OpenMMLab Text Detection, Recognition and Understanding Toolbox
https://mmocr.readthedocs.io/en/dev-1.x/
Apache License 2.0

Training SVTR on a WildReceiptDataset fails with "Each image sample should have one text annotation only" #2037

Open Dark1Forest opened 2 months ago

Dark1Forest commented 2 months ago

Prerequisite

Task

I'm using the official example scripts/configs for the officially supported tasks/models/datasets.

Branch

main branch https://github.com/open-mmlab/mmocr

Environment

Package Version

addict 2.4.0
asynctest 0.13.0
attrs 23.2.0
boto3 1.34.79
botocore 1.34.79
certifi 2024.2.2
charset-normalizer 3.3.2
clearml 1.15.0
click 8.1.7
codecov 2.1.13
colorama 0.4.6
contourpy 1.2.1
coverage 7.4.4
cycler 0.12.1
exceptiongroup 1.2.0
flake8 7.0.0
fonttools 4.51.0
furl 2.1.3
idna 3.6
imageio 2.34.0
imgaug 0.4.0
importlib_metadata 7.1.0
iniconfig 2.0.0
interrogate 1.5.0
isort 5.13.2
jmespath 1.0.1
jsonschema 4.21.1
jsonschema-specifications 2023.12.1
kiwisolver 1.4.5
kwarray 0.6.18
lanms_neo 1.0.2
lazy_loader 0.4
lmdb 1.4.1
markdown-it-py 3.0.0
matplotlib 3.8.4
mccabe 0.7.0
mdurl 0.1.2
mmcv 2.0.1
mmdet 3.1.0
mmengine 0.10.3
mmocr 1.0.1
networkx 3.3
numpy 1.26.4
opencv-python 4.9.0.80
orderedmultidict 1.0.1
packaging 24.0
parameterized 0.9.0
pathlib2 2.3.7.post1
pillow 10.3.0
pip 23.3.1
platformdirs 4.2.0
pluggy 1.4.0
psutil 5.9.8
py 1.11.0
pyclipper 1.3.0.post5
pycocotools 2.0.7
pycodestyle 2.11.1
pyflakes 3.2.0
Pygments 2.17.2
PyJWT 2.8.0
pyparsing 3.1.2
pytest 8.1.1
pytest-cov 5.0.0
pytest-runner 6.0.1
python-dateutil 2.9.0.post0
PyYAML 6.0.1
rapidfuzz 3.8.0
referencing 0.34.0
requests 2.31.0
rich 13.7.1
rpds-py 0.18.0
s3transfer 0.10.1
scikit-image 0.22.0
scipy 1.13.0
setuptools 68.2.2
shapely 2.0.3
six 1.16.0
tabulate 0.9.0
termcolor 2.4.0
terminaltables 3.1.10
tifffile 2024.2.12
toml 0.10.2
tomli 2.0.1
torch 1.13.1+cu117
torchaudio 0.13.1+cu117
torchvision 0.14.1+cu117
tqdm 4.66.2
typing_extensions 4.11.0
ubelt 1.3.5
urllib3 2.2.1
wheel 0.41.2
xdoctest 1.1.3
yapf 0.40.2
zipp 3.18.1

Reproduces the problem - code sample

_base_ = [
    '_base_svtr-tiny.py',
    '../_base_/default_runtime.py',
    # '../_base_/datasets/mjsynth.py',
    # '../_base_/datasets/synthtext.py',
    # '../_base_/datasets/cute80.py',
    # '../_base_/datasets/iiit5k.py',
    # '../_base_/datasets/svt.py',
    # '../_base_/datasets/svtp.py',
    # '../_base_/datasets/icdar2013.py',
    # '../_base_/datasets/icdar2015.py',
    '../_base_/schedules/schedule_adam_base.py',
]

train_cfg = dict(type='EpochBasedTrainLoop', max_epochs=20, val_interval=1)

optim_wrapper = dict(
    type='OptimWrapper',
    optimizer=dict(
        type='AdamW',
        lr=5 / (10 ** 4) * 2048 / 2048,
        betas=(0.9, 0.99),
        eps=8e-8,
        weight_decay=0.05))

param_scheduler = [
    dict(
        type='LinearLR',
        start_factor=0.5,
        end_factor=1.,
        end=2,
        verbose=False,
        convert_to_iter_based=True),
    dict(
        type='CosineAnnealingLR',
        T_max=19,
        begin=2,
        end=20,
        verbose=False,
        convert_to_iter_based=True),
]

# dataset settings
# train_list = [_base_.mjsynth_textrecog_train, _base_.synthtext_textrecog_train]
# test_list = [
#     _base_.cute80_textrecog_test, _base_.iiit5k_textrecog_test,
#     _base_.svt_textrecog_test, _base_.svtp_textrecog_test,
#     _base_.icdar2013_textrecog_test, _base_.icdar2015_textrecog_test
# ]

# val_evaluator = dict(
#     dataset_prefixes=[])
# test_evaluator = val_evaluator

# train_pipeline = [
#     dict(type='LoadImageFromFile'),
#     dict(type='LoadKIEAnnotations'),
#     dict(type='Resize', scale=(512, 1024), keep_ratio=True),
#     dict(type='PackKIEInputs')
# ]
# test_pipeline = [
#     dict(type='LoadImageFromFile'),
#     dict(type='LoadKIEAnnotations'),
#     dict(type='Resize', scale=(512, 1024), keep_ratio=True),
#     dict(type='PackKIEInputs', meta_keys=('img_path',)),
# ]

wildreceipt_openset_train = [
    dict(
        type='WildReceiptDataset',
        metainfo=dict(category=[
            dict(id=0, name='relic_entry'),
            dict(id=1, name='relic_entry_main'),
            dict(id=2, name='relic_level'),
            dict(id=3, name='relic_parts'),
            dict(id=3, name='relic_suit')
        ]),
        ann_file='/data/dataset/mmdetection/HonkaiKIELabel/result_w.json',
        # pipeline=train_pipeline
        pipeline=None,
    )
]

wildreceipt_openset_test = [
    dict(
        type='WildReceiptDataset',
        metainfo=dict(category=[
            dict(id=0, name='relic_entry'),
            dict(id=1, name='relic_entry_main'),
            dict(id=2, name='relic_level'),
            dict(id=3, name='relic_parts'),
            dict(id=3, name='relic_suit')
        ]),
        ann_file='/data/dataset/mmdetection/HonkaiKIELabel/result_w.json',
        test_mode=True,
        # pipeline=test_pipeline
        pipeline=None,
    )
]

train_dataloader = dict(
    batch_size=512,
    num_workers=24,
    persistent_workers=True,
    pin_memory=True,
    sampler=dict(type='DefaultSampler', shuffle=True),
    dataset=dict(
        type='ConcatDataset',
        datasets=wildreceipt_openset_train,
        # pipeline=train_pipeline
        pipeline=_base_.test_pipeline
    )
)

val_dataloader = dict(
    batch_size=128,
    num_workers=8,
    persistent_workers=True,
    pin_memory=True,
    drop_last=False,
    sampler=dict(type='DefaultSampler', shuffle=False),
    dataset=dict(
        type='ConcatDataset',
        datasets=wildreceipt_openset_test,
        pipeline=_base_.test_pipeline
    )
)

test_dataloader = val_dataloader

Reproduces the problem - command or script

CUDA_VISIBLE_DEVICES=0 python tools/train.py /mnt/project/mmocr/configs/textrecog/svtr/svtr-tiny_20e_st_mji.py --amp

Reproduces the problem - error message

Traceback (most recent call last):
  File "/mnt/project/mmocr/tools/train.py", line 114, in <module>
    main()
  File "/mnt/project/mmocr/tools/train.py", line 110, in main
    runner.train()
  File "/data/conda/envs/mmocr/lib/python3.10/site-packages/mmengine/runner/runner.py", line 1777, in train
    model = self.train_loop.run()  # type: ignore
  File "/data/conda/envs/mmocr/lib/python3.10/site-packages/mmengine/runner/loops.py", line 96, in run
    self.run_epoch()
  File "/data/conda/envs/mmocr/lib/python3.10/site-packages/mmengine/runner/loops.py", line 111, in run_epoch
    for idx, data_batch in enumerate(self.dataloader):
  File "/data/conda/envs/mmocr/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 628, in __next__
    data = self._next_data()
  File "/data/conda/envs/mmocr/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 1333, in _next_data
    return self._process_data(data)
  File "/data/conda/envs/mmocr/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 1359, in _process_data
    data.reraise()
  File "/data/conda/envs/mmocr/lib/python3.10/site-packages/torch/_utils.py", line 543, in reraise
    raise exception
AssertionError: Caught AssertionError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/data/conda/envs/mmocr/lib/python3.10/site-packages/torch/utils/data/_utils/worker.py", line 302, in _worker_loop
    data = fetcher.fetch(index)
  File "/data/conda/envs/mmocr/lib/python3.10/site-packages/torch/utils/data/_utils/fetch.py", line 58, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/data/conda/envs/mmocr/lib/python3.10/site-packages/torch/utils/data/_utils/fetch.py", line 58, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/data/conda/envs/mmocr/lib/python3.10/site-packages/mmengine/dataset/dataset_wrapper.py", line 171, in __getitem__
    return self.datasets[dataset_idx][sample_idx]
  File "/data/conda/envs/mmocr/lib/python3.10/site-packages/mmengine/dataset/base_dataset.py", line 410, in __getitem__
    data = self.prepare_data(idx)
  File "/data/conda/envs/mmocr/lib/python3.10/site-packages/mmengine/dataset/base_dataset.py", line 793, in prepare_data
    return self.pipeline(data_info)
  File "/data/conda/envs/mmocr/lib/python3.10/site-packages/mmengine/dataset/base_dataset.py", line 60, in __call__
    data = t(data)
  File "/data/conda/envs/mmocr/lib/python3.10/site-packages/mmcv/transforms/base.py", line 12, in __call__
    return self.transform(results)
  File "/data/conda/envs/mmocr/lib/python3.10/site-packages/mmocr/datasets/transforms/formatting.py", line 202, in transform
    assert len(
AssertionError: Each image sample should have one text annotation only
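
The last frame is an assertion inside a transform in mmocr/datasets/transforms/formatting.py. To make the failure mode concrete, here is a minimal, self-contained sketch of that kind of check. The function name `pack_recog_sample` and the `gt_texts` key are assumptions chosen for illustration; this is not MMOCR's actual code. It shows that a sample with one text passes, while a sample carrying several texts (as a KIE-style dataset with many text boxes per image would likely produce) trips the same assertion message.

```python
def pack_recog_sample(results: dict) -> dict:
    """Hypothetical stand-in for a text-recognition packing step.

    Assumption: the pipeline collects every transcription of an image
    into a list under the 'gt_texts' key before packing.
    """
    gt_texts = results.get('gt_texts')
    if gt_texts:
        # A recognition sample is one cropped text image, so exactly
        # one transcription is expected.
        assert len(gt_texts) == 1, \
            'Each image sample should have one text annotation only'
        return {'gt_text': gt_texts[0]}
    return {}


# A recognition-style sample: one crop, one transcription.
ok = pack_recog_sample({'gt_texts': ['HP +10%']})

# A KIE-style sample: one full image with several text boxes.
try:
    pack_recog_sample({'gt_texts': ['HP +10%', 'Lv. 15']})
    failed = False
    message = ''
except AssertionError as err:
    failed = True
    message = str(err)
```

If this reading is right, the mismatch would be between the recognition pipeline (`_base_.test_pipeline` from the SVTR config) and a dataset whose samples each hold many text annotations.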

Additional information

The dataset is loaded through WildReceiptDataset (see the config above).
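
For reference, a rough sketch of how a multi-text annotation record could be split into one record per text, which is the shape a recognition model expects. This assumes the WildReceipt-style layout, where each line of the annotation file is a JSON object with a `file_name` and an `annotations` list whose entries carry `box` and `text`; whether result_w.json actually follows that layout is not shown in this issue. The helper name `split_into_single_text_samples` and the example file name are hypothetical, and cropping the image to each box is left out.

```python
import json


def split_into_single_text_samples(ann_line: str) -> list[dict]:
    """Turn one multi-text annotation record into one record per text.

    Assumes a WildReceipt-style JSON line:
    {"file_name": ..., "annotations": [{"box": [...], "text": ...}, ...]}
    """
    record = json.loads(ann_line)
    samples = []
    for ann in record.get('annotations', []):
        text = ann.get('text', '')
        if not text:
            continue  # skip boxes without a transcription
        samples.append({
            'file_name': record['file_name'],
            'box': ann['box'],  # region to crop for this text
            'text': text,
        })
    return samples


# Hypothetical example record with two text boxes on one image.
example_line = json.dumps({
    'file_name': 'relic_0001.png',
    'annotations': [
        {'box': [0, 0, 80, 0, 80, 20, 0, 20], 'text': 'HP +10%', 'label': 0},
        {'box': [0, 30, 60, 30, 60, 50, 0, 50], 'text': 'Lv. 15', 'label': 2},
    ],
})
samples = split_into_single_text_samples(example_line)
```

Each resulting record holds exactly one text, so a check like "one text annotation per sample" would no longer fail on it.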