open-mmlab / mmocr

OpenMMLab Text Detection, Recognition and Understanding Toolbox
https://mmocr.readthedocs.io/en/dev-1.x/
Apache License 2.0

[Bug] TypeError: 'Polygon' object is not iterable, training ocr detection model mmocr==1.0.0 #1889

Closed mazharsaif closed 1 year ago

mazharsaif commented 1 year ago

Prerequisite

Task

I'm using the official example scripts/configs for the officially supported tasks/models/datasets.

Branch

main branch https://github.com/open-mmlab/mmocr

Environment

addict==2.4.0 altair==4.2.0 apted==1.0.3 arabic-reshaper==2.1.3 argon2-cffi==20.1.0 asgiref==3.4.1 asynctest==0.13.0 attrs==23.1.0 backcall==0.2.0 backports.zoneinfo==0.2.1 base58==2.1.1 beautifulsoup4==4.10.0 bleach==4.1.0 bs4==0.0.1 certifi==2022.12.7 charset-normalizer==3.1.0 click==7.1.2 cmake==3.26.1 codecov==2.1.13 colorama==0.4.6 comm==0.1.3 coverage==7.2.5 cycler==0.10.0 Cython==0.29.34 debugpy==1.6.7 decorator==4.4.2 defusedxml==0.7.1 Distance==0.1.3 easyocr==1.3.2 editdistance==0.6.0 efficientnet==1.0.0 entrypoints==0.3 essential-generators==1.0 fastapi==0.68.0 filelock==3.6.0 flake8==6.0.0 Flask==2.0.1 fonttools==4.26.2 future==0.18.2 gitdb==4.0.9 idna==3.4 imagecorruptions==1.1.2 imageio==2.9.0 imgaug==0.3.0 importlib-metadata==6.6.0 iniconfig==1.1.1 interrogate==1.5.0 ipykernel==6.22.0 ipython==7.26.0 ipython-genutils==0.2.0 ipywidgets==7.6.5 isort==5.12.0 itsdangerous==2.0.1 jarowinkler==1.0.2 jedi==0.18.0 joblib==1.0.1 jsonlines==3.0.0 jsonschema==3.2.0 jupyter-contrib-core==0.3.3 jupyter-highlight-selected-word==0.2.0 jupyter-latex-envs==1.4.6 jupyter-nbextensions-configurator==0.4.1 jupyter_client==8.2.0 jupyter_core==5.3.0 jupyterlab-pygments==0.1.2 jupyterlab-widgets==1.0.2 Keras-Applications==1.0.8 kiwisolver==1.3.1 kwarray==0.6.12 lanms-neo==1.0.2 Levenshtein==0.18.1 lit==16.0.0 lmdb==1.3.0 lxml==4.6.3 Markdown==3.4.3 markdown-it-py==2.2.0 MarkupSafe==2.0.1 matplotlib==3.4.2 matplotlib-inline==0.1.2 mccabe==0.7.0 mdurl==0.1.2 mkl-fft==1.3.1 mkl-random @ file:///tmp/build/80754af9/mkl_random_1626186064646/work mkl-service==2.4.0 -e git+https://github.com/open-mmlab/mmcv.git@89a264527e3dc9c5eebed6195faa709d446c7a9c#egg=mmcv mmdet==3.0.0 mmengine==0.7.2 mmlvis==10.5.3 -e git+https://github.com/open-mmlab/mmocr.git@d7c59f3325aaf4cbf6ddd3ec69f03230bc582d19#egg=mmocr mmpycocotools==12.0.3 model-index==0.1.11 mpmath==1.3.0 nbclient==0.5.4 nest-asyncio==1.5.6 networkx==2.5.1 ninja==1.11.1 nltk==3.6.2 numpy @ 
file:///croot/numpy_and_numpy_base_1682520569166/work nvidia-cublas-cu11==11.10.3.66 nvidia-cuda-cupti-cu11==11.7.101 nvidia-cuda-nvrtc-cu11==11.7.99 nvidia-cuda-runtime-cu11==11.7.99 nvidia-cudnn-cu11==8.5.0.96 nvidia-cufft-cu11==10.9.0.58 nvidia-curand-cu11==10.2.10.91 nvidia-cusolver-cu11==11.4.0.1 nvidia-cusparse-cu11==11.7.4.91 nvidia-nccl-cu11==2.14.3 nvidia-nvtx-cu11==11.7.91 onnx==1.11.0 opencv-python==4.6.0.66 opencv-python-headless==4.1.2.30 openmim==0.3.7 ordered-set==4.1.0 packaging==21.0 pandas==2.0.0 pandocfilters==1.4.3 parameterized==0.9.0 parso==0.8.2 pexpect==4.8.0 pickleshare==0.7.5 Pillow==9.4.0 platformdirs==3.5.0 pluggy==1.0.0 Polygon3==3.0.9.1 prefetch-generator==1.0.3 prettytable==3.2.0 prometheus-client==0.11.0 prompt-toolkit==3.0.20 psutil==5.9.5 ptyprocess==0.7.0 py==1.11.0 pyarrow==6.0.1 pyclipper==1.3.0 pycocotools==2.0.6 pycodestyle==2.10.0 pydantic==1.8.2 pydeck==0.7.1 pyflakes==3.0.1 Pygments==2.14.0 Pympler==1.0.1 pyparsing==2.4.7 pyrsistent==0.18.0 pytest==7.1.1 pytest-cov==3.0.0 pytest-runner==6.0.0 python-bidi==0.4.2 python-dateutil==2.8.2 python-multipart==0.0.5 pytz==2023.3 pytz-deprecation-shim==0.1.0.post0 PyWavelets==1.1.1 PyYAML==5.4.1 pyzmq==25.0.2 rapidfuzz==2.0.7 regex==2021.8.28 requests==2.29.0 rich==13.3.3 sacremoses==0.0.49 scikit-image==0.18.1 scikit-learn==0.24.2 scipy==1.10.1 seaborn==0.11.2 Send2Trash==1.8.0 seqeval==1.2.2 Shapely==1.7.1 SharedArray==3.2.1 six @ file:///tmp/build/80754af9/six_1644875935023/work sklearn==0.0 smmap==5.0.0 soupsieve==2.3.1 starlette==0.14.2 streamlit==1.3.1 sympy==1.11.1 tabulate==0.9.0 termcolor==2.2.0 terminado==0.11.1 terminaltables==3.1.10 testpath==0.5.0 threadpoolctl==2.2.0 tifffile==2021.6.6 tokenizers==0.11.6 toml==0.10.2 tomli==2.0.1 toolz==0.11.2 torch==1.10.2 torchvision==0.11.3 tornado==6.3.1 tqdm==4.65.0 traitlets==5.9.0 transformers==4.17.0 triton==2.0.0 typing_extensions==4.5.0 tzdata==2023.3 tzlocal==4.1 ubelt==1.2.4 urllib3==1.26.15 uvloop==0.16.0 validators==0.18.2 
watchdog==2.1.6 wcwidth==0.2.5 webencodings==0.5.1 Werkzeug==2.0.1 widgetsnbextension==3.5.2 xdoctest==1.1.1 yacs==0.1.8 yapf==0.32.0 zipp==3.15.0

mmcv==2.0.0 mmdet==3.0.0 mmocr==1.0.0

Reproduces the problem - code sample

-

Reproduces the problem - command or script

I am trying to train a model on icdar2015 with the default configs, following the quick run steps from the documentation site:

  1. Download the data

    python tools/dataset_converters/prepare_dataset.py icdar2015 --task textdet
  2. Train the model

    (open-mmlab) mazhards@provenai-2080:~/DBnetpp/mmocr$ python tools/train.py configs/textdet/dbnet/dbnet_resnet50_1200e_icdar2015.py --work-dir dbnet/

    File "/home/mazhards/DBnetpp/mmocr/mmocr/datasets/transforms/wrappers.py", line 202, in _augment_polygons
      for point in poly:
    TypeError: 'Polygon' object is not iterable

Reproduces the problem - error message

(open-mmlab) mazhards@provenai-2080:~/DBnetpp/mmocr$ python tools/train.py configs/textdet/dbnet/dbnet_resnet50_1200e_icdar2015.py --work-dir dbnet/
05/03 18:53:04 - mmengine - INFO -

System environment:
    sys.platform: linux
    Python: 3.8.16 (default, Mar 2 2023, 03:21:46) [GCC 11.2.0]
    CUDA available: True
    numpy_random_seed: 1720833704
    GPU 0: GeForce RTX 2080 Ti
    CUDA_HOME: /usr/local/cuda-11.0
    NVCC: Cuda compilation tools, release 11.0, V11.0.221
    GCC: gcc (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0
    PyTorch: 1.10.2
    PyTorch compiling details: PyTorch built with:

Runtime environment:
    cudnn_benchmark: False
    mp_cfg: {'mp_start_method': 'fork', 'opencv_num_threads': 0}
    dist_cfg: {'backend': 'nccl'}
    seed: None
    Distributed launcher: none
    Distributed training: False
    GPU number: 1

05/03 18:53:04 - mmengine - INFO - Config: model = dict( type='DBNet', backbone=dict( type='mmdet.ResNet', depth=50, num_stages=4, out_indices=(0, 1, 2, 3), frozen_stages=-1, norm_cfg=dict(type='BN', requires_grad=True), norm_eval=True, style='pytorch', init_cfg=dict(type='Pretrained', checkpoint='torchvision://resnet50')), neck=dict( type='FPNC', in_channels=[256, 512, 1024, 2048], lateral_channels=256), det_head=dict( type='DBHead', in_channels=256, module_loss=dict(type='DBModuleLoss'), postprocessor=dict(type='DBPostprocessor', text_repr_type='quad')), data_preprocessor=dict( type='TextDetDataPreprocessor', mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], bgr_to_rgb=True, pad_size_divisor=32)) train_pipeline = [ dict(type='LoadImageFromFile', color_type='color_ignore_orientation'), dict( type='LoadOCRAnnotations', with_bbox=True, with_polygon=True, with_label=True), dict( type='TorchVisionWrapper', op='ColorJitter', brightness=0.12549019607843137, saturation=0.5), dict( type='ImgAugWrapper', args=[['Fliplr', 0.5], { 'cls': 'Affine', 'rotate': [-10, 10] }, ['Resize', [0.5, 3.0]]]), dict(type='RandomCrop', min_side_ratio=0.1), dict(type='Resize', scale=(640, 640), keep_ratio=True), dict(type='Pad', size=(640, 640)), dict( type='PackTextDetInputs', meta_keys=('img_path', 'ori_shape', 'img_shape')) ] test_pipeline = [ dict(type='LoadImageFromFile', color_type='color_ignore_orientation'), dict(type='Resize', scale=(4068, 1024), keep_ratio=True), dict( type='LoadOCRAnnotations', with_polygon=True, with_bbox=True, with_label=True), dict( type='PackTextDetInputs', meta_keys=('img_path', 'ori_shape', 'img_shape', 'scale_factor')) ] icdar2015_textdet_data_root = 'data/totaltext' icdar2015_textdet_train = dict( type='OCRDataset', data_root='data/totaltext', ann_file='textdet_train.json', filter_cfg=dict(filter_empty_gt=True, min_size=32), pipeline=[ dict(type='LoadImageFromFile', color_type='color_ignore_orientation'), dict( type='LoadOCRAnnotations', 
with_bbox=True, with_polygon=True, with_label=True), dict( type='TorchVisionWrapper', op='ColorJitter', brightness=0.12549019607843137, saturation=0.5), dict( type='ImgAugWrapper', args=[['Fliplr', 0.5], { 'cls': 'Affine', 'rotate': [-10, 10] }, ['Resize', [0.5, 3.0]]]), dict(type='RandomCrop', min_side_ratio=0.1), dict(type='Resize', scale=(640, 640), keep_ratio=True), dict(type='Pad', size=(640, 640)), dict( type='PackTextDetInputs', meta_keys=('img_path', 'ori_shape', 'img_shape')) ]) icdar2015_textdet_test = dict( type='OCRDataset', data_root='data/totaltext', ann_file='textdet_test.json', test_mode=False, pipeline=[ dict(type='LoadImageFromFile', color_type='color_ignore_orientation'), dict(type='Resize', scale=(4068, 1024), keep_ratio=True), dict( type='LoadOCRAnnotations', with_polygon=True, with_bbox=True, with_label=True), dict( type='PackTextDetInputs', meta_keys=('img_path', 'ori_shape', 'img_shape', 'scale_factor')) ]) default_scope = 'mmocr' env_cfg = dict( cudnn_benchmark=False, mp_cfg=dict(mp_start_method='fork', opencv_num_threads=0), dist_cfg=dict(backend='nccl')) randomness = dict(seed=None) default_hooks = dict( timer=dict(type='IterTimerHook'), logger=dict(type='LoggerHook', interval=5), param_scheduler=dict(type='ParamSchedulerHook'), checkpoint=dict(type='CheckpointHook', interval=20), sampler_seed=dict(type='DistSamplerSeedHook'), sync_buffer=dict(type='SyncBuffersHook'), visualization=dict( type='VisualizationHook', interval=1, enable=False, show=False, draw_gt=False, draw_pred=False)) log_level = 'INFO' log_processor = dict(type='LogProcessor', window_size=10, by_epoch=True) load_from = None resume = False val_evaluator = dict(type='HmeanIOUMetric') test_evaluator = dict(type='HmeanIOUMetric') vis_backends = [dict(type='LocalVisBackend')] visualizer = dict( type='TextDetLocalVisualizer', name='visualizer', vis_backends=[dict(type='LocalVisBackend')]) optim_wrapper = dict( type='OptimWrapper', optimizer=dict(type='SGD', lr=0.002, 
momentum=0.9, weight_decay=0.0001)) train_cfg = dict(type='EpochBasedTrainLoop', max_epochs=1200, val_interval=20) val_cfg = dict(type='ValLoop') test_cfg = dict(type='TestLoop') param_scheduler = [ dict(type='LinearLR', end=100, start_factor=0.001), dict(type='PolyLR', power=0.9, eta_min=1e-07, begin=100, end=1200) ] train_dataloader = dict( batch_size=1, num_workers=24, persistent_workers=True, sampler=dict(type='DefaultSampler', shuffle=True), dataset=dict( type='OCRDataset', data_root='data/totaltext', ann_file='textdet_train.json', filter_cfg=dict(filter_empty_gt=True, min_size=32), pipeline=[ dict( type='LoadImageFromFile', color_type='color_ignore_orientation'), dict( type='LoadOCRAnnotations', with_bbox=True, with_polygon=True, with_label=True), dict( type='TorchVisionWrapper', op='ColorJitter', brightness=0.12549019607843137, saturation=0.5), dict( type='ImgAugWrapper', args=[['Fliplr', 0.5], { 'cls': 'Affine', 'rotate': [-10, 10] }, ['Resize', [0.5, 3.0]]]), dict(type='RandomCrop', min_side_ratio=0.1), dict(type='Resize', scale=(640, 640), keep_ratio=True), dict(type='Pad', size=(640, 640)), dict( type='PackTextDetInputs', meta_keys=('img_path', 'ori_shape', 'img_shape')) ])) val_dataloader = dict( batch_size=1, num_workers=4, persistent_workers=True, sampler=dict(type='DefaultSampler', shuffle=False), dataset=dict( type='OCRDataset', data_root='data/totaltext', ann_file='textdet_test.json', test_mode=False, pipeline=[ dict( type='LoadImageFromFile', color_type='color_ignore_orientation'), dict(type='Resize', scale=(4068, 1024), keep_ratio=True), dict( type='LoadOCRAnnotations', with_polygon=True, with_bbox=True, with_label=True), dict( type='PackTextDetInputs', meta_keys=('img_path', 'ori_shape', 'img_shape', 'scale_factor')) ])) test_dataloader = dict( batch_size=1, num_workers=4, persistent_workers=True, sampler=dict(type='DefaultSampler', shuffle=False), dataset=dict( type='OCRDataset', data_root='data/totaltext', ann_file='textdet_test.json', 
test_mode=False, pipeline=[ dict( type='LoadImageFromFile', color_type='color_ignore_orientation'), dict(type='Resize', scale=(4068, 1024), keep_ratio=True), dict( type='LoadOCRAnnotations', with_polygon=True, with_bbox=True, with_label=True), dict( type='PackTextDetInputs', meta_keys=('img_path', 'ori_shape', 'img_shape', 'scale_factor')) ])) auto_scale_lr = dict(base_batch_size=16) launcher = 'none' work_dir = 'dbnet/'

/home/mazhards/anaconda3/envs/open-mmlab/lib/python3.8/site-packages/mmdet/evaluation/metrics/lvis_metric.py:22: UserWarning: mmlvis is deprecated, please install official lvis-api by "pip install git+https://github.com/lvis-dataset/lvis-api.git"
  warnings.warn(
05/03 18:53:08 - mmengine - INFO - Distributed training is not used, all SyncBatchNorm (SyncBN) layers in the model will be automatically reverted to BatchNormXd layers if they are used.
05/03 18:53:08 - mmengine - INFO - Hooks will be executed in the following order:

before_run:
(VERY_HIGH ) RuntimeInfoHook
(BELOW_NORMAL) LoggerHook

before_train:
(VERY_HIGH ) RuntimeInfoHook
(NORMAL ) IterTimerHook
(VERY_LOW ) CheckpointHook

before_train_epoch:
(VERY_HIGH ) RuntimeInfoHook
(NORMAL ) IterTimerHook
(NORMAL ) DistSamplerSeedHook

before_train_iter:
(VERY_HIGH ) RuntimeInfoHook
(NORMAL ) IterTimerHook

after_train_iter:
(VERY_HIGH ) RuntimeInfoHook
(NORMAL ) IterTimerHook
(BELOW_NORMAL) LoggerHook
(LOW ) ParamSchedulerHook
(VERY_LOW ) CheckpointHook

after_train_epoch:
(NORMAL ) IterTimerHook
(NORMAL ) SyncBuffersHook
(LOW ) ParamSchedulerHook
(VERY_LOW ) CheckpointHook

before_val_epoch:
(NORMAL ) IterTimerHook

before_val_iter:
(NORMAL ) IterTimerHook

after_val_iter:
(NORMAL ) IterTimerHook
(NORMAL ) VisualizationHook
(BELOW_NORMAL) LoggerHook

after_val_epoch:
(VERY_HIGH ) RuntimeInfoHook
(NORMAL ) IterTimerHook
(BELOW_NORMAL) LoggerHook
(LOW ) ParamSchedulerHook
(VERY_LOW ) CheckpointHook

after_train:
(VERY_LOW ) CheckpointHook

before_test_epoch:
(NORMAL ) IterTimerHook

before_test_iter:
(NORMAL ) IterTimerHook

after_test_iter:
(NORMAL ) IterTimerHook
(NORMAL ) VisualizationHook
(BELOW_NORMAL) LoggerHook

after_test_epoch:
(VERY_HIGH ) RuntimeInfoHook
(NORMAL ) IterTimerHook
(BELOW_NORMAL) LoggerHook

after_run:
(BELOW_NORMAL) LoggerHook


/home/mazhards/anaconda3/envs/open-mmlab/lib/python3.8/site-packages/torch/utils/data/dataloader.py:478: UserWarning: This DataLoader will create 24 worker processes in total. Our suggested max number of worker in current system is 20, which is smaller than what this DataLoader is going to create. Please be aware that excessive worker creation might get DataLoader running slow or even freeze, lower the worker number to avoid potential slowness/freeze if necessary.
  warnings.warn(_create_warning_msg(
05/03 18:53:09 - mmengine - INFO - load model from: torchvision://resnet50
05/03 18:53:09 - mmengine - INFO - Loads checkpoint by torchvision backend from path: torchvision://resnet50
05/03 18:53:09 - mmengine - WARNING - The model and loaded state dict do not match exactly

unexpected key in source state_dict: fc.weight, fc.bias

05/03 18:53:09 - mmengine - WARNING - "FileClient" will be deprecated in future. Please use io functions in https://mmengine.readthedocs.io/en/latest/api/fileio.html#file-io
05/03 18:53:09 - mmengine - WARNING - "HardDiskBackend" is the alias of "LocalBackend" and the former will be deprecated in future.
05/03 18:53:09 - mmengine - INFO - Checkpoints will be saved to /home/mazhards/DBnetpp/mmocr/dbnet.
/home/mazhards/anaconda3/envs/open-mmlab/lib/python3.8/site-packages/torch/utils/data/dataloader.py:478: UserWarning: This DataLoader will create 24 worker processes in total. Our suggested max number of worker in current system is 20, which is smaller than what this DataLoader is going to create. Please be aware that excessive worker creation might get DataLoader running slow or even freeze, lower the worker number to avoid potential slowness/freeze if necessary.
  warnings.warn(_create_warning_msg(
Traceback (most recent call last):
  File "tools/train.py", line 115, in <module>
    main()
  File "tools/train.py", line 111, in main
    runner.train()
  File "/home/mazhards/.local/lib/python3.8/site-packages/mmengine/runner/runner.py", line 1706, in train
    model = self.train_loop.run()  # type: ignore
  File "/home/mazhards/.local/lib/python3.8/site-packages/mmengine/runner/loops.py", line 96, in run
    self.run_epoch()
  File "/home/mazhards/.local/lib/python3.8/site-packages/mmengine/runner/loops.py", line 111, in run_epoch
    for idx, data_batch in enumerate(self.dataloader):
  File "/home/mazhards/anaconda3/envs/open-mmlab/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 521, in __next__
    data = self._next_data()
  File "/home/mazhards/anaconda3/envs/open-mmlab/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1203, in _next_data
    return self._process_data(data)
  File "/home/mazhards/anaconda3/envs/open-mmlab/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1229, in _process_data
    data.reraise()
  File "/home/mazhards/anaconda3/envs/open-mmlab/lib/python3.8/site-packages/torch/_utils.py", line 434, in reraise
    raise exception
TypeError: Caught TypeError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/home/mazhards/anaconda3/envs/open-mmlab/lib/python3.8/site-packages/torch/utils/data/_utils/worker.py", line 287, in _worker_loop
    data = fetcher.fetch(index)
  File "/home/mazhards/anaconda3/envs/open-mmlab/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 49, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/mazhards/anaconda3/envs/open-mmlab/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 49, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/mazhards/.local/lib/python3.8/site-packages/mmengine/dataset/base_dataset.py", line 413, in __getitem__
    data = self.prepare_data(idx)
  File "/home/mazhards/.local/lib/python3.8/site-packages/mmengine/dataset/base_dataset.py", line 797, in prepare_data
    return self.pipeline(data_info)
  File "/home/mazhards/.local/lib/python3.8/site-packages/mmengine/dataset/base_dataset.py", line 59, in __call__
    data = t(data)
  File "/home/mazhards/DBnetpp/mmcv/mmcv/transforms/base.py", line 12, in __call__
    return self.transform(results)
  File "/home/mazhards/DBnetpp/mmocr/mmocr/datasets/transforms/wrappers.py", line 88, in transform
    if not self._augment_annotations(aug, ori_shape, results):
  File "/home/mazhards/DBnetpp/mmocr/mmocr/datasets/transforms/wrappers.py", line 117, in _augment_annotations
    transformed_polygons, removed_poly_inds = self._augment_polygons(
  File "/home/mazhards/DBnetpp/mmocr/mmocr/datasets/transforms/wrappers.py", line 202, in _augment_polygons
    for point in poly:
TypeError: 'Polygon' object is not iterable

Additional information

The error is probably caused by a package version issue (imgaug). It arises in the _augment_polygons function in mmocr/datasets/transforms/wrappers.py:

    for point in poly:
        new_poly.append(np.array(point, dtype=np.float32))

Apparently the poly object is not iterable.

Printing the poly object just before this point gives the output below:

    print("poly", poly)
    print('type poly', type(poly))

    Polygon([(x=892.600, y=1668.360), (x=900.481, y=2001.301), (x=1020.032, y=1998.470), (x=1139.583, y=1995.639), (x=1131.701, y=1662.698), (x=1012.151, y=1665.529)] (6 points), label=None)
    'Polygon' object is not iterable
    type poly <class 'imgaug.augmentables.polys.Polygon'>
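For anyone who wants to see the failure mode in isolation, here is a minimal sketch. It assumes the polygon object exposes its vertices through an `exterior` attribute (as far as I know imgaug's `Polygon` does, but treat that as an assumption) and reads the points from there instead of iterating the object. `polygon_points` and `StandInPolygon` are hypothetical names used only for illustration; they are not part of mmocr or imgaug, and this is not the fix that was applied upstream.

```python
def polygon_points(poly):
    """Return the vertices of `poly` as a list of (x, y) float tuples.

    Prefers the `exterior` attribute (assumed to hold the vertex
    coordinates) and falls back to iterating `poly` directly.
    """
    exterior = getattr(poly, "exterior", None)
    rows = exterior if exterior is not None else poly
    return [(float(x), float(y)) for x, y in rows]


class StandInPolygon:
    """Hypothetical stand-in for the polygon in the report:
    it exposes `exterior` but defines no `__iter__`."""

    def __init__(self, exterior):
        self.exterior = exterior


poly = StandInPolygon([(892.600, 1668.360), (900.481, 2001.301)])

try:
    iter(poly)  # mirrors `for point in poly:` from the traceback
    iterable = True
except TypeError:
    iterable = False

points = polygon_points(poly)
print("iterable:", iterable)
print("points:", points)
```

The stand-in class keeps the sketch runnable without imgaug installed. With a real polygon object the same helper should work as long as `exterior` yields pairs of coordinates.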

Mountchicken commented 1 year ago

Hi @mazharsaif, upgrading imgaug to version 0.4.0 will solve this problem.

pip install imgaug==0.4.0
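To confirm which imgaug version an environment actually picks up before and after the upgrade, something like the sketch below may help. It uses the standard-library `importlib.metadata` (available from Python 3.8); `version_tuple`, `meets_minimum`, and `check_imgaug` are hypothetical helper names, and the parser assumes a purely numeric dotted version string such as `0.3.0`.

```python
from importlib import metadata


def version_tuple(version):
    """Turn a purely numeric dotted version such as '0.4.0' into (0, 4, 0).

    Assumption: no suffixes like 'rc1' or '.post1' are present.
    """
    return tuple(int(part) for part in version.split("."))


def meets_minimum(installed, minimum="0.4.0"):
    """Return True if `installed` is at least `minimum`."""
    return version_tuple(installed) >= version_tuple(minimum)


def check_imgaug(minimum="0.4.0"):
    """Describe whether the installed imgaug satisfies `minimum`."""
    try:
        installed = metadata.version("imgaug")
    except metadata.PackageNotFoundError:
        return "imgaug is not installed"
    if meets_minimum(installed, minimum):
        return f"imgaug {installed} is new enough"
    return f"imgaug {installed} is older than {minimum}; consider upgrading"


print(check_imgaug())
```

In the environment listed in this issue (imgaug==0.3.0), `meets_minimum("0.3.0")` evaluates to `False`, which matches the suggestion to upgrade.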