open-mmlab / mmocr

OpenMMLab Text Detection, Recognition and Understanding Toolbox
https://mmocr.readthedocs.io/en/dev-1.x/
Apache License 2.0

[Bug] Using KIELocalVisualizer raises an error #2038

Open Dark1Forest opened 5 months ago

Dark1Forest commented 5 months ago

Prerequisite

Task

I'm using the official example scripts/configs for the officially supported tasks/models/datasets.

Branch

main branch https://github.com/open-mmlab/mmocr

Environment

Package                   Version
addict                    2.4.0
asynctest                 0.13.0
attrs                     23.2.0
boto3                     1.34.79
botocore                  1.34.79
certifi                   2024.2.2
charset-normalizer        3.3.2
clearml                   1.15.0
click                     8.1.7
codecov                   2.1.13
colorama                  0.4.6
contourpy                 1.2.1
coverage                  7.4.4
cycler                    0.12.1
exceptiongroup            1.2.0
flake8                    7.0.0
fonttools                 4.51.0
furl                      2.1.3
idna                      3.6
imageio                   2.34.0
imgaug                    0.4.0
importlib_metadata        7.1.0
iniconfig                 2.0.0
interrogate               1.5.0
isort                     5.13.2
jmespath                  1.0.1
jsonschema                4.21.1
jsonschema-specifications 2023.12.1
kiwisolver                1.4.5
kwarray                   0.6.18
lanms_neo                 1.0.2
lazy_loader               0.4
lmdb                      1.4.1
markdown-it-py            3.0.0
matplotlib                3.8.4
mccabe                    0.7.0
mdurl                     0.1.2
mmcv                      2.0.1
mmdet                     3.1.0
mmengine                  0.10.3
mmocr                     1.0.1
networkx                  3.3
numpy                     1.26.4
opencv-python             4.9.0.80
orderedmultidict          1.0.1
packaging                 24.0
parameterized             0.9.0
pathlib2                  2.3.7.post1
pillow                    10.3.0
pip                       23.3.1
platformdirs              4.2.0
pluggy                    1.4.0
psutil                    5.9.8
py                        1.11.0
pyclipper                 1.3.0.post5
pycocotools               2.0.7
pycodestyle               2.11.1
pyflakes                  3.2.0
Pygments                  2.17.2
PyJWT                     2.8.0
pyparsing                 3.1.2
pytest                    8.1.1
pytest-cov                5.0.0
pytest-runner             6.0.1
python-dateutil           2.9.0.post0
PyYAML                    6.0.1
rapidfuzz                 3.8.0
referencing               0.34.0
requests                  2.31.0
rich                      13.7.1
rpds-py                   0.18.0
s3transfer                0.10.1
scikit-image              0.22.0
scipy                     1.13.0
setuptools                68.2.2
shapely                   2.0.3
six                       1.16.0
tabulate                  0.9.0
termcolor                 2.4.0
terminaltables            3.1.10
tifffile                  2024.2.12
toml                      0.10.2
tomli                     2.0.1
torch                     1.13.1+cu117
torchaudio                0.13.1+cu117
torchvision               0.14.1+cu117
tqdm                      4.66.2
typing_extensions         4.11.0
ubelt                     1.3.5
urllib3                   2.2.1
wheel                     0.41.2
xdoctest                  1.1.3
yapf                      0.40.2
zipp                      3.18.1

Reproduces the problem - code sample

_base_ = [
    '../_base_/default_runtime.py',
    # '../_base_/datasets/wildreceipt-openset.py',
    # '../_base_/schedules/schedule_adam_60e.py',
    # '_base_sdmgr_novisual.py',
    '_base_sdmgr_unet16.py',
]

node_num_classes = 5  # number of node classes
edge_num_classes = 2  # edge connectivity
key_node_idx = 1
value_node_idx = 2
model = dict(
    backbone=dict(type='UNet', base_channels=16),
    roi_extractor=dict(
        type='mmdet.SingleRoIExtractor',
        roi_layer=dict(type='RoIAlign', output_size=7),
        featmap_strides=[1]),
    data_preprocessor=dict(
        type='ImgDataPreprocessor',
        mean=[123.675, 116.28, 103.53],
        std=[58.395, 57.12, 57.375],
        bgr_to_rgb=True,
        pad_size_divisor=32),
)

train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='LoadKIEAnnotations'),
    dict(type='Resize', scale=(512, 1024), keep_ratio=True),
    dict(type='PackKIEInputs')
]
test_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='LoadKIEAnnotations'),
    dict(type='Resize', scale=(512, 1024), keep_ratio=True),
    dict(type='PackKIEInputs', meta_keys=('img_path', )),
]

wildreceipt_openset_train = dict(
    type='WildReceiptDataset',
    metainfo=dict(category=[
        dict(id=0, name='relic_entry'),
        dict(id=1, name='relic_entry_main'),
        dict(id=2, name='relic_level'),
        dict(id=3, name='relic_parts'),
        dict(id=3, name='relic_suit')
    ]),
    ann_file='/data/dataset/mmdetection/HonkaiKIELabel/result_w.json',
    pipeline=train_pipeline)

wildreceipt_openset_test = dict(
    type='WildReceiptDataset',
    metainfo=dict(category=[
        dict(id=0, name='relic_entry'),
        dict(id=1, name='relic_entry_main'),
        dict(id=2, name='relic_level'),
        dict(id=3, name='relic_parts'),
        dict(id=3, name='relic_suit')
    ]),
    ann_file='/data/dataset/mmdetection/HonkaiKIELabel/result_w.json',
    test_mode=True,
    pipeline=test_pipeline)

optim_wrapper = dict(
    type='OptimWrapper', optimizer=dict(type='Adam', weight_decay=0.0001))
train_cfg = dict(type='EpochBasedTrainLoop', max_epochs=60, val_interval=1)
val_cfg = dict(type='ValLoop')
test_cfg = dict(type='TestLoop')
# learning rate
param_scheduler = [
    dict(type='MultiStepLR', milestones=[40, 50], end=60),
]

train_dataloader = dict(
    batch_size=4,
    num_workers=1,
    persistent_workers=True,
    sampler=dict(type='DefaultSampler', shuffle=True),
    dataset=wildreceipt_openset_train)
val_dataloader = dict(
    batch_size=1,
    num_workers=1,
    persistent_workers=True,
    sampler=dict(type='DefaultSampler', shuffle=False),
    dataset=wildreceipt_openset_test)
test_dataloader = val_dataloader

# test_evaluator = val_evaluator
auto_scale_lr = dict(base_batch_size=4)

default_hooks = dict(
    timer=dict(type='IterTimerHook'),
    logger=dict(type='LoggerHook', interval=100),
    param_scheduler=dict(type='ParamSchedulerHook'),
    checkpoint=dict(type='CheckpointHook', interval=5),
    # checkpoint=dict(interval=5, max_keep_ckpts=3, save_best='auto', type='CheckpointHook'),
    sampler_seed=dict(type='DistSamplerSeedHook'),
    sync_buffer=dict(type='SyncBuffersHook'),
    visualization=dict(
        type='VisualizationHook',
        interval=5,
        enable=True,
        show=False,
        draw_gt=True,
        draw_pred=True,
    ),
)

clear_ml_init = dict(
    reuse_last_task_id=False,
    continue_last_task=False,
)
vis_backends = [
    dict(type="ClearMLVisBackend", init_kwargs=clear_ml_init)
]

visualizer = dict(
    type='KIELocalVisualizer',
    # type='TextSpottingLocalVisualizer',
    name='visualizer',
    vis_backends=vis_backends,
    is_openset=True,
    save_dir="/mnt/project/mmocr/work_dirs"
)
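
Both `metainfo` category lists in the config above assign `id=3` to two entries (`relic_parts` and `relic_suit`). Whether this has anything to do with the reported error is not established here. The sketch below only shows one way to spot such repeats before training; `duplicate_ids` is a hypothetical helper name, not part of MMOCR.

```python
from collections import Counter

# Category list copied verbatim from the config above.
category = [
    dict(id=0, name='relic_entry'),
    dict(id=1, name='relic_entry_main'),
    dict(id=2, name='relic_level'),
    dict(id=3, name='relic_parts'),
    dict(id=3, name='relic_suit'),
]


def duplicate_ids(categories):
    """Return a sorted list of ids that occur more than once (hypothetical helper)."""
    counts = Counter(item['id'] for item in categories)
    return sorted(i for i, n in counts.items() if n > 1)


print(duplicate_ids(category))  # prints [3]
```

An empty list would mean every category has its own id.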

Reproduces the problem - command or script

CUDA_VISIBLE_DEVICES=0 python tools/train.py configs/kie/sdmgr/sdmgr_novisual_60e_wildreceipt-openset_honkai.py --amp

Reproduces the problem - error message

Traceback (most recent call last):
  File "/mnt/project/mmocr/tools/train.py", line 114, in <module>
    main()
  File "/mnt/project/mmocr/tools/train.py", line 110, in main
    runner.train()
  File "/data/conda/envs/mmocr/lib/python3.10/site-packages/mmengine/runner/runner.py", line 1777, in train
    model = self.train_loop.run()  # type: ignore
  File "/data/conda/envs/mmocr/lib/python3.10/site-packages/mmengine/runner/loops.py", line 102, in run
    self.runner.val_loop.run()
  File "/data/conda/envs/mmocr/lib/python3.10/site-packages/mmengine/runner/loops.py", line 371, in run
    self.run_iter(idx, data_batch)
  File "/data/conda/envs/mmocr/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/data/conda/envs/mmocr/lib/python3.10/site-packages/mmengine/runner/loops.py", line 394, in run_iter
    self.runner.call_hook(
  File "/data/conda/envs/mmocr/lib/python3.10/site-packages/mmengine/runner/runner.py", line 1841, in call_hook
    raise TypeError(f'{e} in {hook}') from None
TypeError: can't convert cuda:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first. in <mmocr.engine.hooks.visualization_hook.VisualizationHook object at 0x7f01f55f6080>
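
The final message is PyTorch's: `Tensor.numpy()` only works on tensors in host memory, so a tensor on `cuda:0` has to be copied with `.cpu()` first. Below is a minimal sketch of that conversion pattern, not the actual MMOCR fix. `to_host_numpy` and `StandInTensor` are hypothetical names; the stand-in class only mimics the device rule so the sketch runs without PyTorch or a GPU. On a real `torch.Tensor` the same `detach()`, `cpu()` and `numpy()` calls apply.

```python
class StandInTensor:
    """Hypothetical stand-in for torch.Tensor; mimics only the device rule."""

    def __init__(self, values, device='cpu'):
        self.values = list(values)
        self.device = device

    def detach(self):
        return self  # this stand-in has no autograd graph to drop

    def cpu(self):
        return StandInTensor(self.values, device='cpu')

    def numpy(self):
        if self.device != 'cpu':
            raise TypeError(
                f"can't convert {self.device} device type tensor to numpy.")
        return self.values  # a real tensor would return a numpy.ndarray


def to_host_numpy(value):
    """Convert tensor-like values via detach().cpu().numpy(); pass others through."""
    if all(hasattr(value, name) for name in ('detach', 'cpu', 'numpy')):
        return value.detach().cpu().numpy()
    return value


gpu_tensor = StandInTensor([0.1, 0.9], device='cuda:0')
try:
    gpu_tensor.numpy()  # direct conversion fails, as in the traceback
except TypeError as err:
    print(err)
print(to_host_numpy(gpu_tensor))  # succeeds after the copy to host memory
```

Where such a conversion would belong inside `KIELocalVisualizer` or `VisualizationHook` is not shown by the traceback, because `call_hook` re-raises the error with `from None` and hides the original frame.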

Additional information

No response