open-mmlab / mmpose

OpenMMLab Pose Estimation Toolbox and Benchmark.
https://mmpose.readthedocs.io/en/latest/
Apache License 2.0

[Bug] KeyError: 'inputs' when training a bottom-up model #2663

Open seon-creator opened 1 year ago

seon-creator commented 1 year ago

Prerequisite

Environment

I want to compare keypoint inference accuracy between bottom-up and top-down methods. The bottom-up model config I found is under: mmpose/configs/body/2d_kpt_sview_rgb_img/associative_embedding

I'm using the latest version of mmpose.

# Name                    Version                   Build  Channel
_libgcc_mutex             0.1                        main  
_openmp_mutex             5.1                       1_gnu  
addict                    2.4.0                    pypi_0    pypi
albumentations            1.3.1                    pypi_0    pypi
attrs                     23.1.0                   pypi_0    pypi
ca-certificates           2023.05.30           h06a4308_0  
certifi                   2022.12.7                pypi_0    pypi
charset-normalizer        2.1.1                    pypi_0    pypi
chumpy                    0.70                     pypi_0    pypi
click                     8.1.4                    pypi_0    pypi
cmake                     3.25.0                   pypi_0    pypi
colorama                  0.4.6                    pypi_0    pypi
contourpy                 1.1.0                    pypi_0    pypi
coverage                  7.2.7                    pypi_0    pypi
cycler                    0.11.0                   pypi_0    pypi
cython                    0.29.36                  pypi_0    pypi
einops                    0.6.1                    pypi_0    pypi
exceptiongroup            1.1.2                    pypi_0    pypi
filelock                  3.9.0                    pypi_0    pypi
flake8                    6.0.0                    pypi_0    pypi
fonttools                 4.40.0                   pypi_0    pypi
idna                      3.4                      pypi_0    pypi
imageio                   2.31.1                   pypi_0    pypi
importlib-metadata        6.7.0                    pypi_0    pypi
importlib-resources       5.12.0                   pypi_0    pypi
iniconfig                 2.0.0                    pypi_0    pypi
interrogate               1.5.0                    pypi_0    pypi
isort                     4.3.21                   pypi_0    pypi
jinja2                    3.1.2                    pypi_0    pypi
joblib                    1.3.2                    pypi_0    pypi
json-tricks               3.17.1                   pypi_0    pypi
kiwisolver                1.4.4                    pypi_0    pypi
lazy-loader               0.3                      pypi_0    pypi
ld_impl_linux-64          2.38                 h1181459_1  
libffi                    3.4.4                h6a678d5_0  
libgcc-ng                 11.2.0               h1234567_1  
libgomp                   11.2.0               h1234567_1  
libstdcxx-ng              11.2.0               h1234567_1  
lit                       15.0.7                   pypi_0    pypi
markdown                  3.4.3                    pypi_0    pypi
markdown-it-py            3.0.0                    pypi_0    pypi
markupsafe                2.1.2                    pypi_0    pypi
mat4py                    0.5.0                    pypi_0    pypi
matplotlib                3.7.2                    pypi_0    pypi
mccabe                    0.7.0                    pypi_0    pypi
mdurl                     0.1.2                    pypi_0    pypi
mmcls                     1.0.0rc6                 pypi_0    pypi
mmcv                      2.0.1                    pypi_0    pypi
mmdet                     3.1.0                    pypi_0    pypi
mmengine                  0.8.4                    pypi_0    pypi
mmpose                    1.1.0                     dev_0    <develop>
mmpretrain                1.0.1                    pypi_0    pypi
model-index               0.1.11                   pypi_0    pypi
modelindex                0.0.2                    pypi_0    pypi
mpmath                    1.2.1                    pypi_0    pypi
munkres                   1.1.4                    pypi_0    pypi
ncurses                   6.4                  h6a678d5_0  
networkx                  3.0                      pypi_0    pypi
numpy                     1.24.1                   pypi_0    pypi
opencv-python             4.8.0.74                 pypi_0    pypi
opencv-python-headless    4.8.0.76                 pypi_0    pypi
opendatalab               0.0.9                    pypi_0    pypi
openmim                   0.3.9                    pypi_0    pypi
openssl                   3.0.9                h7f8727e_0  
ordered-set               4.1.0                    pypi_0    pypi
packaging                 23.1                     pypi_0    pypi
pandas                    2.0.3                    pypi_0    pypi
parameterized             0.9.0                    pypi_0    pypi
pillow                    9.3.0                    pypi_0    pypi
pip                       23.1.2           py39h06a4308_0  
platformdirs              3.8.1                    pypi_0    pypi
pluggy                    1.2.0                    pypi_0    pypi
py                        1.11.0                   pypi_0    pypi
pycocotools               2.0.6                    pypi_0    pypi
pycodestyle               2.10.0                   pypi_0    pypi
pycryptodome              3.18.0                   pypi_0    pypi
pyflakes                  3.0.1                    pypi_0    pypi
pygments                  2.15.1                   pypi_0    pypi
pyparsing                 3.0.9                    pypi_0    pypi
pytest                    7.4.0                    pypi_0    pypi
pytest-runner             6.0.0                    pypi_0    pypi
python                    3.9.17               h955ad1f_0  
python-dateutil           2.8.2                    pypi_0    pypi
pytz                      2023.3                   pypi_0    pypi
pywavelets                1.4.1                    pypi_0    pypi
pyyaml                    6.0                      pypi_0    pypi
qudida                    0.0.4                    pypi_0    pypi
readline                  8.2                  h5eee18b_0  
requests                  2.28.1                   pypi_0    pypi
rich                      13.4.2                   pypi_0    pypi
scikit-image              0.21.0                   pypi_0    pypi
scikit-learn              1.3.0                    pypi_0    pypi
scipy                     1.11.1                   pypi_0    pypi
seaborn                   0.12.2                   pypi_0    pypi
setuptools                67.8.0           py39h06a4308_0  
shapely                   2.0.1                    pypi_0    pypi
six                       1.16.0                   pypi_0    pypi
sqlite                    3.41.2               h5eee18b_0  
sympy                     1.11.1                   pypi_0    pypi
tabulate                  0.9.0                    pypi_0    pypi
termcolor                 2.3.0                    pypi_0    pypi
terminaltables            3.1.10                   pypi_0    pypi
threadpoolctl             3.2.0                    pypi_0    pypi
tifffile                  2023.8.12                pypi_0    pypi
tk                        8.6.12               h1ccaba5_0  
toml                      0.10.2                   pypi_0    pypi
tomli                     2.0.1                    pypi_0    pypi
torch                     2.0.1+cu118              pypi_0    pypi
torchaudio                2.0.2+cu118              pypi_0    pypi
torchvision               0.15.2+cu118             pypi_0    pypi
tqdm                      4.65.0                   pypi_0    pypi
triton                    2.0.0                    pypi_0    pypi
typing-extensions         4.4.0                    pypi_0    pypi
tzdata                    2023.3                   pypi_0    pypi
urllib3                   1.26.13                  pypi_0    pypi
wheel                     0.38.4           py39h06a4308_0  
xdoctest                  1.1.1                    pypi_0    pypi
xtcocotools               1.13                     pypi_0    pypi
xz                        5.4.2                h5eee18b_0  
yapf                      0.40.1                   pypi_0    pypi
zipp                      3.15.0                   pypi_0    pypi
zlib                      1.2.13               h5eee18b_0  

Reproduces the problem - code sample

_base_ = ['../../../_base_/default_runtime.py']

# runtime
train_cfg = dict(max_epochs=300, val_interval=10)

# optimizer
optim_wrapper = dict(optimizer=dict(
    type='Adam',
    lr=1.5e-3,
))

# learning policy
param_scheduler = [
    dict(
        type='LinearLR', begin=0, end=500, start_factor=0.001,
        by_epoch=False),  # warm-up
    dict(
        type='MultiStepLR',
        begin=0,
        end=300,
        milestones=[200, 260],
        gamma=0.1,
        by_epoch=True)
]

# automatically scaling LR based on the actual training batch size
auto_scale_lr = dict(base_batch_size=192)

# hooks
default_hooks = dict(
    checkpoint=dict(save_best='coco/AP', rule='greater', interval=50))

# codec settings
codec = dict(
    type='AssociativeEmbedding',
    input_size=(512, 512),
    heatmap_size=(128, 128),
    sigma=2,
    decode_keypoint_order=[ 
        0, 1, 2, 3, 4 # edit
    ],
    decode_max_instances=30)

# model settings
model = dict(
    type='BottomupPoseEstimator',
    data_preprocessor=dict(
        type='PoseDataPreprocessor',
        mean=[123.675, 116.28, 103.53],
        std=[58.395, 57.12, 57.375],
        bgr_to_rgb=True),
    backbone=dict(
        type='HRNet',
        in_channels=3,
        extra=dict(
            stage1=dict(
                num_modules=1,
                num_branches=1,
                block='BOTTLENECK',
                num_blocks=(4, ),
                num_channels=(64, )),
            stage2=dict(
                num_modules=1,
                num_branches=2,
                block='BASIC',
                num_blocks=(4, 4),
                num_channels=(32, 64)),
            stage3=dict(
                num_modules=4,
                num_branches=3,
                block='BASIC',
                num_blocks=(4, 4, 4),
                num_channels=(32, 64, 128)),
            stage4=dict(
                num_modules=3,
                num_branches=4,
                block='BASIC',
                num_blocks=(4, 4, 4, 4),
                num_channels=(32, 64, 128, 256))),
        init_cfg=dict(
            type='Pretrained',
            checkpoint='/data/home/seondeok/mmpose/configs/body_2d_keypoint/associative_embedding/custom_model/td-hm_hrnet-w32_8xb64-210e_coco-256x192-81c58e40_20220909.pth'),
    ),
    head=dict(
        type='AssociativeEmbeddingHead',
        in_channels=32,
        num_keypoints=5, # edit
        tag_dim=1,
        tag_per_keypoint=True,
        deconv_out_channels=None,
        keypoint_loss=dict(type='KeypointMSELoss', use_target_weight=True),
        tag_loss=dict(type='AssociativeEmbeddingLoss', loss_weight=0.001),
        # The heatmap will be resized to the input size before decoding
        # if ``restore_heatmap_size==True``
        decoder=dict(codec, heatmap_size=codec['input_size'])),
    test_cfg=dict(
        multiscale_test=False,
        flip_test=True,
        shift_heatmap=True,
        restore_heatmap_size=True,
        align_corners=False))

# base dataset settings
dataset_type = 'CocoArm'
data_mode = 'bottomup'
data_root = 'data/coco/'

# pipelines
train_pipeline = []
val_pipeline = [
    dict(type='LoadImage'),
    dict(
        type='BottomupResize',
        input_size=codec['input_size'],
        size_factor=32,
        resize_mode='expand'),
    dict(type='PackPoseInputs')
]

# data loaders
train_dataloader = dict(
    batch_size=8,
    num_workers=2,
    persistent_workers=True,
    sampler=dict(type='DefaultSampler', shuffle=True),
    dataset=dict(
        type=dataset_type,
        data_root=data_root,
        data_mode=data_mode,
        ann_file='/data/home/seondeok/mmpose/datasets/DS_11_89/DS_arm_1_80.json',    # edit
        data_prefix=dict(img='/data/home/seondeok/mmpose/datasets/DS_11_89/'),     # edit
        pipeline=train_pipeline,
    ))
val_dataloader = dict(
    batch_size=1,
    num_workers=2,
    persistent_workers=True,
    drop_last=False,
    sampler=dict(type='DefaultSampler', shuffle=False, round_up=False),
    dataset=dict(
        type=dataset_type,
        data_root=data_root,
        data_mode=data_mode,
        ann_file='/data/home/seondeok/mmpose/datasets/DS_1_10_90_100/DS_arm_81_100.json',     # edit
        data_prefix=dict(img='/data/home/seondeok/mmpose/datasets/DS_1_10_90_100/'),      # edit
        test_mode=True,
        pipeline=val_pipeline,
    ))
test_dataloader = val_dataloader

# evaluators
val_evaluator = dict(
    type='CocoMetric',
    ann_file='/data/home/seondeok/mmpose/datasets/DS_1_10_90_100/DS_arm_81_100.json',   # edit
    nms_mode='none',
    score_mode='keypoint',
)
test_evaluator = val_evaluator

Reproduces the problem - command or script

Run from the mmpose directory:

python tools/train.py configs/body_2d_keypoint/associative_embedding/custom_model/ae_hrnet-w32_8xb24-300e_coco-512x512.py

Reproduces the problem - error message

Traceback (most recent call last):
  File "/data/home/seondeok/mmpose/tools/train.py", line 161, in <module>
    main()
  File "/data/home/seondeok/mmpose/tools/train.py", line 157, in main
    runner.train()
  File "/data/home/seondeok/.conda/envs/mmp/lib/python3.9/site-packages/mmengine/runner/runner.py", line 1745, in train
    model = self.train_loop.run()  # type: ignore
  File "/data/home/seondeok/.conda/envs/mmp/lib/python3.9/site-packages/mmengine/runner/loops.py", line 96, in run
    self.run_epoch()
  File "/data/home/seondeok/.conda/envs/mmp/lib/python3.9/site-packages/mmengine/runner/loops.py", line 112, in run_epoch
    self.run_iter(idx, data_batch)
  File "/data/home/seondeok/.conda/envs/mmp/lib/python3.9/site-packages/mmengine/runner/loops.py", line 128, in run_iter
    outputs = self.runner.model.train_step(
  File "/data/home/seondeok/.conda/envs/mmp/lib/python3.9/site-packages/mmengine/model/base_model/base_model.py", line 113, in train_step
    data = self.data_preprocessor(data, True)
  File "/data/home/seondeok/.conda/envs/mmp/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/data/home/seondeok/.conda/envs/mmp/lib/python3.9/site-packages/mmengine/model/base_model/data_preprocessor.py", line 247, in forward
    _batch_inputs = data['inputs']
KeyError: 'inputs'
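
A likely cause, given the config above: `train_pipeline = []` is empty, so training samples never pass through `PackPoseInputs` (the last step of `val_pipeline`), and the batch that reaches the data preprocessor has no `inputs` key. The sketch below is a plain-Python check that needs no MMPose import. The helper name `pipeline_packs_inputs` is hypothetical, and it assumes the packing transform is the `PackPoseInputs` type used in `val_pipeline`.

```python
def pipeline_packs_inputs(pipeline, pack_type='PackPoseInputs'):
    """Return True if the last transform in a pipeline config is the packing step.

    `pipeline` is a list of transform dicts as written in an MMPose config.
    This helper is hypothetical and not part of MMPose.
    """
    return bool(pipeline) and pipeline[-1].get('type') == pack_type


# Pipelines copied from the config in this issue.
train_pipeline = []
val_pipeline = [
    dict(type='LoadImage'),
    dict(type='BottomupResize', input_size=(512, 512), size_factor=32,
         resize_mode='expand'),
    dict(type='PackPoseInputs'),
]

for name, pipeline in [('train', train_pipeline), ('val', val_pipeline)]:
    ok = pipeline_packs_inputs(pipeline)
    status = 'ends with PackPoseInputs' if ok else "would not produce 'inputs'"
    print(f'{name}_pipeline: {status}')
```

If this check fails for the training pipeline, the config has no complete training pipeline for that method, which matches the KeyError raised in `train_step`.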

Additional information

  1. Before the mmpose update, this custom data worked well with both top-down and bottom-up models. (I think that version may not have used mmengine.)
  2. I'm using my own custom COCO-format data, like this:
{
    "images": [
        {
            "id": 12351,
            "dataset_id": 27,
            "category_ids": [],
            "path": "/datasets/DS_arm_81_100/101_1_0.png",
            "width": 1488,
            "height": 837,
            "file_name": "101_1_0.png",
            "annotated": false,
            "annotating": [],
            "num_annotations": 0,
            "metadata": {},
            "deleted": false,
            "milliseconds": 0,
            "events": [],
            "regenerate_thumbnail": false
        }
    ],
    "categories": [
        {
            "id": 9,
            "name": "armpoint",
            "supercategory": "arm",
            "color": "#3bcb02",
            "metadata": {},
            "keypoint_colors": [
                "#bf5c4d",
                "#d99100",
                "#4d8068",
                "#0d2b80",
                "#9c73bf"
            ],
            "keypoints": [
                "li11",
                "li10",
                "te5",
                "li4",
                "te3"
            ],
            "skeleton": []
        }
    ],
    "annotations": [
        {
            "id": 11524,
            "image_id": 12351,
            "category_id": 9,
            "segmentation": [
                [
                    1254.9,
                    21.7,
                    1254.9,
                    608.8,
                    157.9,
                    608.8,
                    157.9,
                    21.7
                ]
            ],
            "area": 643939,
            "bbox": [
                158,
                22,
                1097,
                587
            ],
            "iscrowd": false,
            "isbbox": true,
            "color": "#20a8cf",
            "keypoints": [
                268,
                172,
                2,
                369,
                195,
                2,
                753,
                333,
                2,
                994,
                314,
                2,
                933,
                447,
                2
            ],
            "metadata": {},
            "num_keypoints": 5
        },
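
For reference, the flat `keypoints` list in a COCO-style annotation stores one `(x, y, visibility)` triplet per name in the category's `keypoints` list. A minimal sketch that regroups the values from the annotation above; the helper name `group_keypoints` is hypothetical.

```python
def group_keypoints(names, flat):
    """Pair each keypoint name with its (x, y, visibility) triplet."""
    if len(flat) != 3 * len(names):
        raise ValueError(
            f'expected {3 * len(names)} values, got {len(flat)}')
    triplets = [tuple(flat[i:i + 3]) for i in range(0, len(flat), 3)]
    return dict(zip(names, triplets))


# Values copied from the category and annotation shown in this issue.
names = ['li11', 'li10', 'te5', 'li4', 'te3']
flat = [268, 172, 2, 369, 195, 2, 753, 333, 2, 994, 314, 2, 933, 447, 2]

for name, (x, y, v) in group_keypoints(names, flat).items():
    print(f'{name}: x={x}, y={y}, visibility={v}')
```

In the COCO convention a visibility of 2 means the keypoint is labeled and visible, so all five keypoints in this annotation are usable for training.
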
Ben-Louis commented 1 year ago

Thanks for using MMPose. Associative Embedding is still under migration. In the meantime, you could try another bottom-up method such as DEKR.

seon-creator commented 1 year ago

Thank you, I'm trying DEKR now.

I tried training that model several times, but the following error occurs. How can I solve it?

Traceback (most recent call last):
  File "tools/train.py", line 161, in <module>
    main()
  File "tools/train.py", line 157, in main
    runner.train()
  File "/data/home/seondeok/.conda/envs/btmmpose/lib/python3.8/site-packages/mmengine/runner/runner.py", line 1745, in train
    model = self.train_loop.run()  # type: ignore
  File "/data/home/seondeok/.conda/envs/btmmpose/lib/python3.8/site-packages/mmengine/runner/loops.py", line 96, in run
    self.run_epoch()
  File "/data/home/seondeok/.conda/envs/btmmpose/lib/python3.8/site-packages/mmengine/runner/loops.py", line 111, in run_epoch
    for idx, data_batch in enumerate(self.dataloader):
  File "/data/home/seondeok/.conda/envs/btmmpose/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 633, in __next__
    data = self._next_data()
  File "/data/home/seondeok/.conda/envs/btmmpose/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1345, in _next_data
    return self._process_data(data)
  File "/data/home/seondeok/.conda/envs/btmmpose/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1371, in _process_data
    data.reraise()
  File "/data/home/seondeok/.conda/envs/btmmpose/lib/python3.8/site-packages/torch/_utils.py", line 644, in reraise
    raise exception
IndexError: Caught IndexError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/data/home/seondeok/.conda/envs/btmmpose/lib/python3.8/site-packages/torch/utils/data/_utils/worker.py", line 308, in _worker_loop
    data = fetcher.fetch(index)
  File "/data/home/seondeok/.conda/envs/btmmpose/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 51, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/data/home/seondeok/.conda/envs/btmmpose/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 51, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/data/home/seondeok/.conda/envs/btmmpose/lib/python3.8/site-packages/mmengine/dataset/base_dataset.py", line 408, in __getitem__
    data = self.prepare_data(idx)
  File "/data/home/seondeok/.conda/envs/btmmpose/lib/python3.8/site-packages/mmengine/dataset/base_dataset.py", line 113, in wrapper
    return old_func(obj, *args, **kwargs)
  File "/data/home/seondeok/mmpose/mmpose/datasets/datasets/base/base_coco_style_dataset.py", line 150, in prepare_data
    return self.pipeline(data_info)
  File "/data/home/seondeok/.conda/envs/btmmpose/lib/python3.8/site-packages/mmengine/dataset/base_dataset.py", line 58, in __call__
    data = t(data)
  File "/data/home/seondeok/.conda/envs/btmmpose/lib/python3.8/site-packages/mmcv/transforms/base.py", line 12, in __call__
    return self.transform(results)
  File "/data/home/seondeok/mmpose/mmpose/datasets/transforms/bottomup_transforms.py", line 89, in transform
    mask = 1 - self._segs_to_mask(invalid_segs, img_shape)
  File "/data/home/seondeok/mmpose/mmpose/datasets/transforms/bottomup_transforms.py", line 53, in _segs_to_mask
    rle = cocomask.frPyObjects(seg, img_shape[0], img_shape[1])
  File "xtcocotools/_mask.pyx", line 292, in xtcocotools._mask.frPyObjects
IndexError: list index out of range

The error occurs while training is already running:

08/30 18:14:43 - mmengine - WARNING - "FileClient" will be deprecated in future. Please use io functions in https://mmengine.readthedocs.io/en/latest/api/fileio.html#file-io
08/30 18:14:43 - mmengine - WARNING - "HardDiskBackend" is the alias of "LocalBackend" and the former will be deprecated in future.
08/30 18:14:43 - mmengine - INFO - Checkpoints will be saved to /data/home/seondeok/mmpose/work_dirs/Bottom_up/DEKR_hrnet_w32.
08/30 18:15:07 - mmengine - INFO - Epoch(train)   [1][ 50/106]  lr: 9.909820e-05  eta: 2:48:52  time: 0.479100  data_time: 0.027063  memory: 7743  loss: 0.001327  loss/heatmap: 0.000875  loss/displacement: 0.000452
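
The innermost frames of the traceback above show `cocomask.frPyObjects` failing on a segmentation passed in from `_segs_to_mask`. One possible cause, not confirmed in this thread, is an annotation whose `segmentation` list is empty or contains an empty polygon. The sketch below scans a COCO-style annotation dict for such entries; the helper name `find_empty_segmentations` and the sample data are illustrative assumptions.

```python
def find_empty_segmentations(coco):
    """Return ids of annotations whose polygon segmentation is missing or empty.

    `coco` is a dict loaded from a COCO-style JSON file. RLE segmentations
    (stored as dicts) are skipped. Hypothetical helper, not part of MMPose.
    """
    bad_ids = []
    for ann in coco.get('annotations', []):
        seg = ann.get('segmentation')
        if isinstance(seg, dict):
            continue  # RLE mask, not a polygon list
        if not seg or any(len(poly) == 0 for poly in seg):
            bad_ids.append(ann['id'])
    return bad_ids


# Made-up sample: annotation 2 has no polygon, annotation 3 has an empty one.
sample = {
    'annotations': [
        {'id': 1, 'segmentation': [[1254.9, 21.7, 1254.9, 608.8,
                                    157.9, 608.8, 157.9, 21.7]]},
        {'id': 2, 'segmentation': []},
        {'id': 3, 'segmentation': [[]]},
    ]
}
print(find_empty_segmentations(sample))
```

To check a real file, load it with `json.load` and pass the resulting dict. Any ids it reports are candidates for fixing or removing before training.
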
Ben-Louis commented 1 year ago

Apologies for the delayed response. Has the problem been resolved?

seon-creator commented 1 year ago

Thank you for checking.

I can now train the DEKR model, but evaluation has a problem: all of the AP values are zero.

Loading and preparing results...
DONE (t=0.01s)
creating index...
index created!
Running per image evaluation...
Evaluate annotation type keypoints
DONE (t=0.02s).
Accumulating evaluation results...
DONE (t=0.00s).
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets= 20 ] = 0.000
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets= 20 ] = 0.000
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets= 20 ] = 0.000
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets= 20 ] = -1.000
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets= 20 ] = 0.000
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 20 ] = 0.000
Average Recall (AR) @[ IoU=0.50 | area= all | maxDets= 20 ] = 0.000
Average Recall (AR) @[ IoU=0.75 | area= all | maxDets= 20 ] = 0.000
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets= 20 ] = -1.000
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets= 20 ] = 0.000