ViTAE-Transformer / ViTPose

The official repo for [NeurIPS'22] "ViTPose: Simple Vision Transformer Baselines for Human Pose Estimation" and [TPAMI'23] "ViTPose++: Vision Transformer for Generic Body Pose Estimation"
Apache License 2.0

Assertion Error During Testing #130

Open dianchia opened 7 months ago

dianchia commented 7 months ago

Did you check docs and existing issues?

Version Information

>>> python -V
Python 3.7.12
mmpose 0.24.0
mmcv-full 1.3.9
Click for full version info

```bash
addict                   2.4.0
certifi                  2023.11.17
charset-normalizer       3.3.2
chumpy                   0.70
cycler                   0.11.0
Cython                   3.0.8
einops                   0.6.1
fonttools                4.38.0
idna                     3.6
importlib-metadata       6.7.0
json-tricks              3.17.3
kiwisolver               1.4.5
matplotlib               3.5.3
mmcv-full                1.3.9        $HOME/projects/pose_estimation/ViTPose/mmcv
mmpose                   0.24.0       $HOME/projects/pose_estimation/ViTPose/ViTPose
munkres                  1.1.4
numpy                    1.21.6
nvidia-cublas-cu11       11.10.3.66
nvidia-cuda-nvrtc-cu11   11.7.99
nvidia-cuda-runtime-cu11 11.7.99
nvidia-cudnn-cu11        8.5.0.96
opencv-python            4.9.0.80
packaging                23.2
Pillow                   9.5.0
pip                      23.3.2
platformdirs             4.0.0
pyparsing                3.1.1
python-dateutil          2.8.2
PyYAML                   6.0.1
requests                 2.31.0
scipy                    1.7.3
setuptools               69.0.3
six                      1.16.0
timm                     0.4.9
tomli                    2.0.1
torch                    1.13.1
torchvision              0.14.1
typing_extensions        4.7.1
urllib3                  2.0.7
wheel                    0.42.0
xtcocotools              1.14.3
yapf                     0.40.2
zipp                     3.15.0
```

Operating System

Ubuntu

Describe the bug

An AssertionError was raised when testing with the script tools/dist_test.sh. A shortened version of the error is included below.

File "tools/test.py", line 184, in <module>
    main()
  File "tools/test.py", line 167, in main
    args.gpu_collect)
  File "$HOME/projects/pose_estimation/ViTPose/ViTPose/mmpose/apis/test.py", line 70, in multi_gpu_test
    result = model(return_loss=False, **data)
  File "$HOME/miniforge3/envs/vitpose/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "$HOME/miniforge3/envs/vitpose/lib/python3.7/site-packages/torch/nn/parallel/distributed.py", line 1040, in forward
    output = self._run_ddp_forward(*inputs, **kwargs)
  File "$HOME/miniforge3/envs/vitpose/lib/python3.7/site-packages/torch/nn/parallel/distributed.py", line 1000, in _run_ddp_forward
    return module_to_run(*inputs[0], **kwargs[0])
  File "$HOME/miniforge3/envs/vitpose/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "$HOME/projects/pose_estimation/ViTPose/mmcv/mmcv/runner/fp16_utils.py", line 98, in new_func
    return old_func(*args, **kwargs)
  File "$HOME/projects/pose_estimation/ViTPose/ViTPose/mmpose/models/detectors/top_down.py", line 141, in forward
    img, img_metas, return_heatmap=return_heatmap, **kwargs)
  File "$HOME/projects/pose_estimation/ViTPose/ViTPose/mmpose/models/detectors/top_down.py", line 165, in forward_test
    assert img.size(0) == len(img_metas)
AssertionError
Click for full error message ```bash $HOME/miniforge3/envs/vitpose/lib/python3.7/site-packages/torch/distributed/launch.py:188: FutureWarning: The module torch.distributed.launch is deprecated and will be removed in future. Use torchrun. Note that --use_env is set by default in torchrun. If your script expects `--local_rank` argument to be set, please change it to read from `os.environ['LOCAL_RANK']` instead. See https://pytorch.org/docs/stable/distributed.html#launch-utility for further instructions FutureWarning, apex is not installed apex is not installed apex is not installed $HOME/projects/pose_estimation/ViTPose/mmcv/mmcv/cnn/bricks/transformer.py:27: UserWarning: Fail to import ``MultiScaleDeformableAttention`` from ``mmcv.ops.multi_scale_deform_attn``, You should install ``mmcv-full`` if you need this module. warnings.warn('Fail to import ``MultiScaleDeformableAttention`` from ' $HOME/projects/pose_estimation/ViTPose/ViTPose/mmpose/utils/setup_env.py:33: UserWarning: Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed. f'Setting OMP_NUM_THREADS environment variable for each process ' $HOME/projects/pose_estimation/ViTPose/ViTPose/mmpose/utils/setup_env.py:43: UserWarning: Setting MKL_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed. f'Setting MKL_NUM_THREADS environment variable for each process ' loading annotations into memory... Done (t=1.00s) creating index... index created! => Total boxes: 104125 => Total boxes after filter low score@0.0: 104125 => num_images: 5000 => load 104125 samples Use load_from_local loader The model and loaded state dict do not match exactly unexpected key in source state_dict: backbone.blocks.0.mlp.experts.0.weight, backbone.blocks.0.mlp.experts.0.bias, backbone.blocks.0.mlp.experts.1.weight, backbone.blocks.0.mlp.experts.1.bias, backbone.blocks.0.mlp.experts.2.weight, backbone.blocks.0.mlp.experts.2.bias, backbone.blocks.0.mlp.experts.3.weight, backbone.blocks.0.mlp.experts.3.bias, backbone.blocks.0.mlp.experts.4.weight, backbone.blocks.0.mlp.experts.4.bias, backbone.blocks.0.mlp.experts.5.weight, backbone.blocks.0.mlp.experts.5.bias, backbone.blocks.1.mlp.experts.0.weight, backbone.blocks.1.mlp.experts.0.bias, backbone.blocks.1.mlp.experts.1.weight, backbone.blocks.1.mlp.experts.1.bias, backbone.blocks.1.mlp.experts.2.weight, backbone.blocks.1.mlp.experts.2.bias, backbone.blocks.1.mlp.experts.3.weight, backbone.blocks.1.mlp.experts.3.bias, backbone.blocks.1.mlp.experts.4.weight, backbone.blocks.1.mlp.experts.4.bias, backbone.blocks.1.mlp.experts.5.weight, backbone.blocks.1.mlp.experts.5.bias, backbone.blocks.2.mlp.experts.0.weight, backbone.blocks.2.mlp.experts.0.bias, backbone.blocks.2.mlp.experts.1.weight, backbone.blocks.2.mlp.experts.1.bias, backbone.blocks.2.mlp.experts.2.weight, backbone.blocks.2.mlp.experts.2.bias, backbone.blocks.2.mlp.experts.3.weight, backbone.blocks.2.mlp.experts.3.bias, backbone.blocks.2.mlp.experts.4.weight, backbone.blocks.2.mlp.experts.4.bias, backbone.blocks.2.mlp.experts.5.weight, backbone.blocks.2.mlp.experts.5.bias, backbone.blocks.3.mlp.experts.0.weight, backbone.blocks.3.mlp.experts.0.bias, backbone.blocks.3.mlp.experts.1.weight, backbone.blocks.3.mlp.experts.1.bias, backbone.blocks.3.mlp.experts.2.weight, 
backbone.blocks.3.mlp.experts.2.bias, backbone.blocks.3.mlp.experts.3.weight, backbone.blocks.3.mlp.experts.3.bias, backbone.blocks.3.mlp.experts.4.weight, backbone.blocks.3.mlp.experts.4.bias, backbone.blocks.3.mlp.experts.5.weight, backbone.blocks.3.mlp.experts.5.bias, backbone.blocks.4.mlp.experts.0.weight, backbone.blocks.4.mlp.experts.0.bias, backbone.blocks.4.mlp.experts.1.weight, backbone.blocks.4.mlp.experts.1.bias, backbone.blocks.4.mlp.experts.2.weight, backbone.blocks.4.mlp.experts.2.bias, backbone.blocks.4.mlp.experts.3.weight, backbone.blocks.4.mlp.experts.3.bias, backbone.blocks.4.mlp.experts.4.weight, backbone.blocks.4.mlp.experts.4.bias, backbone.blocks.4.mlp.experts.5.weight, backbone.blocks.4.mlp.experts.5.bias, backbone.blocks.5.mlp.experts.0.weight, backbone.blocks.5.mlp.experts.0.bias, backbone.blocks.5.mlp.experts.1.weight, backbone.blocks.5.mlp.experts.1.bias, backbone.blocks.5.mlp.experts.2.weight, backbone.blocks.5.mlp.experts.2.bias, backbone.blocks.5.mlp.experts.3.weight, backbone.blocks.5.mlp.experts.3.bias, backbone.blocks.5.mlp.experts.4.weight, backbone.blocks.5.mlp.experts.4.bias, backbone.blocks.5.mlp.experts.5.weight, backbone.blocks.5.mlp.experts.5.bias, backbone.blocks.6.mlp.experts.0.weight, backbone.blocks.6.mlp.experts.0.bias, backbone.blocks.6.mlp.experts.1.weight, backbone.blocks.6.mlp.experts.1.bias, backbone.blocks.6.mlp.experts.2.weight, backbone.blocks.6.mlp.experts.2.bias, backbone.blocks.6.mlp.experts.3.weight, backbone.blocks.6.mlp.experts.3.bias, backbone.blocks.6.mlp.experts.4.weight, backbone.blocks.6.mlp.experts.4.bias, backbone.blocks.6.mlp.experts.5.weight, backbone.blocks.6.mlp.experts.5.bias, backbone.blocks.7.mlp.experts.0.weight, backbone.blocks.7.mlp.experts.0.bias, backbone.blocks.7.mlp.experts.1.weight, backbone.blocks.7.mlp.experts.1.bias, backbone.blocks.7.mlp.experts.2.weight, backbone.blocks.7.mlp.experts.2.bias, backbone.blocks.7.mlp.experts.3.weight, backbone.blocks.7.mlp.experts.3.bias, backbone.blocks.7.mlp.experts.4.weight, backbone.blocks.7.mlp.experts.4.bias, backbone.blocks.7.mlp.experts.5.weight, backbone.blocks.7.mlp.experts.5.bias, backbone.blocks.8.mlp.experts.0.weight, backbone.blocks.8.mlp.experts.0.bias, backbone.blocks.8.mlp.experts.1.weight, backbone.blocks.8.mlp.experts.1.bias, backbone.blocks.8.mlp.experts.2.weight, backbone.blocks.8.mlp.experts.2.bias, backbone.blocks.8.mlp.experts.3.weight, backbone.blocks.8.mlp.experts.3.bias, backbone.blocks.8.mlp.experts.4.weight, backbone.blocks.8.mlp.experts.4.bias, backbone.blocks.8.mlp.experts.5.weight, backbone.blocks.8.mlp.experts.5.bias, backbone.blocks.9.mlp.experts.0.weight, backbone.blocks.9.mlp.experts.0.bias, backbone.blocks.9.mlp.experts.1.weight, backbone.blocks.9.mlp.experts.1.bias, backbone.blocks.9.mlp.experts.2.weight, backbone.blocks.9.mlp.experts.2.bias, backbone.blocks.9.mlp.experts.3.weight, backbone.blocks.9.mlp.experts.3.bias, backbone.blocks.9.mlp.experts.4.weight, backbone.blocks.9.mlp.experts.4.bias, backbone.blocks.9.mlp.experts.5.weight, backbone.blocks.9.mlp.experts.5.bias, backbone.blocks.10.mlp.experts.0.weight, backbone.blocks.10.mlp.experts.0.bias, backbone.blocks.10.mlp.experts.1.weight, backbone.blocks.10.mlp.experts.1.bias, backbone.blocks.10.mlp.experts.2.weight, backbone.blocks.10.mlp.experts.2.bias, backbone.blocks.10.mlp.experts.3.weight, backbone.blocks.10.mlp.experts.3.bias, backbone.blocks.10.mlp.experts.4.weight, backbone.blocks.10.mlp.experts.4.bias, backbone.blocks.10.mlp.experts.5.weight, 
backbone.blocks.10.mlp.experts.5.bias, backbone.blocks.11.mlp.experts.0.weight, backbone.blocks.11.mlp.experts.0.bias, backbone.blocks.11.mlp.experts.1.weight, backbone.blocks.11.mlp.experts.1.bias, backbone.blocks.11.mlp.experts.2.weight, backbone.blocks.11.mlp.experts.2.bias, backbone.blocks.11.mlp.experts.3.weight, backbone.blocks.11.mlp.experts.3.bias, backbone.blocks.11.mlp.experts.4.weight, backbone.blocks.11.mlp.experts.4.bias, backbone.blocks.11.mlp.experts.5.weight, backbone.blocks.11.mlp.experts.5.bias [ ] 0/104125, elapsed: 0s, ETA:Traceback (most recent call last): File "tools/test.py", line 184, in main() File "tools/test.py", line 167, in main args.gpu_collect) File "$HOME/projects/pose_estimation/ViTPose/ViTPose/mmpose/apis/test.py", line 70, in multi_gpu_test result = model(return_loss=False, **data) File "$HOME/miniforge3/envs/vitpose/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl return forward_call(*input, **kwargs) File "$HOME/miniforge3/envs/vitpose/lib/python3.7/site-packages/torch/nn/parallel/distributed.py", line 1040, in forward output = self._run_ddp_forward(*inputs, **kwargs) File "$HOME/miniforge3/envs/vitpose/lib/python3.7/site-packages/torch/nn/parallel/distributed.py", line 1000, in _run_ddp_forward return module_to_run(*inputs[0], **kwargs[0]) File "$HOME/miniforge3/envs/vitpose/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl return forward_call(*input, **kwargs) File "$HOME/projects/pose_estimation/ViTPose/mmcv/mmcv/runner/fp16_utils.py", line 98, in new_func return old_func(*args, **kwargs) File "$HOME/projects/pose_estimation/ViTPose/ViTPose/mmpose/models/detectors/top_down.py", line 141, in forward img, img_metas, return_heatmap=return_heatmap, **kwargs) File "$HOME/projects/pose_estimation/ViTPose/ViTPose/mmpose/models/detectors/top_down.py", line 165, in forward_test assert img.size(0) == len(img_metas) AssertionError ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 0 (pid: 3740731) of binary: $HOME/miniforge3/envs/vitpose/bin/python Traceback (most recent call last): File "$HOME/miniforge3/envs/vitpose/lib/python3.7/runpy.py", line 193, in _run_module_as_main "__main__", mod_spec) File "$HOME/miniforge3/envs/vitpose/lib/python3.7/runpy.py", line 85, in _run_code exec(code, run_globals) File "$HOME/miniforge3/envs/vitpose/lib/python3.7/site-packages/torch/distributed/launch.py", line 195, in main() File "$HOME/miniforge3/envs/vitpose/lib/python3.7/site-packages/torch/distributed/launch.py", line 191, in main launch(args) File "$HOME/miniforge3/envs/vitpose/lib/python3.7/site-packages/torch/distributed/launch.py", line 176, in launch run(args) File "$HOME/miniforge3/envs/vitpose/lib/python3.7/site-packages/torch/distributed/run.py", line 756, in run )(*cmd_args) File "$HOME/miniforge3/envs/vitpose/lib/python3.7/site-packages/torch/distributed/launcher/api.py", line 132, in __call__ return launch_agent(self._config, self._entrypoint, list(args)) File "$HOME/miniforge3/envs/vitpose/lib/python3.7/site-packages/torch/distributed/launcher/api.py", line 248, in launch_agent failures=result.failures, torch.distributed.elastic.multiprocessing.errors.ChildFailedError: ============================================================ tools/test.py FAILED ------------------------------------------------------------ Failures: ------------------------------------------------------------ Root Cause (first observed failure): [0]: time : 2024-02-12_17:58:43 host : host 
rank : 0 (local_rank: 0) exitcode : 1 (pid: 3740731) error_file: traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html ```
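
For reference, the assertion that fails in mmpose/models/detectors/top_down.py enforces that there is exactly one meta dict per image in the batch. A minimal sketch of that contract (the shapes and file names below are made up for illustration):

```python
import torch

# forward_test expects a per-sample list of meta dicts alongside the image
# batch; the failing assert enforces that one-to-one pairing.
img = torch.zeros(32, 3, 256, 192)  # hypothetical batch of 32 person crops
img_metas = [{"image_file": f"{i:06d}.jpg"} for i in range(32)]

assert img.size(0) == len(img_metas)  # holds: 32 == 32

# If img_metas reaches forward_test in any other form (for example, still
# wrapped by the data loader instead of being a plain per-sample list),
# the check no longer holds and testing aborts as in the traceback above.
```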

Steps to reproduce

  1. Clone the repository with `git clone https://github.com/ViTAE-Transformer/ViTPose.git --depth 1`
  2. Follow the installation instructions in README.md
  3. Download the dataset from the official COCO website, specifically the 2017 Train/Val/Test images.
  4. Put the downloaded images into ./data/coco/ and unzip all of them.
  5. Download the annotation files from here and put them into ./data/coco/annotations/ (a layout sanity check is sketched after this list).
  6. Download any of the wholebody pretrained models.
  7. Start testing with the command `bash tools/dist_test.sh configs/wholebody/2d_kpt_sview_rgb_img/topdown_heatmap/coco-wholebody/ViTPose_base_wholebody_256x192.py pretrained/wholebody.pth 1`
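
A quick way to sanity-check the layout from steps 4 and 5 (directory names follow the usual mmpose convention of data/coco/{train2017,val2017,annotations}; the annotation file name below is only an example, so substitute whatever file your config actually points at):

```python
from pathlib import Path

# Sanity check for the expected dataset layout. The annotation file name is
# an example only; use the one referenced by the config you are testing.
data_root = Path("data/coco")
expected = [
    data_root / "train2017",
    data_root / "val2017",
    data_root / "annotations",
    data_root / "annotations" / "coco_wholebody_val_v1.0.json",  # example name
]

for path in expected:
    status = "OK     " if path.exists() else "MISSING"
    print(f"{status} {path}")
```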

Expected behaviour

Expected the testing to run smoothly without errors.

LancasterLi commented 6 months ago

I corrected this error by replacing every "img_metas" with "img_metas.data[0]" in ./detectors/top_down.py.
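
In code, that workaround looks roughly like the sketch below, assuming img_metas still arrives in forward_test wrapped in an mmcv DataContainer (the helper name is illustrative and not part of mmpose):

```python
from mmcv.parallel import DataContainer

def unwrap_img_metas(img_metas):
    """Return the per-sample list of meta dicts.

    If img_metas was not scattered by MMDistributedDataParallel, it is still
    an mmcv DataContainer; its .data attribute holds one chunk per GPU, so
    .data[0] is the list of meta dicts for this process.
    """
    if isinstance(img_metas, DataContainer):
        return img_metas.data[0]
    return img_metas

# e.g. near the top of TopDown.forward_test in mmpose/models/detectors/top_down.py:
#     img_metas = unwrap_img_metas(img_metas)
#     assert img.size(0) == len(img_metas)
```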