Open RojanAsl opened 7 months ago
I have the same error. Have you solved it yet?
I changed the detection model to "rtmdet_m_8xb32-300e_coco" and now it detects the humans in frame better. But I still get the warning.
Prerequisite
Environment
OrderedDict([('sys.platform', 'win32'), ('Python', '3.8.18 (default, Sep 11 2023, 13:39:12) [MSC v.1916 64 bit (AMD64)]'), ('CUDA available', False), ('numpy_random_seed', 2147483648), ('MSVC', 'Microsoft (R) C/C++ Optimizing Compiler Version 19.34.31942 for x64'), ('GCC', 'n/a'), ('PyTorch', '2.1.0'), ('PyTorch compiling details', 'PyTorch built with:\n - C++ Version: 199711\n - MSVC 192930151\n - Intel(R) Math Kernel Library Version 2020.0.2 Product Build 20200624 for Intel(R) 64 architecture applications\n - Intel(R) MKL-DNN v3.1.1 (Git Hash 64f6bcbcbab628e96f33a62c3e975f8535a7bde4)\n - OpenMP 2019\n - LAPACK is enabled (usually provided by MKL)\n - CPU capability usage: AVX512\n - Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CXX_COMPILER=C:/cb/pytorch_1000000000000/work/tmp_bin/sccache-cl.exe, CXX_FLAGS=/DWIN32 /D_WINDOWS /GR /EHsc /bigobj /FS -DUSE_PTHREADPOOL -DNDEBUG -DUSE_KINETO -DLIBKINETO_NOCUPTI -DLIBKINETO_NOROCTRACER -DUSE_FBGEMM -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE /utf-8 /wd4624 /wd4068 /wd4067 /wd4267 /wd4661 /wd4717 /wd4244 /wd4804 /wd4273, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_DISABLE_GPU_ASSERTS=OFF, TORCH_VERSION=2.1.0, USE_CUDA=0, USE_CUDNN=OFF, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=OFF, USE_NNPACK=OFF, USE_OPENMP=ON, USE_ROCM=OFF, \n'), ('TorchVision', '0.16.0'), ('OpenCV', '4.8.1'), ('MMEngine', '0.9.0'), ('MMPose', '1.2.0+')])
Reproduces the problem - code sample
inferencer = Pose2DInferencer(
    model="td-hm_ViTPose-huge_8xb64-210e_coco-256x192",
    device=my_device,
    det_model="yolov3_d53_320_273e_coco",
    scope="mmpose",
)
Reproduces the problem - command or script
from mmpose.apis.inferencers.pose2d_inferencer import Pose2DInferencer
from mmpose.utils import register_all_modules
import torch
import os

register_all_modules()

# Use the GPU when available, otherwise fall back to the CPU.
my_device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(my_device)

inferencer = Pose2DInferencer(
    model="td-hm_ViTPose-huge_8xb64-210e_coco-256x192",
    device=my_device,
    det_model="yolov3_d53_320_273e_coco",
    scope="mmpose",
)

video_path = os.path.normpath(r".\data\input\VID.avi")
out_dir = r".\data\output"

result_generator = inferencer(
    video_path,
    out_dir=out_dir,
    show=True,
    kpt_thr=0.6,
)

# Consume the generator so that inference runs on every frame.
results = [result for result in result_generator]
# result = next(result_generator)  # alternative: fetch one result; raises StopIteration once the generator is exhausted

print("EOF.")
Reproduces the problem - error message
cpu
Loads checkpoint by http backend from path: https://download.openmmlab.com/mmpose/v1/body_2d_keypoint/topdown_heatmap/coco/td-hm_ViTPose-huge_8xb64-210e_coco-256x192-e32adcd4_20230314.pth
The model and loaded state dict do not match exactly
unexpected key in source state_dict: backbone.cls_token
03/05 18:18:49 - mmengine - WARNING - Failed to search registry with scope "mmpose" in the "function" registry tree. As a workaround, the current "function" registry in "mmengine" is used to build instance. This may cause unexpected failure when running the built modules. Please check whether "mmpose" is a correct scope, or whether the registry is initialized.
Loads checkpoint by http backend from path: https://download.openmmlab.com/mmdetection/v2.0/yolo/yolov3_d53_320_273e_coco/yolov3_d53_320_273e_coco-421362b6.pth
03/05 18:18:51 - mmengine - WARNING - Failed to search registry with scope "mmdet" in the "function" registry tree. As a workaround, the current "function" registry in "mmengine" is used to build instance. This may cause unexpected failure when running the built modules. Please check whether "mmdet" is a correct scope, or whether the registry is initialized.
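To see at a glance which scopes trigger the warning, the log can be scanned with a small script. This is only a rough sketch: the helper name count_registry_warnings and the regular expression are my own, written against the wording of the warning quoted above, and they would need adjusting if the message text differs in another MMEngine version.

```python
import re
from collections import Counter

# Matches the registry-fallback warning as it is worded in the log above.
WARNING_RE = re.compile(
    r'Failed to search registry with scope "(?P<scope>[^"]+)" '
    r'in the "(?P<registry>[^"]+)" registry tree'
)


def count_registry_warnings(log_text):
    """Count occurrences of the warning per (scope, registry) pair.

    Hypothetical helper for illustration; not part of MMEngine or MMPose.
    """
    return Counter(
        (match.group("scope"), match.group("registry"))
        for match in WARNING_RE.finditer(log_text)
    )


# The first sentence of each warning from the log above.
sample_log = (
    '03/05 18:18:49 - mmengine - WARNING - Failed to search registry '
    'with scope "mmpose" in the "function" registry tree.\n'
    '03/05 18:18:51 - mmengine - WARNING - Failed to search registry '
    'with scope "mmdet" in the "function" registry tree.\n'
)

for (scope, registry), n in count_registry_warnings(sample_log).items():
    print(f"scope={scope} registry={registry} count={n}")
```

In the log above the warning appears once for the "mmpose" scope and once for the "mmdet" scope, both for the "function" registry.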
Additional information
The data I am using is AVI video files.
I think this warning is related to the detector not finding the person's bounding box (bbox) correctly: the whole frame is always used as the bbox instead of just the person.
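One way to confirm the whole-frame symptom numerically is to measure how much of the frame each bbox covers. The sketch below assumes the boxes can be read out of the results as (x1, y1, x2, y2) pixel coordinates and that the frame size is known. The helper names and the 0.95 threshold are hypothetical choices of mine, not part of MMPose.

```python
def bbox_frame_coverage(bbox, frame_width, frame_height):
    """Return the fraction of the frame area covered by bbox.

    bbox is assumed to be (x1, y1, x2, y2) in pixels; it is clipped
    to the frame before the area is computed.
    """
    x1, y1, x2, y2 = bbox
    x1 = max(0.0, min(float(x1), frame_width))
    x2 = max(0.0, min(float(x2), frame_width))
    y1 = max(0.0, min(float(y1), frame_height))
    y2 = max(0.0, min(float(y2), frame_height))
    width = max(0.0, x2 - x1)
    height = max(0.0, y2 - y1)
    return (width * height) / (frame_width * frame_height)


def looks_like_full_frame(bbox, frame_width, frame_height, threshold=0.95):
    """Flag boxes that cover (almost) the entire frame.

    The threshold is an arbitrary value chosen for illustration.
    """
    return bbox_frame_coverage(bbox, frame_width, frame_height) >= threshold


# Example with a 640x480 frame (made-up numbers for illustration).
print(looks_like_full_frame((0, 0, 640, 480), 640, 480))     # whole-frame box
print(looks_like_full_frame((100, 50, 300, 450), 640, 480))  # person-sized box
```

If nearly every frame is flagged, the pose model is probably receiving the full image rather than a person crop, which would match the behavior described above.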