darcula1993 opened 2 years ago
Thanks for your feedback. Could you please provide the command that raised this error so we can locate the problem?
I just came to GitHub to report similar/related problems. I tested with:
python demo/body3d_two_stage_video_demo.py demo/mmdetection_cfg/faster_rcnn_r50_fpn_coco.py https://download.openmmlab.com/mmdetection/v2.0/faster_rcnn/faster_rcnn_r50_fpn_1x_coco/faster_rcnn_r50_fpn_1x_coco_20200130-047c8118.pth configs/body/2d_kpt_sview_rgb_img/topdown_heatmap/coco/hrnet_w48_coco_256x192.py https://download.openmmlab.com/mmpose/top_down/hrnet/hrnet_w48_coco_256x192-b9e0b3ab_20200708.pth configs/body/3d_kpt_sview_rgb_vid/video_pose_lift/mpi_inf_3dhp/videopose3d_mpi-inf-3dhp_1frame_fullconv_supervised_gt.py https://download.openmmlab.com/mmpose/body3d/videopose/videopose_mpi-inf-3dhp_1frame_fullconv_supervised_gt-d6ed21ef_20210603.pth --video-path VIDEO.mp4 --rebase-keypoint-height --show
and got:
Stage 1: 2D pose detection.
Initializing model...
load checkpoint from http path: https://download.openmmlab.com/mmdetection/v2.0/faster_rcnn/faster_rcnn_r50_fpn_1x_coco/faster_rcnn_r50_fpn_1x_coco_20200130-047c8118.pth
load checkpoint from http path: https://download.openmmlab.com/mmpose/top_down/hrnet/hrnet_w48_coco_256x192-b9e0b3ab_20200708.pth
Running 2D pose detection inference...
[>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> ] 239/240, 3.8 task/s, elapsed: 63s, ETA: 0s
Stage 2: 2D-to-3D pose lifting.
Initializing model...
load checkpoint from http path: https://download.openmmlab.com/mmpose/body3d/videopose/videopose_mpi-inf-3dhp_1frame_fullconv_supervised_gt-d6ed21ef_20210603.pth
Traceback (most recent call last):
File "/home/allgeuer/Programs/DeepStack/envs/mmdev/mmpose/demo/body3d_two_stage_video_demo.py", line 377, in <module>
main()
File "/home/allgeuer/Programs/DeepStack/envs/mmdev/mmpose/demo/body3d_two_stage_video_demo.py", line 290, in main
res['keypoints'] = convert_keypoint_definition(
File "/home/allgeuer/Programs/DeepStack/envs/mmdev/mmpose/demo/body3d_two_stage_video_demo.py", line 62, in convert_keypoint_definition
raise NotImplementedError
NotImplementedError
A quick check of `convert_keypoint_definition()` shows that the only 3D keypoint definition that is supported is `Body3DH36MDataset`, which is incompatible with videopose3d. This should be fixed.
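To make the failure mode concrete, here is a hedged sketch (not the upstream code) of the dispatch shape in `demo/body3d_two_stage_video_demo.py`; the actual COCO-to-H36M remapping is omitted, and the `Body3DMpiInf3dhpDataset` branch mentioned in the comment is the hypothetical addition that would be needed:

```python
import numpy as np

# Hedged sketch of the dispatch pattern in convert_keypoint_definition();
# only the Body3DH36MDataset branch is handled upstream, so any other
# lifting dataset falls through to the bare NotImplementedError seen above.
def convert_keypoint_definition(keypoints, pose_det_dataset, pose_lift_dataset):
    if pose_lift_dataset == 'Body3DH36MDataset':
        # The real script remaps the detector's COCO keypoints to the H36M
        # layout here; this sketch just passes them through.
        return np.asarray(keypoints)
    # A 'Body3DMpiInf3dhpDataset' branch would have to be added here to
    # support the MPI-INF-3DHP videopose3d configs.
    raise NotImplementedError(
        f'Keypoint conversion to {pose_lift_dataset} is not supported')
```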
I have my own code that uses the MMPose API to bypass this with a manual conversion, but then you hit a problem with the following lines of code from `mmpose/apis/inference_3d.py`:
assert 'stats_info' in dataset_info._dataset_info
bbox_center = dataset_info._dataset_info['stats_info']['bbox_center']
bbox_scale = dataset_info._dataset_info['stats_info']['bbox_scale']
The problem is that `stats_info` is not defined in the `configs/_base_/datasets/mpi_inf_3dhp.py` file. As a quick fix I just added the following line (copied from H36M):
stats_info=dict(bbox_center=(528., 427.), bbox_scale=400.),
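For reference, these stats feed the 2D pose normalization step before lifting; a minimal sketch of that step (the exact formula in `inference_3d.py` is an assumption here), using the H36M values above:

```python
import numpy as np

# Minimal sketch, assuming the 2D keypoints are normalized by shifting to
# the mean bbox center and dividing by the mean bbox scale before lifting.
bbox_center = np.array([528., 427.])  # values from the H36M dataset config
bbox_scale = 400.

keypoints_2d = np.array([[500., 400.],
                         [560., 450.]])
normalized = (keypoints_2d - bbox_center) / bbox_scale
# normalized == [[-0.07, -0.0675], [0.08, 0.0575]]
```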
Aside from implementing the required keypoint definition conversion that I already pointed out above, could you please compute the mean bbox center and scale for the `mpi_inf_3dhp` dataset and add a correct `stats_info` line, as in the example above?
this is my command:
python demo/body3d_two_stage_img_demo.py \
configs/body/3d_kpt_sview_rgb_vid/video_pose_lift/h36m/videopose3d_h36m_1frame_fullconv_supervised_cpn_ft.py \
/lixinwei/mmpose/work_dirs/videopose3d_h36m_1frame_fullconv_supervised_cpn_ft/best_MPJPE_epoch_150.pth \
--json-file tests/data/h36m/h36m_coco.json \
--img-root tests/data/h36m \
--camera-param-file tests/data/h36m/cameras.pkl \
--only-second-stage \
--out-img-root vis_results \
--rebase-keypoint-height \
--show-ground-truth
@darcula1993 I have tried your command and encountered the same problem.
Actually, the pipelines for image and single-frame inputs in the config files are slightly different. For example, you can check the two config files (e.g. `videopose3d_h36m_1frame_fullconv_supervised_cpn_ft.py`) for more details.
So if you want to run image demo, you'd better use the config file from this folder: https://github.com/open-mmlab/mmpose/tree/master/configs/body/3d_kpt_sview_rgb_img/pose_lift.
I think what both of us were trying to achieve is to run the videopose3d model that uses only one frame at a time as input on video input. This is not the same as running that model on a single input image. With a different config the model may run fine as part of the image demo, but that's not what we are trying to do.
I have quite explicitly documented in my previous post what it would take to resolve the errors I have seen. Is there any update on that?
@pallgeuer Thanks for your interest in this issue.
I think the main concern of this issue is whether we can use the videopose3d one-frame model to do inference on image input, rather than video input.
Actually, the model for the image demo should be `SimpleBaseline3D` and the model for the video demo should be `VideoPose3D`, but they are implemented in a unified way as `PoseLifter` in mmpose, which may cause some confusion.
So my advice is that you'd better run the image demo using the `SimpleBaseline3D` models listed in this folder: https://github.com/open-mmlab/mmpose/tree/master/configs/body/3d_kpt_sview_rgb_img/pose_lift.
> A quick check of `convert_keypoint_definition()` shows that the only 3D keypoint definition that is supported is `Body3DH36MDataset`, which is incompatible with videopose3d. This should be fixed.

Here, you have pointed out that we can only run the demo script on `Body3DH36MDataset` and cannot run it on other datasets like `Body3DMpiInf3dhpDataset`, due to the limitation of the `convert_keypoint_definition` function.
We will check the demo script asap. You can also raise another issue for this discussion.
Okay, yes, rechecking the original issue, it was indeed related to image input, not video; my mistake.
As a goal for mmpose though, I guess there is no reason why it shouldn't be made possible to run it on either image or video input. I haven't tried image input, but I have successfully run it on video with the simple one-line change I mentioned in my first post, in addition to fixing `convert_keypoint_definition`, of course. So it seems it could feasibly be made to work for both image and video without a great amount of change or effort. I guess that's what I was trying to say.
@pallgeuer Thanks for your nice suggestion! Would you like to raise a PR to fix this problem you mentioned?
According to the model architecture, the videopose3d_1frame model should be able to do image 2D-to-3D lifting inference. But when I try, I get this error:
Where does this key come from?