Walter0807 / MotionBERT

[ICCV 2023] PyTorch Implementation of "MotionBERT: A Unified Perspective on Learning Human Motion Representations"
Apache License 2.0

In-the-wild Inference input format #16

Closed RHnejad closed 1 year ago

RHnejad commented 1 year ago

Hello, and thanks for sharing your code. May I ask about the structure of the .json file needed for in-the-wild inference for 3D pose estimation? I want to use 2D estimates from a network other than AlphaPose and am not sure how to structure my 2D poses so they are compatible with your code. Thanks in advance for your help.

fan23j commented 1 year ago

This worked for me:

[ { "image_id": "0.jpg", "category_id": 1, "keypoints": [ 557.1444091796875, 734.3290405273438, 0.5740299224853516, ... ], "score": 1.8695456981658936 }, { "image_id": "1.jpg", ...

Each entry in the JSON represents one prediction frame. Since the authors convert Halpe to H36M, the keypoints entry has length 78 (26 × [x, y, score]).
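
For reference, here is a minimal sketch of writing that format from your own detector's output. The function name and the assumption that your detector already gives you Halpe-26 keypoints per frame are mine, not from the repo:

```python
import json
import numpy as np

def write_alphapose_style_json(keypoints_per_frame, out_path, scores=None):
    """Write per-frame Halpe-26 keypoints to an AlphaPose-style JSON list.

    keypoints_per_frame: iterable of (26, 3) arrays with (x, y, confidence) rows.
    scores: optional per-frame detection scores; defaults to 1.0.
    """
    entries = []
    for i, kpts in enumerate(keypoints_per_frame):
        kpts = np.asarray(kpts, dtype=float)
        assert kpts.shape == (26, 3), "expected Halpe-26 keypoints as (x, y, score)"
        entries.append({
            "image_id": f"{i}.jpg",                  # one entry per frame
            "category_id": 1,                        # person
            "keypoints": kpts.reshape(-1).tolist(),  # flattened to length 78
            "score": float(scores[i]) if scores is not None else 1.0,
        })
    with open(out_path, "w") as f:
        json.dump(entries, f)

# Example with dummy keypoints for 100 frames of a 1920x1080 video:
# frames = [np.random.rand(26, 3) * [1920, 1080, 1] for _ in range(100)]
# write_alphapose_style_json(frames, "wild_keypoints.json")
```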

RHnejad commented 1 year ago

Thank you very much for your answer.

ywdong commented 1 year ago

It's quite hard for me to install AlphaPose. Can I use another 2D pose estimation model instead?

fan23j commented 1 year ago

Any pose estimation model should work as long as you properly format the output JSON. I personally used a YOLOv5 + ViTPose top-down pipeline.

ywdong commented 1 year ago

Hi fan23j, can you share your script for converting the ViTPose output into this input format? Thanks!

fan23j commented 1 year ago

Sure, give me some time to upload it to GitHub.

fan23j commented 1 year ago

Here you go. Let me know if you run into any issues. https://github.com/fan23j/yolov5-vitpose-video-annotator

Walter0807 commented 1 year ago

> Any pose estimation model should work as long as you properly format the output JSON. I personally used a YOLOv5 + ViTPose top-down pipeline.

That's correct, thanks Jack! For other 2D pose estimators (and keypoint formats), you only need to change lines 68-77 in https://github.com/Walter0807/MotionBERT/blob/main/lib/data/dataset_wild.py (read the 2D results and convert them to H36M format).
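
For anyone starting from a COCO-17 detector, an illustrative sketch of such a conversion follows. This is my own approximation (including the joint order and the interpolated pelvis/spine/thorax/head joints), not the repo's conversion code, so double-check it against dataset_wild.py:

```python
import numpy as np

def coco2h36m(x):
    """Convert COCO-17 keypoints to an H36M-style 17-joint layout.

    x: (T, 17, 3) array of per-frame COCO keypoints as (x, y, confidence).
    Returns a (T, 17, 3) array in the order: pelvis, R hip, R knee, R ankle,
    L hip, L knee, L ankle, spine, thorax, nose, head, L shoulder, L elbow,
    L wrist, R shoulder, R elbow, R wrist.

    Joints that COCO does not provide (pelvis, spine, thorax, head) are
    approximated by interpolating neighbouring joints.
    """
    y = np.zeros_like(x)
    y[:, 0] = (x[:, 11] + x[:, 12]) * 0.5   # pelvis = mid-hip
    y[:, 1] = x[:, 12]                      # right hip
    y[:, 2] = x[:, 14]                      # right knee
    y[:, 3] = x[:, 16]                      # right ankle
    y[:, 4] = x[:, 11]                      # left hip
    y[:, 5] = x[:, 13]                      # left knee
    y[:, 6] = x[:, 15]                      # left ankle
    y[:, 8] = (x[:, 5] + x[:, 6]) * 0.5     # thorax = mid-shoulder
    y[:, 7] = (y[:, 0] + y[:, 8]) * 0.5     # spine = pelvis-thorax midpoint
    y[:, 9] = x[:, 0]                       # nose
    y[:, 10] = (x[:, 1] + x[:, 2]) * 0.5    # head ~ mid-eyes
    y[:, 11] = x[:, 5]                      # left shoulder
    y[:, 12] = x[:, 7]                      # left elbow
    y[:, 13] = x[:, 9]                      # left wrist
    y[:, 14] = x[:, 6]                      # right shoulder
    y[:, 15] = x[:, 8]                      # right elbow
    y[:, 16] = x[:, 10]                     # right wrist
    return y
```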

Thanks for your interest in our work; please let me know if you have further questions.