PINTO0309 / PINTO_model_zoo

A repository for storing models that have been inter-converted between various frameworks. Supported frameworks are TensorFlow, PyTorch, ONNX, OpenVINO, TFJS, TFTRT, TensorFlowLite (Float32/16/INT8), EdgeTPU, CoreML.
https://qiita.com/PINTO
MIT License

Questions about BlazePose #97

Closed vladmandic closed 3 years ago

vladmandic commented 3 years ago

BlazePose consists of three models: a detector, a full-body keypoint model, and an upper-body keypoint model.

Issue 1: All MediaPipe docs refer to an older version of the model with 33 keypoints instead of 39.

Does anyone have a definitive list of keypoint annotations? I think I worked it out by trial and error for the full-body model, but the upper-body model doesn't match.

Issue 2: The keypoint models work fine if the body is centered in the frame, which is why there is a detector model in the first place.

But what is the output of the detector model? It returns two tensors with shapes [1, 896] and [896, 12].

My best guess is to run argMax on the first tensor and use the resulting index to extract 12 values from the second tensor as 6 points. I still have no idea what those points are.
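For anyone following along, here is a rough Python sketch of the argMax approach described above. It is not taken from any official implementation. Two things in it are assumptions: that the [1, 896] tensor holds raw logits that need a sigmoid, and that each 12-value row is a 4-value box followed by 4 (x, y) keypoints rather than 6 points. The function name `pick_best_detection` is made up for illustration, and the sketch skips any anchor decoding, so the values it returns are raw model outputs, not pixel coordinates.

```python
import math
from typing import List, Tuple


def sigmoid(x: float) -> float:
    # Clamp the logit so math.exp cannot overflow on extreme values.
    x = max(-100.0, min(100.0, x))
    return 1.0 / (1.0 + math.exp(-x))


def pick_best_detection(
    scores: List[List[float]],      # shape [1, 896], assumed to be raw logits
    regressors: List[List[float]],  # shape [896, 12]
) -> Tuple[int, float, List[float], List[Tuple[float, float]]]:
    logits = scores[0]
    # argMax over the 896 candidates.
    best = max(range(len(logits)), key=lambda i: logits[i])
    row = regressors[best]
    # ASSUMPTION: the first 4 values describe a box, the remaining 8 are
    # 4 keypoints stored as (x, y) pairs.
    box = row[:4]
    keypoints = [(row[i], row[i + 1]) for i in range(4, len(row), 2)]
    return best, sigmoid(logits[best]), box, keypoints
```

Taking only the top-scoring row is enough for a single person in frame; with several people you would need to keep every row above a score threshold and suppress overlapping ones.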

All I could find is this:

... we trained a face detector, inspired by our sub-millisecond BlazeFace model, as a proxy for a pose detector. ... two additional virtual keypoints that firmly describe the human body center, rotation and scale as a circle.

Any clue?

I've tried rescaling and plotting them, but they don't make much sense.
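One way to read the quoted description is that one virtual keypoint is the body center and the other lies on the circle around it, so the pair gives center, scale, and rotation. The sketch below only illustrates that reading. Which keypoint plays which role, and measuring rotation from "straight up" in image coordinates, are both assumptions, and `circle_from_keypoints` is a hypothetical helper name.

```python
import math
from typing import Tuple

Point = Tuple[float, float]


def circle_from_keypoints(center: Point, edge: Point) -> Tuple[Point, float, float]:
    """Derive center, radius (scale) and rotation from two virtual keypoints.

    ASSUMPTION: `center` is the body center and `edge` is a point on the
    circle. Image coordinates are used, with y growing downward.
    """
    cx, cy = center
    ex, ey = edge
    dx, dy = ex - cx, ey - cy
    # Scale: the distance between the two points is the circle radius.
    radius = math.hypot(dx, dy)
    # Rotation in radians, 0 when `edge` is directly above `center`,
    # positive when the body leans toward +x.
    rotation = math.atan2(dx, -dy)
    return (cx, cy), radius, rotation
```

With the circle in hand, a rotated square crop around the center can be fed to the keypoint model, which would explain why the keypoint models expect a centered body.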

None of the implementations I could find in the wild use the detector; they only run the keypoint models.

Btw, if anyone wants to see my test implementation (like I said, full and upper decoders are working nicely), it's at:
https://github.com/vladmandic/blazepose

vladmandic commented 3 years ago

found it in https://github.com/geaxgx/depthai_blazepose

ovshake commented 1 year ago

Hi @vladmandic, I tried plotting x, y, z as well as just x, y. The (x, y) points match the image, but (x, y, z) does not. Is any preprocessing required to get the correct z coordinate?