mishizhiyou opened this issue 4 years ago
@yukihiko @h-nakae Thank you for this wonderful real-time 3D joint estimation model! I am planning to integrate the ONNX model below into mediapipe so that finger joints can also be captured. https://digital-standard.com/threedpose/models/Resnet34_3inputs_448x448_20200609.onnx
I want to obtain the joints' 3D keypoints in camera coordinates from this ONNX model. Is my understanding correct that Pos3D at the link below is in camera coordinates? Also, since calibration does not seem to be required, what assumptions do you make about the camera's intrinsic parameters? https://github.com/digital-standard/ThreeDPoseTracker/blob/master/Assets/Scripts/VNectModel.cs#L64-L69
I need to calculate 3D keypoints in camera coordinates, and I am guessing that Pos3D at the link below is in camera coordinates. Is that correct? https://github.com/digital-standard/ThreeDPoseTracker/blob/master/Assets/Scripts/VNectModel.cs#L64-L69
Hi @mishizhiyou (I think I answered the exact same question somewhere). The coordinate values from the pose-estimation model are image-based: (x, y) will be close to the joint's position in the image, and the depth is relative to the estimated position of the person's waist. Because the depth is only relative to the waist, it cannot be converted to world coordinates.
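To restate that concretely, here is a minimal sketch (not code from this repository; the type and method names are hypothetical) of how the output has to be read: (x, y) are image-based positions, and z is only meaningful as an offset from the estimated waist depth, never as an absolute camera-space depth.

```csharp
using UnityEngine;

// Illustrative sketch only, based on the explanation above.
public static class OutputInterpretationSketch
{
    // joint and waist are positions from the network output:
    //   x, y : roughly pixel positions in the input image
    //   z    : a depth value that is only meaningful relative to the waist
    public static float DepthRelativeToWaist(Vector3 joint, Vector3 waist)
    {
        // The only well-defined depth quantity is this offset; there is no
        // absolute camera-space z to recover without extra information.
        return joint.z - waist.z;
    }
}
```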
Hi @xiong-jie-y. You mean MultiHandTracking, right? I also got partway through that and then left it as it was.
Please see the slides below; they are from a talk I gave at an event, so they should still be viewable. https://docs.google.com/presentation/d/1pT9cb-FA2lO60hP2iWOMkUe8d29TFCMJ/edit#slide=id.p1 Page 10 should be the relevant part. That figure shows the dataset, but the output is simply the reverse, so the same explanation applies. If the dark black square around Unity-chan is the 448x448 input image, the output is the 448x448x448 cube drawn with thin gray lines (the depth range is -224 to 224). Now3D is the raw value output by the model, and Pos3D is that value with some filtering applied. Except for the waist position, the avatar's movement uses only angles (directions): it is all done with LookRotation.
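As a rough illustration of that last point, here is a minimal Unity C# sketch (assumed names, not code from this repository) of driving a bone from two joints of the 448x448x448 output cube using only a direction and Quaternion.LookRotation:

```csharp
using UnityEngine;

public class BoneOrientationSketch : MonoBehaviour
{
    public Transform bone; // hypothetical bone to drive, e.g. an upper arm

    // parentJoint/childJoint come from the output cube described above:
    // x, y in 0..448 (image-based), z in -224..224.
    public void OrientBone(Vector3 parentJoint, Vector3 childJoint, Vector3 upHint)
    {
        // Only the direction between joints is used; the joint positions
        // themselves never place the bone, matching "everything except the
        // waist position uses only angles".
        Vector3 forward = (childJoint - parentJoint).normalized;
        bone.rotation = Quaternion.LookRotation(forward, upHint);
    }
}
```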
@yukihiko So if I want camera coordinates, it seems I would need to combine this with depth estimation or build an estimator for the waist's position in camera coordinates. Thank you for the clear explanation!
Hi @yukihiko, thanks for your explanation; it's clear and helpful. I have another question: if x and y are in image coordinates, how do we drive a 3D model in world coordinates? I think a 3D model needs a translation and rotation in world coordinates.
Is it possible that by "world coordinate system" you mean a coordinate system in Unity? For a skeleton, you can simply plot the joint coordinates as-is (there is a scale adjustment, but that is what the skeleton in this code does) and you're good to go. If the person is farther away, the person in the image appears smaller; the skeleton in the output is then simply smaller too, because the estimate is in an image-based coordinate system, and it does not move forward or backward. When moving the avatar, forward and backward movement is pseudo-calculated from changes in height, and the movements of the arms and legs are based on the joint angles.
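A minimal sketch of the height-based pseudo depth described above (hypothetical names; this is not the repository's implementation): since the estimate is image-based, the skeleton's apparent height shrinks with distance, and the ratio against a reference height can stand in for forward/backward movement.

```csharp
using UnityEngine;

public class PseudoDepthSketch
{
    private readonly float referenceHeight; // apparent skeleton height at a reference distance

    public PseudoDepthSketch(float referenceHeight)
    {
        this.referenceHeight = referenceHeight;
    }

    // A person roughly twice as far away appears about half as tall, so the
    // inverse height ratio gives a pseudo depth scale for the avatar's root.
    public float DepthScale(float currentHeight)
    {
        return referenceHeight / Mathf.Max(currentHeight, 1e-5f);
    }
}
```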
@xiong-jie-y Did you try the mediapipe alternative to this?
> Is it possible that by "world coordinate system" you mean a coordinate system in Unity? For a skeleton, you can simply plot the joint coordinates as-is (there is a scale adjustment, but that is what the skeleton in this code does) and you're good to go. If the person is farther away, the person in the image appears smaller; the skeleton in the output is then simply smaller too, because the estimate is in an image-based coordinate system, and it does not move forward or backward. When moving the avatar, forward and backward movement is pseudo-calculated from changes in height, and the movements of the arms and legs are based on the joint angles.
It's a brilliant solution. @yukihiko, if you could point to the relevant code snippets in your repository for this, it would be very helpful.
Hi @yukihiko @h-nakae @digista-tanaka, Thanks for this awesome project!
I checked the project and found that the learned joint locations are in pixel coordinates. Can you please tell me how to convert these locations from pixel coordinates to world coordinates? I tried the standard transform chain pixel coordinates -> camera coordinates -> world coordinates; I get correct `x` and `y` values, but the `z` value is wrong, so I guess there is something I have missed. Could you please advise on how to proceed? Thanks in advance.
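For reference, the "standard position transform" in question presumably means pinhole back-projection, sketched below (with hypothetical intrinsics fx, fy, cx, cy). It also shows why `z` comes out wrong here: the formula needs an absolute metric depth, while this model's z is only relative to the estimated waist.

```csharp
using UnityEngine;

public static class PinholeSketch
{
    // Standard pinhole back-projection from a pixel (u, v) at depth Z:
    //   X = (u - cx) * Z / fx,  Y = (v - cy) * Z / fy
    // This requires an absolute depth Z in metric units, which the model
    // does not provide (its z is relative to the estimated waist).
    public static Vector3 PixelToCamera(float u, float v, float depth,
                                        float fx, float fy, float cx, float cy)
    {
        float x = (u - cx) * depth / fx;
        float y = (v - cy) * depth / fy;
        return new Vector3(x, y, depth);
    }
}
```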