homuler / MediaPipeUnityPlugin

Unity plugin to run MediaPipe
MIT License

How to scale Landmark points relative to the texture size in v0.8.0 #367

Closed: Arham-Aalam closed this issue 2 years ago

Arham-Aalam commented 2 years ago

Hi @homuler, I am using the pose tracking solution and want to scale the resulting normalized landmarks. I saw that you are using this code with a scaling vector of (100,100,100), but when I scale them myself using GetLocalPositionNormalized(...) with the same scale, I do not get the correct mapping over the canvas. I also tried scaling with the texture width and height:

textureWidth = textureFrame.width;
textureHeight = textureFrame.height;

Are you doing anything additional, or should I change the scaling vector?

Scaling in V0.0.6 was much easier, like this example I made a few weeks ago.

Thanks, Arham

homuler commented 2 years ago

"the scaling vector is (100,100,100)"

Do you mean this code? https://github.com/homuler/MediaPipeUnityPlugin/blob/a3b90d13eac636ff837d52723eaa6f0046adb459/Packages/com.github.homuler.mediapipe/Runtime/Scripts/Unity/Annotation/PoseWorldLandmarkListAnnotationController.cs#L15

Arham-Aalam commented 2 years ago

Yes

homuler commented 2 years ago

PoseWorldLandmarkListAnnotationController is used to annotate pose_world_landmarks, which is a LandmarkList, not a NormalizedLandmarkList. pose_world_landmarks uses real-world 3D coordinates in meters and is not proportional to the size of the input image. This is why PoseWorldLandmarkListAnnotationController has a predefined _scale vector.

x, y and z: Real-world 3D coordinates in meters with the origin at the center between hips.

If you want to know the position on the input image, you need to use pose_landmarks.

x and y: Landmark coordinates normalized to [0.0, 1.0] by the image width and height respectively.

The current ImageCoordinate API is not very good: it requires a RectTransform argument (the receiver, to be precise) to represent the dimensions of the input image. https://github.com/homuler/MediaPipeUnityPlugin/blob/bed2f478eddf3b8be106adcc2044e576e8337e22/Packages/com.github.homuler.mediapipe/Runtime/Scripts/Unity/CoordinateSystem/ImageCoordinate.cs#L271-L282
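As a rough illustration of what mapping a normalized landmark onto a rect involves, here is a minimal plain C# sketch that runs outside Unity. `RectSketch` and `NormalizedToLocal` are hypothetical names, not part of the plugin. The sketch assumes no rotation, no mirroring, and an image origin at the top-left with y pointing down, so it is a simplification rather than the plugin's implementation.

```csharp
using System;

// Hypothetical stand-in for a rect (x/y minimum plus size), so the sketch runs outside Unity.
public readonly struct RectSketch
{
    public readonly float XMin, YMin, Width, Height;

    public RectSketch(float xMin, float yMin, float width, float height)
    {
        XMin = xMin; YMin = yMin; Width = width; Height = height;
    }

    public float XMax => XMin + Width;
    public float YMax => YMin + Height;
}

public static class LandmarkMappingSketch
{
    // Linear interpolation without clamping, so values outside [0, 1] still map.
    private static float LerpUnclamped(float a, float b, float t) => a + (b - a) * t;

    // Assumption: no rotation, no mirroring, image origin at top-left with y down.
    public static (float x, float y, float z) NormalizedToLocal(
        RectSketch rect, float nx, float ny, float nz)
    {
        var x = LerpUnclamped(rect.XMin, rect.XMax, nx);
        // The image y axis points down while the rect's y axis points up, so flip it.
        var y = LerpUnclamped(rect.YMax, rect.YMin, ny);
        // Assumption: z is scaled by the width, i.e. roughly the same scale as x.
        var z = rect.Width * nz;
        return (x, y, z);
    }
}
```

With a rect whose minimum corner is (-754, -565.5) and whose size is 1508 x 1131, the normalized point (0.5, 0.5) maps to the rect center (0, 0), and (0, 0) maps to the top-left corner (-754, 565.5). The result is relative to the rect, so it is only meaningful as a position local to the object that owns it.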

Arham-Aalam commented 2 years ago

Got it, so (100,100,100) is used only for world landmarks. But where did you pass RectTransform rectTransform to the GetLocalPosition function?

Here you just passed the target, rotationAngle and isMirrored. If I know the value of rectTransform, I think this will work for me as well. https://github.com/homuler/MediaPipeUnityPlugin/blob/a3b90d13eac636ff837d52723eaa6f0046adb459/Packages/com.github.homuler.mediapipe/Runtime/Scripts/Unity/Annotation/PointAnnotation.cs#L65

The plugin needs some documentation. Let me know how I can contribute to this.

Arham-Aalam commented 2 years ago

Looks like the RectTransform is coming from here: https://github.com/homuler/MediaPipeUnityPlugin/blob/a3b90d13eac636ff837d52723eaa6f0046adb459/Packages/com.github.homuler.mediapipe/Runtime/Scripts/Unity/Annotation/HierarchicalAnnotation.cs#L35

Could you please explain how I can define or calculate my own RectTransform? Thanks

homuler commented 2 years ago

If it's OK, please just add a RectTransform to the GameObject where the input image is displayed. In the sample app, a RectTransform is attached to Annotatable Screen via the Add Component menu (note that a RawImage is also attached to it). [Screenshot: Screenshot_20211203_191411]

As for the annotation, the child GameObject called Annotation Layer also has a RectTransform attached to it, which is actually referenced by Hierarchical Annotation (but in your case, I think you don't need to create a child GameObject like this). https://github.com/homuler/MediaPipeUnityPlugin/blob/a3b90d13eac636ff837d52723eaa6f0046adb459/Packages/com.github.homuler.mediapipe/Runtime/Scripts/Unity/Annotation/AnnotationController.cs#L11-L19 https://github.com/homuler/MediaPipeUnityPlugin/blob/a3b90d13eac636ff837d52723eaa6f0046adb459/Packages/com.github.homuler.mediapipe/Runtime/Scripts/Unity/Annotation/AnnotationController.cs#L58-L64

However, as I wrote in https://github.com/homuler/MediaPipeUnityPlugin/issues/367#issuecomment-985094266, the current API is not very good, so I'm planning to change it so that it can be called like this:

// var position = GetAnnotationLayer().GetLocalPosition(target, rotationAngle, isMirrored);
var position = target.GetLocalPosition(inputWidth, inputHeight, rotationAngle, isMirrored);

Arham-Aalam commented 2 years ago

I think I am doing something wrong in the scaling. Here is what I did in the PoseTracking solution, in the Run function:

rectTransform = GameObject.Find("Annotatable Screen").GetComponent<RectTransform>();

Then, once I have the NormalizedLandmark list, I scale the landmarks with these functions, using rectTransform:

// Inside posetracking solution (Ignore scale parameter)
landmarkModels = PoseUtils.prepareLandmarks(normalizedLandmarkList, new Vector3(100, 100, 1), rectTransform);

// Inside poseUtils
public static List<LandmarkModel> prepareWorldLandmarks(LandmarkList normalizedLandmarkList, Vector3 scale)
{
    List<LandmarkModel> list = new List<LandmarkModel>();
    if (normalizedLandmarkList == null || normalizedLandmarkList.Landmark.Count == 0) return list;

    for (int i = 0; i < normalizedLandmarkList.Landmark.Count; ++i)
    {
        Vector3 point = GetLocalPosition(normalizedLandmarkList.Landmark[i].X, normalizedLandmarkList.Landmark[i].Y, normalizedLandmarkList.Landmark[i].Z,
            scale);
        list.Add(new LandmarkModel(point.x, point.y, point.z, normalizedLandmarkList.Landmark[i].Visibility, normalizedLandmarkList.Landmark[i].Presence));
    }

    return list;
}

public static Vector3 GetLocalPositionNormalized(RectTransform rectTransform, float normalizedX, float normalizedY, float normalizedZ, float zScale, RotationAngle imageRotation = RotationAngle.Rotation0, bool isMirrored = false)
{
    var rect = rectTransform.rect;
    var isInverted = IsInverted(imageRotation);
    var (nx, ny) = isInverted ? (normalizedY, normalizedX) : (normalizedX, normalizedY);
    var x = IsXReversed(imageRotation, isMirrored) ? Mathf.LerpUnclamped(rect.xMax, rect.xMin, nx) : Mathf.LerpUnclamped(rect.xMin, rect.xMax, nx);
    var y = IsYReversed(imageRotation, isMirrored) ? Mathf.LerpUnclamped(rect.yMax, rect.yMin, ny) : Mathf.LerpUnclamped(rect.yMin, rect.yMax, ny);
    var z = zScale * normalizedZ;
    return new Vector3(x, y, z);
}

public static Vector3 GetLocalPositionNormalized(RectTransform rectTransform, float normalizedX, float normalizedY, float normalizedZ, RotationAngle imageRotation = RotationAngle.Rotation0, bool isMirrored = false)
{
    // Z usually uses roughly the same scale as X
    var zScale = IsInverted(imageRotation) ? rectTransform.rect.height : rectTransform.rect.width;
    return GetLocalPositionNormalized(rectTransform, normalizedX, normalizedY, normalizedZ, zScale, imageRotation, isMirrored);
}

These functions are returning positions based on the world coordinates (inside the 3D room). Am I using the correct functions to scale my normalized coordinates?

homuler commented 2 years ago

Could you describe more clearly what you want to do, what you're doing now, and what's currently not working as expected?

P.S. You don't need to copy and paste GetLocalPositionNormalized. Just add using Mediapipe.Unity.CoordinateSystem; to the source file and you can call it like rectTransform.GetLocalPosition(...).

Arham-Aalam commented 2 years ago

"what you want to do?" I am trying to scale the landmarks as you did with the point annotations.

"what you're doing now?" I am trying to scale the landmark points using your utility functions. In PoseUtils.prepareLandmarks I loop through the landmarks, scale each one using the GetLocalPositionNormalized function, and return the scaled list. I have just updated the code for PoseUtils#prepareLandmarks in my comment above.

"what's currently not working as expected?" My landmark points are not scaling as expected.

I tried this approach as you said:

// in PoseTrackingSolution
rectTransform = GameObject.Find("Annotatable Screen").GetComponent<RectTransform>();
landmarkModels = PoseUtils.prepareLandmarks(normalizedLandmarkList, rectTransform);

// In my PoseUtils
public static List<LandmarkModel> prepareLandmarks(NormalizedLandmarkList normalizedLandmarkList, RectTransform rectTransform)
{
    List<LandmarkModel> list = new List<LandmarkModel>();
    if (normalizedLandmarkList == null || normalizedLandmarkList.Landmark.Count == 0) return list;

    for (int i = 0; i < normalizedLandmarkList.Landmark.Count; ++i)
    {
        Vector3 point = rectTransform.GetLocalPosition(normalizedLandmarkList.Landmark[i]);
        list.Add(new LandmarkModel(point.x, point.y, point.z, normalizedLandmarkList.Landmark[i].Visibility, normalizedLandmarkList.Landmark[i].Presence));
    }

    return list;
}

But I am getting the nose point at the bottom of the screen (see the little pink ball in the image):

[image]

When I ran Debug.Log(rectTransform.rect); it printed (x:-754.00, y:-565.50, width:1508.00, height:1131.00)

homuler commented 2 years ago

I'm not sure, but maybe this Sphere is not created as a child object of the screen? Note that these methods calculate the local position, not the position in 3D world space. https://github.com/homuler/MediaPipeUnityPlugin/blob/bed2f478eddf3b8be106adcc2044e576e8337e22/Packages/com.github.homuler.mediapipe/Runtime/Scripts/Unity/CoordinateSystem/ImageCoordinate.cs#L271-L282 https://docs.unity3d.com/ScriptReference/Transform-localPosition.html
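To see why a local position only lands in the right place under the right parent, here is a small plain C# sketch that uses System.Numerics rather than Unity types. The name `LocalToWorld` and the parent placement are hypothetical, and the parent is assumed to have no rotation and unit scale, so this is an illustration of the idea rather than what Unity does internally.

```csharp
using System;
using System.Numerics;

public static class LocalVsWorldSketch
{
    // Converts a position expressed relative to a parent into world space.
    // Assumption: the parent is only translated (no rotation, unit scale).
    public static Vector3 LocalToWorld(Vector3 local, Vector3 parentWorldPosition)
    {
        var parentMatrix = Matrix4x4.CreateTranslation(parentWorldPosition);
        return Vector3.Transform(local, parentMatrix);
    }
}
```

If the screen sits at a hypothetical world position of (0, 0, 100), a landmark at local (-754, 565.5, 0) belongs at world (-754, 565.5, 100). Writing the same local value directly into a world position treats the world origin as the parent instead, which puts the object in the wrong place whenever the screen is not at the origin.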

Arham-Aalam commented 2 years ago

Cool! Making it a child of Annotatable Screen worked for me. I also set sphere.transform.localPosition instead of sphere.transform.position.

I have one separate question (if you want me to create a new issue, let me know): how do I resize Annotatable Screen to portrait mode? The previous version was easy because we were able to change the WebCamTexture plane. Thanks.

homuler commented 2 years ago

"I have one separate question (If you want me to create a new issue then let me know)."

I don't know what exactly you're having trouble with, so please create a new issue and explain the details.