Multiple pose detection not working with the iOS camera, while OK in the Unity Editor on Mac

fredmouse commented 3 weeks ago

Plugin Version or Commit ID

v0.14.1 and v0.14.3

Unity Version

2022.3.13f1

Your Host OS

macOS Ventura

Target Platform

iOS

Target Device

iPhone 15 (iOS 17.5.1) and other iPhones

[Windows Only] Visual Studio C++ and Windows SDK Version

No response

[Linux Only] GCC/G++ and GLIBC Version

No response

[Android Only] Android Build Tools and NDK Version

No response

[iOS Only] XCode Version

15.1

Build Command

MediaPipeUnityPlugin-all-stripped project

Bug Description

When the image source is set to the web camera and multiple poses are enabled, the landmarks form random shapes on the iOS device (as in the attached image), although the same setup works correctly in the Unity Editor. When the image source is set to video, it works fine both in the Unity Editor and on the iOS device.

Steps to Reproduce the Bug

1. Set Default Image Source in App Settings to Web Camera.
2. Install the scene Tasks/Pose Landmark Detection on an iOS device.
3. Change Num Poses to 2 in Configuration (or set NumPoses to 2 or more in code before installing to the device).
4. The pose landmarks form random shapes, as in the attached image.

Log

no log

Screenshot/Video

Additional Context

No response

homuler commented 3 weeks ago

If there's an issue with the sample app, it's possible that the input image isn't being sent to MediaPipe in the correct orientation. If you change the values of flipHorizontally, flipVertically or rotationAngle, could the accuracy improve? https://github.com/homuler/MediaPipeUnityPlugin/blob/bd417bfd463d2da9cae0d9f9f7d29dd9ef4fabe6/Assets/MediaPipeUnity/Samples/Scenes/Tasks/Pose%20Landmark%20Detection/PoseLandmarkerRunner.cs#L66-L68

I've experienced delays in detecting poses before tracking begins (once a pose is detected correctly, it is tracked and the landmark positions stop flickering), so it's possible that this is what is happening here. Since that is a MediaPipe-related issue, there's not much I can do about it.

fredmouse commented 3 weeks ago

I should add that everything works fine when NumPoses is set to 1, including camera input on the iOS device. So I'm afraid it has nothing to do with the input image or the flip settings, unless something internal works differently when NumPoses > 1. To avoid any misunderstanding: the attached image shows the iPhone camera feed pointed at a video playing on a Mac screen. My point is that changing NumPoses from 1 to 2 is the only thing that turns normal tracking on the camera feed into flickering tracking on the iOS device.

homuler commented 3 weeks ago

I see. In that case, I believe it's an issue with MediaPipe itself, so I would recommend asking the MediaPipe development team about it. Perhaps adjusting the value of minPoseDetectionConfidence might make it more stable, but I'm not sure.

https://github.com/homuler/MediaPipeUnityPlugin/blob/bd417bfd463d2da9cae0d9f9f7d29dd9ef4fabe6/Packages/com.github.homuler.mediapipe/Runtime/Scripts/Tasks/Vision/PoseLandmarker/PoseLandmarkerOptions.cs#L52-L55

fredmouse commented 3 weeks ago

I seem to have found some leads on this issue in the MediaPipe issue tracker: https://github.com/google-ai-edge/mediapipe/issues/5273 It looks like the same issue I ran into with the pose landmarker on iOS, and they provide a new pose landmarker example for iOS as a way to bypass the problem. I have tried the latest pose landmarker example, and it works fine with multiple poses on iOS. However, since the history shows the new example was first added in March, I shall assume that your pose example in tasks was not based on this new example. Should I be looking forward to any revision on this in the near future? =)

homuler commented 3 weeks ago

> I shall assume that your pose example in tasks was not based on this new example.

The sample code seems to be just a port of the Task API to Swift, so I couldn't identify any significant differences.

Could you check if both samples are running with the same parameters? Specifically, I would like you to confirm the model being used (i.e. lite, full, heavy), runningMode (i.e. image, video, live_stream), delegate (i.e. CPU, GPU), and the confidence parameters.

fredmouse commented 3 weeks ago

I have checked that both samples run in live_stream mode on the GPU with the lite model, and that the confidence parameters are all set to 0.5. The official MediaPipe iOS sample works OK with both GPU and CPU, and with the lite, full, and heavy models.

homuler commented 3 weeks ago

If possible, could you check if the official sample code runs when ported to MediaPipe version v0.10.9? I want to know if upgrading the MediaPipe version (v0.10.14) would fix the issue.

Also, does changing the Delegate to CPU affect the results? https://github.com/homuler/MediaPipeUnityPlugin/blob/bd417bfd463d2da9cae0d9f9f7d29dd9ef4fabe6/Assets/MediaPipeUnity/Samples/Scenes/Tasks/Pose%20Landmark%20Detection/PoseLandmarkDetectionConfig.cs#L22-L27

fredmouse commented 2 weeks ago

I tried running the official sample on v0.10.9 but failed. I ran pod install with version v0.10.9 and got build errors. According to the history, the new official sample began in March on v0.10.12 and has now been updated to v0.10.15. Maybe it's not compatible with v0.10.9 at all.

homuler commented 2 weeks ago

> Also, does changing the Delegate to CPU affect the results?

@fredmouse Can you confirm this? I'd like to know if just upgrading MediaPipe may resolve the issue.

homuler commented 1 week ago

@fredmouse Can you check if the issue is resolved by v0.14.4? https://github.com/homuler/MediaPipeUnityPlugin/releases/tag/v0.14.4

fredmouse commented 1 week ago

I tried v0.14.4 and the issue is still there. We made no progress on this last week despite trying several options, including changing the Delegate to CPU, so we were not fully focused on it and had moved on to something else. Sorry for the late reply. The weird thing is that only NumPoses > 1 with the iOS camera triggers the issue; otherwise it works perfectly. We are trying to figure out what difference changing NumPoses from 1 to 2 could make in the code. Could you give us some leads on this?

homuler commented 1 week ago

I suspect that the orientation of the input image might not be correct after all (https://github.com/homuler/MediaPipeUnityPlugin/issues/1196#issuecomment-2168131504). When using the front camera (or the back camera), if you open the scene with the device (and the app) rotated to the right (left if using the back camera), doesn't inference stabilize even when NumPoses >= 2? (Note that you need to (re)open the scene after rotating the device)

homuler commented 1 week ago

It seems that this issue occurs when the input image is rotated. This is not limited to iOS; the same issue can be reproduced in the Unity Editor by hard-coding WebCamSource.rotation, as in the following diff.

diff --git a/Assets/MediaPipeUnity/Samples/Common/Scripts/ImageSource/WebCamSource.cs b/Assets/MediaPipeUnity/Samples/Common/Scripts/ImageSource/WebCamSource.cs
index 0e17567..00dfc24 100644
--- a/Assets/MediaPipeUnity/Samples/Common/Scripts/ImageSource/WebCamSource.cs
+++ b/Assets/MediaPipeUnity/Samples/Common/Scripts/ImageSource/WebCamSource.cs
@@ -52,7 +52,7 @@ namespace Mediapipe.Unity

     public override bool isVerticallyFlipped => isPrepared && webCamTexture.videoVerticallyMirrored;
     public override bool isFrontFacing => isPrepared && (webCamDevice is WebCamDevice valueOfWebCamDevice) && valueOfWebCamDevice.isFrontFacing;
-    public override RotationAngle rotation => !isPrepared ? RotationAngle.Rotation0 : (RotationAngle)webCamTexture.videoRotationAngle;
+    public override RotationAngle rotation => !isPrepared ? RotationAngle.Rotation0 : RotationAngle.Rotation90;

     private WebCamDevice? _webCamDevice;
     private WebCamDevice? webCamDevice

Additionally, I tried the official sample, and it appeared to have the same problem (it cannot detect correctly unless the device is held upright when the app is launched).

As a workaround, specifically for PoseLandmarker, setting the rotationDegrees to 0 at all times seems to stabilize the output.

diff --git a/Assets/MediaPipeUnity/Samples/Scenes/Tasks/Pose Landmark Detection/PoseLandmarkerRunner.cs b/Assets/MediaPipeUnity/Samples/Scenes/Tasks/Pose Landmark Detection/PoseLandmarkerRunner.cs
index ceebe85..6fc7a9d 100644
--- a/Assets/MediaPipeUnity/Samples/Scenes/Tasks/Pose Landmark Detection/PoseLandmarkerRunner.cs        
+++ b/Assets/MediaPipeUnity/Samples/Scenes/Tasks/Pose Landmark Detection/PoseLandmarkerRunner.cs        
@@ -65,7 +65,7 @@ namespace Mediapipe.Unity.Sample.PoseLandmarkDetection
       var transformationOptions = imageSource.GetTransformationOptions();
       var flipHorizontally = transformationOptions.flipHorizontally;
       var flipVertically = transformationOptions.flipVertically;
-      var imageProcessingOptions = new Tasks.Vision.Core.ImageProcessingOptions(rotationDegrees: (int)transformationOptions.rotationAngle);
+      var imageProcessingOptions = new Tasks.Vision.Core.ImageProcessingOptions(rotationDegrees: 0);

       AsyncGPUReadbackRequest req = default;
       var waitUntilReqDone = new WaitUntil(() => req.done);
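
The thread only reports unstable output when NumPoses > 1, so one untested variation on the diff above would be to drop the rotation only in that case and keep passing it otherwise. The following is a minimal sketch under that assumption; `PoseRotationWorkaround` and `ResolveRotationDegrees` are hypothetical names that do not exist in the plugin, and whether the single-pose case actually needs the rotation was not verified in this thread.

```csharp
using System;

// Hypothetical helper for illustration only; not part of MediaPipeUnityPlugin.
public static class PoseRotationWorkaround
{
  // Returns the value to pass as rotationDegrees.
  // Assumption: the instability only appears when more than one pose is requested,
  // so the source rotation is dropped only in that case.
  public static int ResolveRotationDegrees(int numPoses, int sourceRotationDegrees)
  {
    if (numPoses < 1)
    {
      throw new ArgumentOutOfRangeException(nameof(numPoses), "numPoses must be at least 1");
    }
    return numPoses > 1 ? 0 : sourceRotationDegrees;
  }
}

public static class Program
{
  public static void Main()
  {
    // Single pose: keep the rotation reported by the image source.
    Console.WriteLine(PoseRotationWorkaround.ResolveRotationDegrees(1, 90)); // prints 90
    // Multiple poses: apply the workaround and always pass 0.
    Console.WriteLine(PoseRotationWorkaround.ResolveRotationDegrees(2, 90)); // prints 0
  }
}
```

In PoseLandmarkerRunner.cs, the result would replace the literal 0 passed as rotationDegrees, with the configured NumPoses value and (int)transformationOptions.rotationAngle as arguments. If the multi-pose output is still unstable, the unconditional change in the diff above remains the confirmed workaround.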

fredmouse commented 6 days ago

Yes, this solves the problem. Thank you for your help.
