homuler / MediaPipeUnityPlugin

Unity plugin to run MediaPipe
MIT License

Graph latency #302

Closed philiprkk closed 3 years ago

philiprkk commented 3 years ago

Hello, thanks for the big v070 update; it makes certain things easier to use. However, I find that the visual landmarks have a great deal more latency than before on Android, even when setting the camera input resolution down to 640x480. Is there something one can do about this? (It also seems to be less smooth on PC than it used to be.)

homuler commented 3 years ago

I don't know what exactly is going on, so can you explain it more concretely by using a video or something to compare the differences before and after the update?

Also, could you please check the behavior in synchronous mode, since it is designed to work asynchronously by default?

[Screenshot: Screenshot_20211001-162730]

philiprkk commented 3 years ago

Here are two recordings of what I mean. I should have mentioned I have only tried with pose tracking specifically.

https://drive.google.com/file/d/1YAVvyYCA6RR9s1eTq2duCHUEbyL3gIPt/view?usp=sharing
https://drive.google.com/file/d/1YNlDJ0wK1b7mwBQnjZO4S3IYLZf660b0/view?usp=sharing

homuler commented 3 years ago

I see. I guess it's because WebCamTexture's contents are not copied, but I'll look into it later. https://github.com/homuler/MediaPipeUnityPlugin/blob/75e28732d8630ece85e1b54316e537bd2ae9b320/Assets/Mediapipe/Samples/Scenes/Pose%20Tracking/PoseTrackingSolution.cs#L72

philiprkk commented 3 years ago

Thanks! Looking forward to it 👍

ROBYER1 commented 3 years ago

> I see. I guess it's because WebCamTexture's contents are not copied, but I'll look into it later.
>
> https://github.com/homuler/MediaPipeUnityPlugin/blob/75e28732d8630ece85e1b54316e537bd2ae9b320/Assets/Mediapipe/Samples/Scenes/Pose%20Tracking/PoseTrackingSolution.cs#L72

Noticing the same behaviour here in the Editor and on Android. Is there anything we can do on our end to fix this? The same latency is noticeable in the Editor on Windows, whereas before the on-screen points were much more responsive.

ROBYER1 commented 3 years ago

> I see. I guess it's because WebCamTexture's contents are not copied, but I'll look into it later.
>
> https://github.com/homuler/MediaPipeUnityPlugin/blob/75e28732d8630ece85e1b54316e537bd2ae9b320/Assets/Mediapipe/Samples/Scenes/Pose%20Tracking/PoseTrackingSolution.cs#L72

I have forked this repo as I am hoping to help out more with this development. @homuler, if you are quite busy currently, are there any tips or pointers on how to improve this? I will look into it this week and next week to improve the latency.

homuler commented 3 years ago

I haven't investigated the issue yet, but I'm considering two possible causes.

  1. The camera image has advanced while the main thread is being blocked. https://github.com/homuler/MediaPipeUnityPlugin/blob/2994e4425d347260006e15edc8d375e0d35a3b0f/Assets/Mediapipe/Samples/Scenes/Face%20Detection/FaceDetectionSolution.cs#L117-L127

    The image is copied at L118 and sent to MediaPipe at L120. In synchronous mode, L125 will block the main thread, but if the screen image (i.e. screen.texture) is updated nevertheless (it's on the GPU, so I think it can be), it will have moved forward a few frames by the time detections is set.

    If this is the case, we need to copy the camera image on the CPU and set it to screen.texture in a loop.

  2. It actually takes a few loops to get the result (e.g. detections in the above code). In synchronous mode, L125 will block the main thread, but because observeTimestampBounds is set to true, an empty packet can be returned in a loop (this does not happen in most cases, though). https://github.com/homuler/MediaPipeUnityPlugin/blob/2994e4425d347260006e15edc8d375e0d35a3b0f/Assets/Mediapipe/Samples/Scenes/Face%20Detection/FaceDetectionGraph.cs#L41

    https://github.com/homuler/MediaPipeUnityPlugin/blob/2994e4425d347260006e15edc8d375e0d35a3b0f/Packages/com.github.homuler.mediapipe/Runtime/Scripts/Framework/CalculatorGraph.cs#L74

    If this is the case, we may need to change graph configs to use PacketPresenceCalculator as before.

Note that the delay in asynchronous mode is by design, so it is a problem only in synchronous mode.
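
To make the first hypothesis concrete, here is a minimal sketch of "copy the camera image on the CPU and set it to screen.texture in a loop". It uses only Unity's built-in `WebCamTexture`, `Texture2D` and `RawImage` APIs and does not touch the plugin's own classes. The component name `FrozenFrameScreen` and the `RunInferenceBlocking` placeholder are hypothetical and exist only for illustration; this is not the code from the samples, and it ignores camera rotation and mirroring.

```csharp
using System.Collections;
using UnityEngine;
using UnityEngine.UI;

// Hypothetical component (not part of the plugin): displays a CPU-side snapshot of
// the camera instead of the live WebCamTexture, so the displayed frame cannot
// advance while the main thread is blocked waiting for a result.
public class FrozenFrameScreen : MonoBehaviour
{
  [SerializeField] private RawImage _screen;

  private WebCamTexture _webCamTexture;
  private Texture2D _snapshot;
  private Color32[] _pixels;

  private IEnumerator Start()
  {
    _webCamTexture = new WebCamTexture();
    _webCamTexture.Play();

    // On some platforms WebCamTexture reports a small placeholder size until the
    // first real frame arrives, so wait before allocating buffers.
    yield return new WaitUntil(() => _webCamTexture.width > 16);

    _snapshot = new Texture2D(_webCamTexture.width, _webCamTexture.height, TextureFormat.RGBA32, false);
    _pixels = new Color32[_webCamTexture.width * _webCamTexture.height];
    _screen.texture = _snapshot;

    while (true)
    {
      // 1. Copy the current camera frame into CPU memory.
      _webCamTexture.GetPixels32(_pixels);

      // 2. Placeholder for the blocking (synchronous) inference call.
      RunInferenceBlocking(_pixels);

      // 3. Display exactly the pixels that were analysed, not a newer camera frame.
      _snapshot.SetPixels32(_pixels);
      _snapshot.Apply();

      yield return new WaitForEndOfFrame();
    }
  }

  // Assumption: stands in for sending the frame to the graph and waiting for output.
  private void RunInferenceBlocking(Color32[] pixels) { }

  private void OnDestroy()
  {
    if (_webCamTexture != null) { _webCamTexture.Stop(); }
  }
}
```

The trade-off is that the screen only updates once per completed inference, so the displayed framerate drops to the inference rate in exchange for the image and the annotations staying aligned.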

ROBYER1 commented 3 years ago

Thanks @homuler. This affects the Pose graph too, so I expect that if we find a fix, it will apply across the several modules that experience this latency in async mode; we aren't using sync mode. It may be worth renaming this issue to "Graph latency", as the latency is happening on every platform I tested this week: iOS, Android, Windows and macOS.

ROBYER1 commented 3 years ago

> Here are two recordings of what I mean. I should have mentioned I have only tried with pose tracking specifically.
>
> https://drive.google.com/file/d/1YAVvyYCA6RR9s1eTq2duCHUEbyL3gIPt/view?usp=sharing
> https://drive.google.com/file/d/1YNlDJ0wK1b7mwBQnjZO4S3IYLZf660b0/view?usp=sharing

I will do a bit more testing with the async/sync modes on mobile, but I guess it makes sense that the async mode latency is expected, given the nature of the setup there. @philiprkk, did you notice the same issues when using sync mode rather than async?

philiprkk commented 3 years ago

> > Here are two recordings of what I mean. I should have mentioned I have only tried with pose tracking specifically.
> >
> > https://drive.google.com/file/d/1YAVvyYCA6RR9s1eTq2duCHUEbyL3gIPt/view?usp=sharing
> > https://drive.google.com/file/d/1YNlDJ0wK1b7mwBQnjZO4S3IYLZf660b0/view?usp=sharing
>
> I will do a bit more testing with the async/sync modes on mobile, but I guess it makes sense that the async mode latency is expected, given the nature of the setup there. @philiprkk, did you notice the same issues when using sync mode rather than async?

In the second half of the video with the latest plugin, I switched to sync mode, and it still had noticeable latency. The issues I described happened in sync mode.

ROBYER1 commented 3 years ago

Yep, I noticed it happens in sync mode too. Even allowing for sync mode having to hold up the main thread, the latency is much worse than before: in the original sample demo, both the speed of pose detection and the responsiveness of the camera feed appeared to be much better.

mgarbade commented 3 years ago

I've also noticed and even measured the new delay in milliseconds:

| device | master-commit or tag | latency camera [ms] | latency skeleton drawing [ms] | total latency [ms] |
| --- | --- | --- | --- | --- |
| Galaxy Tab S7 | v070 | 105 ± 20 | 102 ± 28.4 | 207 |
| Galaxy Tab S7 | (Sep 7, 2021) 92dd555 | 110 ± 10 | 23 ± 15.4 | 133 |
| Galaxy Tab S6 Lite | v070 | 196 ± 31 | 315 ± 24.1 | 511 |
| Galaxy Tab S6 Lite | (Sep 7, 2021) 92dd555 | 285 ± 17 | 107 ± 21.4 | 392 |

This was done using a prerecorded video with well-defined motion changes. I did not use a high-speed camera, though, so the values might be a bit off or affected by video compression.
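
As a quick reading aid, the per-device change between the two builds can be worked out from the mean values in the table. The sketch below is a plain C# console program (no Unity needed). The type and field names are made up for illustration, and the ± spreads are ignored, so treat the differences as rough indications only.

```csharp
using System;

public static class LatencyComparison
{
  // Mean latencies in ms, copied from the table above (± spreads omitted).
  // "New" is v070, "Old" is commit 92dd555.
  private static readonly (string Device, int CameraNew, int SkeletonNew, int CameraOld, int SkeletonOld)[] Rows =
  {
    ("Galaxy Tab S7", 105, 102, 110, 23),
    ("Galaxy Tab S6 Lite", 196, 315, 285, 107),
  };

  public static void Main()
  {
    foreach (var row in Rows)
    {
      int skeletonDelta = row.SkeletonNew - row.SkeletonOld;
      int cameraDelta = row.CameraNew - row.CameraOld;
      int totalDelta = (row.CameraNew + row.SkeletonNew) - (row.CameraOld + row.SkeletonOld);

      Console.WriteLine($"{row.Device}: skeleton {skeletonDelta} ms, camera {cameraDelta} ms, total {totalDelta} ms");
    }
  }
}
```

On these numbers the skeleton drawing latency rises by 79 ms on the Tab S7 and by 208 ms on the Tab S6 Lite, while the camera latency is slightly lower in v070 on both devices, so the regression sits in the annotation path rather than the camera feed.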

ROBYER1 commented 3 years ago

Great investigation, thanks for the results. Can you confirm these results were from async mode? Async mode will definitely have some latency; sync mode, however, is where we should be looking to improve latency, as sync should keep the camera image in sync with the MediaPipe detections, unless something else in the graph is affecting performance.

I am trying to look into this issue with homuler, as it is blocking us from moving from MediaPipe 0.8.4 with the old samples to at least MediaPipe 0.8.8 with the new sample app (a major bugfix landed in MediaPipe 0.8.8 that fixed an intermittent crash when changing cameras or starting the pose graph).

Any help with looking into this would be appreciated. I have a fork of the repo here where I can invite you to collaborate; feel free to make a new branch to play with a potential fix - https://github.com/ROBYER1/MediaPipeUnityPlugin

philiprkk commented 3 years ago

On my Xperia 5 II I also noticed a similar difference: ~100 ms skeleton drawing latency vs ~20 ms before. This is all in sync mode, I believe.

ROBYER1 commented 3 years ago

> I haven't investigated the issue yet, but I'm considering two possible causes.
>
>   1. The camera image has advanced while the main thread is being blocked. https://github.com/homuler/MediaPipeUnityPlugin/blob/2994e4425d347260006e15edc8d375e0d35a3b0f/Assets/Mediapipe/Samples/Scenes/Face%20Detection/FaceDetectionSolution.cs#L117-L127
>
>     The image is copied at L118 and sent to MediaPipe at L120. In synchronous mode, L125 will block the main thread, but if the screen image (i.e. screen.texture) is updated nevertheless (it's on the GPU, so I think it can be), it will have moved forward a few frames by the time detections is set. If this is the case, we need to copy the camera image on the CPU and set it to screen.texture in a loop.
>
>   2. It actually takes a few loops to get the result (e.g. detections in the above code). In synchronous mode, L125 will block the main thread, but because observeTimestampBounds is set to true, an empty packet can be returned in a loop (this does not happen in most cases, though). https://github.com/homuler/MediaPipeUnityPlugin/blob/2994e4425d347260006e15edc8d375e0d35a3b0f/Assets/Mediapipe/Samples/Scenes/Face%20Detection/FaceDetectionGraph.cs#L41
>
>     https://github.com/homuler/MediaPipeUnityPlugin/blob/2994e4425d347260006e15edc8d375e0d35a3b0f/Packages/com.github.homuler.mediapipe/Runtime/Scripts/Framework/CalculatorGraph.cs#L74
>
>     If this is the case, we may need to change graph configs to use PacketPresenceCalculator as before.
>
> Note that the delay in asynchronous mode is by design, so it is a problem only in synchronous mode.

Testing here on an Xperia 1 and an iPhone 6S: same behaviour on both as in the comment above.

I tried to read the image from the CPU in a loop into a RenderTexture in the Editor using the texture frame, but I hit a few brick walls with that.

I am now looking into whether I can change the graph configs back to PacketPresenceCalculator; however, the new sample code is very different from the old sample code, so I'm a bit stuck. Is there any example of this working in the new sample app that I can compare against?

ROBYER1 commented 3 years ago

Still having no luck looking into this. For an internal project I am still using the old sample app + MediaPipe 0.8.4 for this reason; unfortunately, this also means that the app experiences the occasional crash due to some values being NaN, which was fixed later in MediaPipe 0.8.8. Hoping there can be a fix for the latency in the newer sample app 👨‍🔧

homuler commented 3 years ago

@philiprkk @ROBYER1 Does the following patch improve the problem? (please ignore any irrelevant issues) This patch changes the behavior of Holistic in sync mode.

```diff
diff --git a/Assets/Mediapipe/Samples/Scenes/Holistic/HolisticTrackingSolution.cs b/Assets/Mediapipe/Samples/Scenes/Holistic/HolisticTrackingSolution.cs
index e8e821c..8ddf6cd 100644
--- a/Assets/Mediapipe/Samples/Scenes/Holistic/HolisticTrackingSolution.cs
+++ b/Assets/Mediapipe/Samples/Scenes/Holistic/HolisticTrackingSolution.cs
@@ -92,7 +92,11 @@ namespace Mediapipe.Unity.Holistic
       }
       // NOTE: The _screen will be resized later, keeping the aspect ratio.
       SetupScreen(_screen, imageSource);
-      _screen.texture = imageSource.GetCurrentTexture();
+
+      _screen.texture = runningMode == RunningMode.Async ?
+        imageSource.GetCurrentTexture() :
+        new Texture2D(imageSource.textureWidth, imageSource.textureHeight, TextureFormat.RGBA32, false);
+
       _worldAnnotationArea.localEulerAngles = imageSource.rotation.Reverse().GetEulerAngles();

       Logger.LogInfo(TAG, $"Model Complexity = {modelComplexity}");
@@ -154,6 +158,9 @@ namespace Mediapipe.Unity.Holistic
           _holisticAnnotationController.DrawNow(value.faceLandmarks, value.poseLandmarks, value.leftHandLandmarks, value.rightHandLandmarks);
           _poseWorldLandmarksAnnotationController.DrawNow(value.poseWorldLandmarks);
           _poseRoiAnnotationController.DrawNow(value.poseRoi);
+
+          // TODO: copy texture before `textureFrame` is released
+          textureFrame.CopyTexture(_screen.texture);
         }

         yield return new WaitForEndOfFrame();
```

ROBYER1 commented 3 years ago

I applied this to PoseTrackingSolution.cs, which we are using, and it now keeps the webcam and pose points in sync, thank you! The only thing we notice now when comparing the old sample app with the new one is that the actual framerate of the webcam + pose model is a lot lower than before, which creates perceived latency compared with the old sample app. Is there anything you would suggest we can do to bring the speed of the camera back to the old sample app speeds, or is MediaPipe 0.8.6 onwards just slower to run?

I would also suggest adding yields around these, as sometimes in the Editor, when the camera starts, the image would still be out of sync if it doesn't yield:

```csharp
SetupScreen(_screen, imageSource);
yield return null;

//_screen.texture = imageSource.GetCurrentTexture();
_screen.texture = runningMode == RunningMode.Async ? imageSource.GetCurrentTexture() : new Texture2D(imageSource.textureWidth, imageSource.textureHeight, TextureFormat.RGBA32, false);
yield return null;
```

homuler commented 3 years ago

First, let me clarify that the sample code and the plugin code are independent. Although you need to modify some code to resolve compile/runtime errors (some APIs have changed), you can upgrade the plugin version while continuing to use the old sample code.

> Is there anything you would suggest we can do to bring the speed of the camera back to the old sample app speeds, or is MediaPipe 0.8.6 onwards just slower to run?

I can't see any obvious performance issues on the device I have, so if you can profile your app and tell me where it's slow, I'll check. However, note that the code above is intentionally written in a way that performs slowly.

Before profiling, there are two things that I can say:

  1. If you're using the full or heavy model, switch to the lite model.
  2. By default, the input image of the new sample has about 3 times as many pixels as that of the old sample (i.e. 1280x720 vs 640x480). Reducing the input image size may improve the performance.
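
For reference, the pixel-count ratio between the two default resolutions quoted above can be checked directly. This is a trivial standalone C# console sketch; the class name is arbitrary.

```csharp
using System;

public static class PixelCount
{
  public static void Main()
  {
    int newPixels = 1280 * 720; // 921,600 pixels per frame in the new sample
    int oldPixels = 640 * 480;  // 307,200 pixels per frame in the old sample

    // Ratio of per-frame pixel counts between the two defaults.
    Console.WriteLine((double)newPixels / oldPixels); // prints 3
  }
}
```

So for these two resolutions each frame carries exactly three times as many pixels to copy and preprocess, which is why lowering the input size is a sensible first step before profiling.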

> I would also suggest adding yields around these, as sometimes in the Editor, when the camera starts, the image would still be out of sync if it doesn't yield:

Could you explain, in theory, what this code fixes and how?

ROBYER1 commented 3 years ago

Thanks for the response here, @homuler. After further investigation, I believe your fix worked great and the performance in sync mode is now as expected. I found we had a Unity Editor bug which was causing the webcam to be out of sync on start, and those yields in fact did nothing once we resolved the Editor version issue. I am not too sure why it worked, but it had something to do with the initialization of the webcam device in the Editor only.

Please do merge the sync mode fix when you are happy with it, as this should solve the latency issue for Pose and Holistic if included in the pose and holistic solution scripts, plus any others that are affected by this.

Thanks again for your hard work. I have been testing in the Editor on Windows/Mac and in builds on Android/Windows/iOS, so apologies for any mix-up on the performance, as the Editor is a rule unto itself sometimes, it seems!

homuler commented 3 years ago

> I found we had a Unity Editor bug which was causing the webcam to be out of sync on start, and those yields in fact did nothing once we resolved the Editor version issue. I am not too sure why it worked, but it had something to do with the initialization of the webcam device in the Editor only.

I think I can confirm a similar phenomenon on Android, but at least there it is not fixed by yield return null, so I'll leave it as is. If the yield really does fix the problem, then I guess we actually have to wait for something, but I don't know what it is.

ROBYER1 commented 2 years ago

Another interesting finding (this is very Windows-laptop-specific): using the new sample app on an HP ZBook laptop, if the laptop is unplugged from power, the camera throttles its framerate. If the laptop is plugged in, it is fine. With the older sample app this doesn't occur. I haven't had time to investigate, as I just keep the laptop plugged in, but it is strange that it only happens with the newer MediaPipe + sample app; it is probably something on the Windows compiled side in MediaPipe 0.8.8.

Thanks again for fixing this!

homuler commented 2 years ago

> probably something on the Windows compiled side in MediaPipe 0.8.8.

I don't think so. If that were the case, you could confirm it by using the new version with the old sample code (https://github.com/homuler/MediaPipeUnityPlugin/issues/302#issuecomment-979645232), but I think it's rather due to an implementation difference on the Unity side. With the previous app, the webcam image was copied to a Texture every time inference was done, so although the performance was terrible, it was perfectly in sync with the annotation.

If it is a serious problem, please share a video or something similar that shows the situation.