Paxios opened 2 years ago
Please don't leave the Code to Reproduce the issue field blank.
My goal is to skip copying data from GPU to CPU and then passing it back to GPU (MP). It is a huge bottleneck on older devices.
DemoGraph is a very old implementation. See https://github.com/homuler/MediaPipeUnityPlugin/issues/435#issuecomment-1022752459 instead.
I looked around the repo, but I couldn't figure out whether there is support for the Metal API. If not, is it planned to be added in the future?
What kind of support do you expect? At least, you can use it as the graphics API.
Sorry for the late response, I was trying to implement your suggestion.
In the Estimator class, I added 2 logs, so it's visible which part of the code still gets executed and which does not. There is no exception thrown or anything. I never receive a result from the graph. Is there something I'm doing wrong?
Below is the relevant code: CameraManager:
var selectedWebCam = WebCamTexture.devices[0];
WebCamTexture = new WebCamTexture(selectedWebCam.name, requestedHeight: 160, requestedWidth: 160);
WebCamTexture.Play();
EstimationManager (MonoBehaviour):
private void Update(){
Estimator.MakeEstimation(CameraManager.WebCamTexture.width, CameraManager.WebCamTexture.height, CameraManager.WebCamTexture);
}
Estimator:
public void MakeEstimation(int width, int height, WebCamTexture texture){
if (texture == null || width < 100)
return;
TextureFramePool.ResizeTexture(width, height, TextureFormat.RGBA32);
if (!TextureFramePool.TryGetTextureFrame(out var textureFrame))
return;
textureFrame.ReadTextureFromOnGPU(texture);
//I also tried with TextureFrame#ReadTextureFromOnCPU(texture)
var gpuBuffer = textureFrame.BuildGpuBuffer(GpuManager.GlCalculatorHelper.GetGlContext());
Debug.Log("This is logged");
Graph.AddPacketToInputStream("input_video", new GpuBufferPacket(gpuBuffer, new Timestamp(currentMicroSeconds))).AssertOk();
if (_outputLandmarksStream.TryGetNext(out var landmarkList)) {
Debug.Log("This is NOT logged");
[...]
}
}
Graph that I use:
input_stream: "input_video"
output_stream: "pose_landmarks"
output_stream: "pose_world_landmarks"
node {
calculator: "FlowLimiterCalculator"
input_stream: "input_video"
input_stream: "FINISHED:pose_landmarks"
input_stream_info: {
tag_index: "FINISHED"
back_edge: true
}
output_stream: "throttled_input_video"
}
node: {
calculator: "ImageTransformationCalculator"
input_stream: "IMAGE_GPU:throttled_input_video"
input_side_packet: "ROTATION_DEGREES:input_rotation"
input_side_packet: "FLIP_HORIZONTALLY:input_horizontally_flipped"
input_side_packet: "FLIP_VERTICALLY:input_vertically_flipped"
output_stream: "IMAGE_GPU:transformed_input_video"
}
node {
calculator: "PoseLandmarkGpu"
input_stream: "IMAGE:transformed_input_video"
input_side_packet: "MODEL_COMPLEXITY:model_complexity"
input_side_packet: "SMOOTH_LANDMARKS:smooth_landmarks"
input_side_packet: "ENABLE_SEGMENTATION:enable_segmentation"
input_side_packet: "SMOOTH_SEGMENTATION:smooth_segmentation"
output_stream: "LANDMARKS:pose_landmarks"
output_stream: "WORLD_LANDMARKS:pose_world_landmarks"
}
If I use TextureFromCamera.SetPixels32(WebCamTexture.GetPixels32()); and create a new ImageFrame from it with
var imageFrame = new ImageFrame(ImageFormat.Types.Format.Srgba, TextureFromCamera.width, TextureFromCamera.height, TextureFromCamera.width * 4, TextureFromCamera.GetRawTextureData<byte>());
and then pass this ImageFrame to the graph, it works (obviously I change the graph to expect an ImageFrame instead of a GpuBuffer).
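For reference, the full CPU path is roughly the following (a sketch reusing the Graph, TextureFromCamera and currentMicroSeconds members from the Estimator above, and assuming the plugin's ImageFramePacket type for ImageFrame inputs):
// CPU path sketch: copy the camera pixels into a Texture2D, wrap the raw bytes in an
// ImageFrame and feed it to the graph as an ImageFramePacket.
TextureFromCamera.SetPixels32(CameraManager.WebCamTexture.GetPixels32());
TextureFromCamera.Apply();

var imageFrame = new ImageFrame(
    ImageFormat.Types.Format.Srgba,
    TextureFromCamera.width, TextureFromCamera.height,
    TextureFromCamera.width * 4,
    TextureFromCamera.GetRawTextureData<byte>());

Graph.AddPacketToInputStream(
    "input_video",
    new ImageFramePacket(imageFrame, new Timestamp(currentMicroSeconds))).AssertOk();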
For Metal support, I meant whether it's possible to pass Metal's pointer to MediaPipe like we're doing with GLES. The reason for this is that we do not want to pass the texture from GPU to CPU and back to GPU.
textureFrame.ReadTextureFromOnGPU(texture);
What is the return value? (If it returned false, it means it failed.)
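For example, a quick check along these lines (a sketch using the same fields as the Estimator above) makes a failure visible in the log:
// Log the result of the GPU copy so a silent failure shows up in the log.
var copied = textureFrame.ReadTextureFromOnGPU(texture);
Debug.Log($"ReadTextureFromOnGPU returned: {copied}");
if (!copied)
    return;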
For Metal support, I meant whether it's possible to pass Metal's pointer to MediaPipe like we're doing with GLES. The reason for this is that we do not want to pass the texture from GPU to CPU and back to GPU.
I am aware that this problem exists, and I'd implement the feature if I had unlimited time, but it's not really a high priority because I'm not sure it really improves the plugin's performance (if the sample app runs at 60fps and the inference step takes less than 1/60 sec, it may not make much difference, if any). If you can demonstrate that there's really a performance hit in that area (e.g. it performs worse than the official iOS sample app), the priority will become higher.
textureFrame.ReadTextureFromOnGPU(texture); returns true.
I haven't yet tried the app on newer iOS devices, so it's possible that Metal support for this won't be possible as you said 😄.
Do you maybe have/use some community channel, like a Discord group?
textureFrame.ReadTextureFromOnGPU(texture); returns true.
Hmm, I don't know. On my Android device, it certainly worked when I applied the patch below.
diff --git a/Assets/MediaPipeUnity/Samples/Common/Scripts/Solution.cs b/Assets/MediaPipeUnity/Samples/Common/Scripts/Solution.cs
index 813c66a..a6f4322 100644
--- a/Assets/MediaPipeUnity/Samples/Common/Scripts/Solution.cs
+++ b/Assets/MediaPipeUnity/Samples/Common/Scripts/Solution.cs
@@ -76,7 +76,7 @@ namespace Mediapipe.Unity
if (textureType == typeof(WebCamTexture))
{
- textureFrame.ReadTextureFromOnCPU((WebCamTexture)sourceTexture);
+ textureFrame.ReadTextureFromOnGPU((WebCamTexture)sourceTexture);
}
else if (textureType == typeof(Texture2D))
{
The possible reasons I can come up with are:
- Your device's WebCamTexture format is not ARGB32 and the channels of the converted image are not aligned as MediaPipe expects.

so it's possible that Metal support for this won't be possible as you said
To be precise, supporting Metal itself is possible, but it's not a high priority for now.
Do you maybe have/use some community channel, like a Discord group?
No, I don't.
Settings are shown below; should I maybe set any of the ES versions to be required?
I will experiment with the WebCamTexture's format and the GLES context.
By the way, the TextureFormat of the camera is "R8G8B8A8_UNorm", so I guess this could be the problem, since it's RGBA32 instead of ARGB32?
Also a note that ReadTextureFromOnCPU doesn't work either, so I guess there's something wrong with my implementation 😄
Settings are shown below; should I maybe set any of the ES versions to be required?
OpenGL ES 3.2 is required to share the context with MediaPipe (that's why even ReadTextureFromOnCPU doesn't work).
I strongly recommend you first modify and test the sample app before writing your own code.
Hey there once more,
I looked into the code more deeply than before and I can't find any usage of ReadTextureFromOnGPU in the sample app. Is it because of the high latency you mentioned in the comment?
If I switch from ReadTextureFromOnCPU to ReadTextureFromOnGPU in the Solution class, no estimations are returned from the graph (as it happens in my app).
I would very much appreciate it if you could please verify that ReadTextureFromOnGPU works on your side.
I tested it on 2 GPUs, ARM Mali-G71 MP20 and Xclipse 920, which both support GL ES 3.2.
I would very much appreciate it if you could please verify that ReadTextureFromOnGPU works on your side.
I confirmed (see https://github.com/homuler/MediaPipeUnityPlugin/issues/768#issuecomment-1279975261).
I think you should display the target texture (after calling ReadTextureFromOnGPU) on the screen first.
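For example, something like this (a sketch; rawImage and webCamTexture are placeholders, and it assumes the TextureFrame's backing _texture field is accessible):
// Visual sanity check: after the GPU copy, point a RawImage at the TextureFrame's
// backing texture to confirm the pixels were actually copied.
textureFrame.ReadTextureFromOnGPU(webCamTexture);
rawImage.texture = textureFrame._texture;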
I would very much appreciate it if you could please verify that ReadTextureFromOnGPU works on your side.
I confirmed (see #768 (comment)).
Mind if I ask which device you tried it on, or which GPU it uses?
Pixel 6. At any rate, I think you should check if the pixel data is actually copied on GPU (cf. https://github.com/homuler/MediaPipeUnityPlugin/issues/768#issuecomment-1282360996).
Sorry for the late response.
textureFrame.ReadTextureFromOnGPU(texture);
texture2DToDisplay.SetPixels32(textureFrame.GetPixels32());
texture2DToDisplay.Apply();
RawImage.texture = texture2DToDisplay;
texture is the WebCamTexture. This does actually display the correct image in the RawImage on the screen.
The following is the process of applying the TextureFrame to the graph.
var gpuBuffer = textureFrame.BuildGpuBuffer(GpuManager.GlCalculatorHelper.GetGlContext());
_graph.AddPacketToInputStream("input_video", new GpuBufferPacket(gpuBuffer, new Timestamp(currentMicroSeconds))).AssertOk();
Output processing:
_outputLandmarksStream = new OutputStream<LandmarkListPacket, LandmarkList>(_graph, OutputNodeName);
_outputLandmarksStream.AddListener(OnPoseLandmarksOutput);
private void OnPoseWorldLandmarksOutput(object stream, OutputEventArgs<LandmarkList> eventArgs) {
if (eventArgs.value != null) {
...
I use eventArgs.value in here
...
}
}
This code works if I change ReadTextureFromOnGPU to ReadTextureFromOnCPU.
Which log is output on your device? (maybe you need to build your apk with Development Build checked).
https://github.com/homuler/MediaPipeUnityPlugin/blob/391d7d98b127ce41ceac85ec47f6126664f1bc4e/Packages/com.github.homuler.mediapipe/Runtime/Scripts/Unity/GpuManager.cs#L78-L87
This does actually display the correct image in the RawImage on the screen.
Hmm, interesting. I guess this code wouldn't work on my Pixel 6; rather, I'd need to do RawImage.texture = textureFrame._texture.
In general, Graphics.CopyTexture only works on the GPU. When I used Unity 2020.3.x, the data on the CPU had been invalidated after calling Graphics.CopyTexture (cf. https://forum.unity.com/threads/graphics-copytexture-then-getpixels.482601/) (I've not tested it yet with Unity 2021.3.x).
Hey, I just tried modifying your code as follows:
if (textureType == typeof(WebCamTexture))
{
textureFrame.ReadTextureFromOnCPU((WebCamTexture)sourceTexture);
if(textureFrame._texture != null)
rawImage.texture = textureFrame._texture;
}
This does show the camera preview on the screen, but if I use ReadTextureFromOnGPU it doesn't. So I guess there's some problem setting _texture in TextureFrame. ReadTextureFromOnGPU returns true, so I don't know what would cause this.
Which log is output on your device? (maybe you need to build your apk with Development Build checked).
Output is the following: Unity GpuManager: EGL context is found: 511835274624
Some additional information:
I'm using Unity 2022.2.0 (as per your advice in https://github.com/homuler/MediaPipeUnityPlugin/issues/760).
ReadTextureFromOnGPU uses Graphics.CopyTexture.
I checked additional data in ReadTextureFromOnGPU:
srcFormat (WebCamTexture): R8G8B8A8_UNorm
thisFormat (Texture2D): RGBA32
Width & Height match on both
SystemInfo.copyTextureSupport returns: Basic, Copy3D, DifferentTypes, TextureToRT, RTToTexture
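Roughly, the values above can be read like this (a sketch; webCamTexture and texture2D are placeholders for the source WebCamTexture and the TextureFrame's Texture2D):
// Log the pixel layouts on both sides of the Graphics.CopyTexture call, plus the
// platform's copy capabilities.
Debug.Log($"srcFormat (WebCamTexture): {webCamTexture.graphicsFormat}");
Debug.Log($"thisFormat (Texture2D): {texture2D.format}");
Debug.Log($"copyTextureSupport: {SystemInfo.copyTextureSupport}");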
I have also tested your sample app on Pixel 6 with ReadTextureFromOnGPU set and it's the same outcome.
I managed to find out the cause of this issue 😅
https://github.com/homuler/MediaPipeUnityPlugin/blob/6b8c6743f23539f7604e74dc260b01e0f58f1707/Assets/MediaPipeUnity/Samples/Common/Scripts/ImageSourceSolution.cs#L71-L73
I had to change the format of the pool from TextureFormat.RGBA32 to TextureFormat.ARGB32. I think there's a typo in the comment :)
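In terms of the Estimator code earlier in this thread, the change amounts to something like:
// Allocate the pooled TextureFrames as ARGB32 instead of RGBA32 so the channel
// order ends up matching what the GPU path feeds to MediaPipe on this device.
TextureFramePool.ResizeTexture(width, height, TextureFormat.ARGB32);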
Also this works on GLES 3.1, so 3.2 is not mandatory
I had to change the format of the pool from TextureFormat.RGBA32 to TextureFormat.ARGB32. I think there's a typo in the comment :)
So it seems that the cause was:
Your device's WebCamTexture format is not ARGB32 and the channels of the converted image are not aligned as MediaPipe expects.
The following comment is not a typo (cf. https://github.com/google/mediapipe/blob/7a6ae97a0ef298014aaf5e1370cb6f8237f2ac21/mediapipe/gpu/gpu_buffer_format.cc#L64-L78).
// When using GpuBuffer, MediaPipe assumes that the input format is BGRA, so the following code must be fixed.
However, at least in Unity, this assumption does not always hold (the input format can be RGBA or ARGB, etc...). Currently, this issue can be avoided by changing the texture format as you did (but it's not intuitive which format should be used).
Also this works on GLES 3.1, so 3.2 is not mandatory
Indeed, I was wrong about this, and it seems that OpenGL ES 3.2 is not required to create a context. https://github.com/google/mediapipe/blob/7a6ae97a0ef298014aaf5e1370cb6f8237f2ac21/mediapipe/gpu/gl_context_egl.cc#L110-L171
So it seems that the cause was:
Yes, that was the cause :)
The following comment is not a typo
Ah okay 👍🏼, wasn't sure.
Is there any way to not block the CPU while TextureFramePool executes TextureFrame#WaitUntilReleased?
https://github.com/homuler/MediaPipeUnityPlugin/blob/6b8c6743f23539f7604e74dc260b01e0f58f1707/Assets/MediaPipeUnity/Samples/Common/Scripts/ImageSource/TextureFramePool.cs#L121
https://github.com/homuler/MediaPipeUnityPlugin/blob/6b8c6743f23539f7604e74dc260b01e0f58f1707/Assets/MediaPipeUnity/Samples/Common/Scripts/ImageSource/TextureFrame.cs#L268-L281
Or will this unsync the GPU & CPU and result in uncontrollable crashes?
I'd like to achieve relatively smooth performance on old devices (e.g. Samsung Galaxy J7). Currently this on average causes 90-100 ms of lag.
Currently this on average causes 90-100 ms of lag.
Do you mean _glSyncToken.Wait() takes 90-100ms?
If so, how did you measure it?
Currently this on average causes 90-100 ms of lag.
Do you mean _glSyncToken.Wait() takes 90-100ms? If so, how did you measure it?
Yes, that's correct, measured with Unity deep profiling.
It's far less on modern devices (10-25 ms).
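For a rough check without the deep profiler's overhead, a stopwatch around the blocking wait gives a similar number (a sketch, assuming TextureFrame#WaitUntilReleased is callable from the measuring code):
// Rough timing of the blocking wait on the main thread.
var stopwatch = System.Diagnostics.Stopwatch.StartNew();
textureFrame.WaitUntilReleased();
stopwatch.Stop();
Debug.Log($"WaitUntilReleased took {stopwatch.ElapsedMilliseconds} ms");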
Does changing TextureFramePool._poolSize (e.g. to 20) make any difference?
https://github.com/homuler/MediaPipeUnityPlugin/blob/391d7d98b127ce41ceac85ec47f6126664f1bc4e/Assets/MediaPipeUnity/Samples/Common/Scripts/ImageSource/TextureFramePool.cs#L18
No, not at all.
It just delays it a bit. I once set it to 10000 (and changed the GlobalInstanceTable size) and that delay didn't happen, but the game crashed. I guess not enough resources were available for 10k textures.
I am also seeing very slow results on my device (Android Galaxy 04, built with SDK 28 and minVersion set to OpenGL ES 3.1).
Do you have any idea why the latency is high when the image is copied on the GPU, as stated in your comment?
// For some reason, when the image is copied on GPU, latency tends to be high.
A profiler screenshot from running the Hands tracking sample on Android shows that it does take time to read the image from the GPU.
@tealm I have the same question. Did you get an answer? Thank you for checking my message.
Plugin Version or Commit ID
v0.10.1
Unity Version
2022.2
Your Host OS
macOS Monterey 12.6
Target Platform
Android, iOS
Description
1. Is it possible to use GpuBuffer with WebCamTexture?
I tried creating a GpuBuffer with Texture.GetNativeTexturePtr. It compiled fine and the app never crashed, but TryGetNext is never true; it feels like the result is never provided. There is also no exception thrown; I had Logcat attached to the device with a MediaPipe debug build. I used https://github.com/homuler/MediaPipeUnityPlugin/wiki/API-Overview#gpubuffer , https://github.com/homuler/MediaPipeUnityPlugin/blob/92cad4bbe9ba52514034c52ac5b5f0a99accab06/Assets/Mediapipe/Samples/Scripts/DemoGraph.cs#L99-L112 , https://github.com/homuler/MediaPipeUnityPlugin/blob/18e00a85ac271b123178e593184184e8715ed22e/Assets/MediaPipe/Examples/Scripts/DemoGraph.cs#L54-L63 and https://github.com/homuler/MediaPipeUnityPlugin/issues/13 as references.
I saw that there are different texture formats (SRGBA from WebCamTexture and BGRA for GpuBuffer); could this be the problem? https://github.com/homuler/MediaPipeUnityPlugin/blob/6b8c6743f23539f7604e74dc260b01e0f58f1707/Assets/MediaPipeUnity/Samples/Common/Scripts/ImageSourceSolution.cs#L72
My goal is to skip copying data from GPU to CPU and then passing it back to GPU (MP). It is a huge bottleneck on older devices. I also tried reading the texture of the WebCamTexture from a thread, but it can only be read from the main thread. The graph config I used: https://github.com/homuler/MediaPipeUnityPlugin/blob/6b8c6743f23539f7604e74dc260b01e0f58f1707/Assets/MediaPipeUnity/Samples/Scenes/Pose%20Tracking/pose_tracking_opengles.txt
2. Support for Metal API
I looked around the repo, but I couldn't figure it out, if there is a support for Metal API? If not, is it planned to be added in the future?
Code to Reproduce the issue
No response
Additional Context
No response