google-ai-edge / mediapipe

Cross-platform, customizable ML solutions for live and streaming media.
https://ai.google.dev/edge/mediapipe
Apache License 2.0

mediapipe not GPU accelerated #5742

Open jpitalopez opened 3 hours ago

jpitalopez commented 3 hours ago

Have I written custom code (as opposed to using a stock example script provided in MediaPipe)

None

OS Platform and Distribution

Linux Ubuntu

MediaPipe Tasks SDK version

0.10.15

Task name (e.g. Image classification, Gesture recognition etc.)

Face detection

Programming Language and version (e.g. C++, Python, Java)

Python

Describe the actual behavior

Executes on CPU

Describe the expected behaviour

Executes on GPU

Standalone code/steps you may have used to try to get what you need

I just installed mediapipe with: $ pip install mediapipe. I already installed CUDA, PyTorch and TensorFlow.

Other info / Complete Logs

2024-11-19 11:01:06.965759: I tensorflow/core/util/port.cc:153] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2024-11-19 11:01:06.977470: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:485] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-11-19 11:01:06.989562: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:8454] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-11-19 11:01:06.992947: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1452] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2024-11-19 11:01:07.002094: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX512F AVX512_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-11-19 11:01:07.544102: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
I0000 00:00:1732010468.689813   20482 gl_context_egl.cc:85] Successfully initialized EGL. Major : 1 Minor: 5
I0000 00:00:1732010468.691421   20554 gl_context.cc:357] GL version: 3.2 (OpenGL ES 3.2 Mesa 24.0.9-0ubuntu0.2), renderer: Mesa Intel(R) UHD Graphics (TGL GT1)
INFO: Created TensorFlow Lite XNNPACK delegate for CPU.

kuaashish commented 3 hours ago

Hi @jpitalopez,

To enable GPU acceleration, please refer to this example notebook and update the code as shown below to delegate processing to the GPU:

base_options = mp.tasks.BaseOptions(
    model_asset_path='detector.tflite',
    delegate=mp.tasks.BaseOptions.Delegate.GPU
)
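For context, here is a minimal sketch of how those base options might be wired into a face detector with the MediaPipe Tasks Python API. The helper name `build_face_detector`, the `use_gpu` flag, and the file names `detector.tflite` and `image.jpg` are placeholders for illustration, not something taken from the notebook.

```python
def build_face_detector(model_path="detector.tflite", use_gpu=True):
    """Create a MediaPipe FaceDetector, optionally requesting the GPU delegate."""
    # Imported lazily so the helper can be defined without mediapipe installed.
    import mediapipe as mp

    BaseOptions = mp.tasks.BaseOptions
    # Delegate.GPU asks TFLite for the GPU delegate; Delegate.CPU keeps XNNPACK.
    delegate = BaseOptions.Delegate.GPU if use_gpu else BaseOptions.Delegate.CPU
    options = mp.tasks.vision.FaceDetectorOptions(
        base_options=BaseOptions(model_asset_path=model_path, delegate=delegate)
    )
    return mp.tasks.vision.FaceDetector.create_from_options(options)


if __name__ == "__main__":
    import mediapipe as mp

    detector = build_face_detector()  # assumes detector.tflite is in the working directory
    image = mp.Image.create_from_file("image.jpg")  # hypothetical input image
    result = detector.detect(image)
    print(len(result.detections), "face(s) detected")
```

Requesting the GPU delegate only selects the GPU code path; which physical device ends up rendering is decided by the OpenGL ES/EGL context that the log reports.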

Kindly let us know if you can now run the sample on the GPU successfully.

Thank you!!

jpitalopez commented 1 hour ago

2024-11-19 13:03:10.441315: I tensorflow/core/util/port.cc:153] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable TF_ENABLE_ONEDNN_OPTS=0.
2024-11-19 13:03:10.451005: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:485] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-11-19 13:03:10.463976: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:8454] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-11-19 13:03:10.467295: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1452] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2024-11-19 13:03:10.476254: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX512F AVX512_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-11-19 13:03:11.010236: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
I0000 00:00:1732017791.975393 26675 gl_context_egl.cc:85] Successfully initialized EGL. Major : 1 Minor: 5
I0000 00:00:1732017791.977027 26768 gl_context.cc:357] GL version: 3.2 (OpenGL ES 3.2 Mesa 24.0.9-0ubuntu0.2), renderer: Mesa Intel(R) UHD Graphics (TGL GT1)
INFO: Created TensorFlow Lite delegate for GPU.


The output is shown above. It is good that it now creates a TensorFlow Lite delegate for GPU, but the device it is using appears to be the processor's integrated graphics (the renderer is reported as Mesa Intel(R) UHD Graphics) rather than the dedicated graphics card. I have an NVIDIA RTX 3050.
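To make that check repeatable, a small sketch like the following could pull the two relevant facts out of the MediaPipe log: the OpenGL renderer string and the TensorFlow Lite delegate that was created. The function name `summarize_log` and the regular expressions are assumptions based only on the log lines quoted in this issue; other MediaPipe versions may word these messages differently.

```python
import re

# The last three lines of the log posted above.
SAMPLE_LOG = """\
I0000 00:00:1732017791.975393 26675 gl_context_egl.cc:85] Successfully initialized EGL. Major : 1 Minor: 5
I0000 00:00:1732017791.977027 26768 gl_context.cc:357] GL version: 3.2 (OpenGL ES 3.2 Mesa 24.0.9-0ubuntu0.2), renderer: Mesa Intel(R) UHD Graphics (TGL GT1)
INFO: Created TensorFlow Lite delegate for GPU.
"""

RENDERER_RE = re.compile(r"renderer:\s*(.+)$")
# Matches both "... Lite XNNPACK delegate for CPU" and "... Lite delegate for GPU".
DELEGATE_RE = re.compile(r"Created TensorFlow Lite (?:\w+ )?delegate for (\w+)")


def summarize_log(text):
    """Return the last reported GL renderer and TFLite delegate target, if any."""
    renderer = None
    delegate = None
    for line in text.splitlines():
        match = RENDERER_RE.search(line)
        if match:
            renderer = match.group(1).strip()
        match = DELEGATE_RE.search(line)
        if match:
            delegate = match.group(1)
    return {"renderer": renderer, "delegate": delegate}


if __name__ == "__main__":
    info = summarize_log(SAMPLE_LOG)
    print(info)
    # A simple substring test; assumes NVIDIA drivers put "NVIDIA" in the renderer string.
    print("NVIDIA renderer:", "nvidia" in (info["renderer"] or "").lower())
```

On the log above this reports the GPU delegate together with the Intel renderer, which matches the observation that the GL context was created on the integrated graphics.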