google-ai-edge / mediapipe

Cross-platform, customizable ML solutions for live and streaming media.
https://ai.google.dev/edge/mediapipe
Apache License 2.0

Building for Python 3.8 with GPU support for NVIDIA Jetson Orin Nano #5106

Open bigger-py opened 9 months ago

bigger-py commented 9 months ago

OS Platform and Distribution

Linux Ubuntu 20.04 (Jetpack 5.1.1)

Compiler version

No response

Programming Language and version

Python 3.8

Installed using virtualenv? pip? Conda?(if python)

No response

MediaPipe version

0.10.10

Bazel version

7.0.2

XCode and Tulsi versions (if iOS)

No response

Android SDK and NDK versions (if android)

No response

Android AAR (if android)

None

OpenCV version (if running on desktop)

No response

Describe the problem

I am trying to build a Mediapipe wheel for Python that will allow me to use the GPU on my NVIDIA Jetson Orin Nano (8GB). To my knowledge, there is no official documentation explaining this process. From reading around, I've found instructions here: https://github.com/jiuqiant/mediapipe_python_aarch64/blob/main/README.md, though they appear to be outdated. I followed these steps but encountered the issue below, and I'm not sure how to proceed.

Complete Logs

...
WARNING: /home/nvidia/mediapipe/mediapipe/framework/tool/BUILD:200:24: in cc_library rule //mediapipe/framework/tool:field_data_cc_proto: target '//mediapipe/framework/tool:field_data_cc_proto' depends on deprecated target '@com_google_protobuf//:cc_wkt_protos': Only for backward compatibility. Do not use.
WARNING: /home/nvidia/mediapipe/mediapipe/framework/BUILD:58:24: in cc_library rule //mediapipe/framework:calculator_cc_proto: target '//mediapipe/framework:calculator_cc_proto' depends on deprecated target '@com_google_protobuf//:cc_wkt_protos': Only for backward compatibility. Do not use.
ERROR: /home/nvidia/.cache/bazel/_bazel_nvidia/ff4425722229fc486cc849b5677abe3f/external/rules_foreign_cc/toolchains/BUILD.bazel:192:11: in ninja_tool rule @rules_foreign_cc//toolchains:ninja_tool: 
Traceback (most recent call last):
    File "/home/nvidia/.cache/bazel/_bazel_nvidia/ff4425722229fc486cc849b5677abe3f/external/rules_foreign_cc/foreign_cc/built_tools/ninja_build.bzl", line 16, column 30, in _ninja_tool_impl
        additional_tools = depset(
Error in depset: at index 0 of transitive, got element of type NoneType, want depset
ERROR: /home/nvidia/.cache/bazel/_bazel_nvidia/ff4425722229fc486cc849b5677abe3f/external/rules_foreign_cc/toolchains/BUILD.bazel:192:11: Analysis of target '@rules_foreign_cc//toolchains:ninja_tool' failed
INFO: Repository gnumake_src instantiated at:
  /home/nvidia/mediapipe/WORKSPACE:54:30: in <toplevel>
  /home/nvidia/.cache/bazel/_bazel_nvidia/ff4425722229fc486cc849b5677abe3f/external/rules_foreign_cc/foreign_cc/repositories.bzl:65:25: in rules_foreign_cc_dependencies
  /home/nvidia/.cache/bazel/_bazel_nvidia/ff4425722229fc486cc849b5677abe3f/external/rules_foreign_cc/toolchains/built_toolchains.bzl:37:20: in built_toolchains
  /home/nvidia/.cache/bazel/_bazel_nvidia/ff4425722229fc486cc849b5677abe3f/external/rules_foreign_cc/toolchains/built_toolchains.bzl:420:14: in _make_toolchain
  /home/nvidia/.cache/bazel/_bazel_nvidia/ff4425722229fc486cc849b5677abe3f/external/bazel_tools/tools/build_defs/repo/utils.bzl:233:18: in maybe
Repository rule http_archive defined at:
  /home/nvidia/.cache/bazel/_bazel_nvidia/ff4425722229fc486cc849b5677abe3f/external/bazel_tools/tools/build_defs/repo/http.bzl:372:31: in <toplevel>
ERROR: Analysis of target '//mediapipe/modules/face_detection:face_detection_short_range_cpu.binarypb' failed; build aborted: 
INFO: Elapsed time: 1.847s
INFO: 0 processes.
FAILED: Build did NOT complete successfully (142 packages loaded, 2533 targets configured)
    Fetching https://mirror.bazel.build/ftpmirror.gnu.org/gnu/make/make-4.3.tar.gz
Command '['bazel', 'build', '--compilation_mode=opt', '--copt=-DNDEBUG', '--action_env=PYTHON_BIN_PATH=/usr/bin/python3', 'mediapipe/modules/face_detection/face_detection_short_range_cpu.binarypb', '--define=MEDIAPIPE_DISABLE_GPU=1', '--define=OPENCV=source']' returned non-zero exit status 1.
kuaashish commented 9 months ago

Hi @bigger-py,

Currently, MediaPipe does not offer support for NVIDIA Jetson, and there is no available documentation for this platform. The only supported IoT device is the 64-bit Raspberry Pi, as indicated in the documentation found here: https://developers.google.com/mediapipe/solutions/setup_python#supported_platforms_and_versions.

Furthermore, Python GPU support is exclusively provided for macOS and regular Linux, as outlined in the Python update section on the GitHub release page at https://github.com/google/mediapipe/releases/tag/v0.10.8.

Unfortunately, we are unable to offer support for this issue at present. However, we are designating it as a feature request and sharing it with our team; its prioritization will be determined through team discussions.

Thank you!!

bigger-py commented 9 months ago

Hi,

That's disappointing to hear.

After more digging I made some more progress and managed to get it built and working using the pose landmarker model (BlazePose) but have not yet been able to get it executing on the tflite GPU delegate (just the CPU one).

I will continue digging, but any unofficial support would be greatly appreciated!

724877239 commented 8 months ago

Hi,

That's disappointing to hear.

After more digging I made some more progress and managed to get it built and working using the pose landmarker model (BlazePose) but have not yet been able to get it executing on the tflite GPU delegate (just the CPU one).

I will continue digging, but any unofficial support would be greatly appreciated!

Hello, did you successfully build on the Jetson Orin? I am currently experiencing the same issue as you, attempting to run the pose landmarker model (BlazePose) on the Jetson Orin NX, but unable to utilize the GPU. I would greatly appreciate it if you could assist me with any progress you have made.

bigger-py commented 8 months ago

Hi,

I haven't spent much more time on this but planning to revisit it sometime this week. If I can't make any more progress with mediapipe I'm probably just going to try converting BlazePose to a tensorrt model and stop using mediapipe altogether. I'll provide an update here if I do manage to make progress with mediapipe.

bigger-py commented 7 months ago

Hi, That's disappointing to hear. After more digging I made some more progress and managed to get it built and working using the pose landmarker model (BlazePose) but have not yet been able to get it executing on the tflite GPU delegate (just the CPU one). I will continue digging, but any unofficial support would be greatly appreciated!

Hello, did you successfully build on the Jetson Orin? I am currently experiencing the same issue as you, attempting to run the pose landmarker model (BlazePose) on the Jetson Orin NX, but unable to utilize the GPU. I would greatly appreciate it if you could assist me with any progress you have made.

Did you make any progress with this so far? I tried using the PINTO model zoo version and successfully got BlazePose running through TensorFlow on GPU but the output tensors of that need a lot more processing to turn it into a segmentation mask. So I am now convinced that the only sensible approach is to build Mediapipe since it already contains all the logic for turning BlazePose tensors into masks and landmarks...

724877239 commented 7 months ago

Hi, That's disappointing to hear. After more digging I made some more progress and managed to get it built and working using the pose landmarker model (BlazePose) but have not yet been able to get it executing on the tflite GPU delegate (just the CPU one). I will continue digging, but any unofficial support would be greatly appreciated!

Hello, did you successfully build on the Jetson Orin? I am currently experiencing the same issue as you, attempting to run the pose landmarker model (BlazePose) on the Jetson Orin NX, but unable to utilize the GPU. I would greatly appreciate it if you could assist me with any progress you have made.

Did you make any progress with this so far? I tried using the PINTO model zoo version and successfully got BlazePose running through TensorFlow on GPU but the output tensors of that need a lot more processing to turn it into a segmentation mask. So I am now convinced that the only sensible approach is to build Mediapipe since it already contains all the logic for turning BlazePose tensors into masks and landmarks...

Hi, I have made some progress. I converted MediaPipe's pose_detection.tflite and pose_landmark.tflite to ONNX using the PINTO0309/tflite2tensorflow project, then built the pipeline by referring to the pre- and post-processing in the geaxgx/openvino_blazepose project. Inference runs successfully on the GPU of the Jetson Orin NX. This method should be general, and you can refer to it to build your own pipeline.

bigger-py commented 7 months ago

Hi, I have made some progress. I converted MediaPipe's pose_detection.tflite and pose_landmark.tflite to ONNX using the PINTO0309/tflite2tensorflow project, then built the pipeline by referring to the pre- and post-processing in the geaxgx/openvino_blazepose project. Inference runs successfully on the GPU of the Jetson Orin NX. This method should be general, and you can refer to it to build your own pipeline.

That's good to hear! Thank you for directing me to that project. It looks like they've reimplemented the mediapipe pipeline in Python. At the moment I'm trying to avoid doing the same thing as it would involve refactoring a lot of existing code which was built around running mediapipe (on Windows/Intel/x86!). But it's a useful reference to have :)

I made some progress yesterday just in terms of getting the mediapipe hand-tracking and pose-detection examples running on the GPU. Today I hope to successfully build the Python bindings using what I've learnt so far... updates to follow!

bigger-py commented 7 months ago

Unfortunately I'm back to square one (ish): I can get Python to build and everything, but I'm stuck on how to get it running on the GPU TFLite delegate. At the moment, with a script like the following (and the .task file in place):

from mediapipe.tasks import python
from mediapipe.tasks.python import vision

base_options = python.BaseOptions(model_asset_path='pose_landmarker_full.task')
options = vision.PoseLandmarkerOptions(
    base_options=base_options,
    num_poses=1,
    min_pose_detection_confidence=0.5,
    min_pose_presence_confidence=0.5,
    min_tracking_confidence=0.5,
    running_mode=vision.RunningMode.IMAGE,
    output_segmentation_masks=True,
)
model = vision.PoseLandmarker.create_from_options(options)

I get a terminal output like:

I0000 00:00:1710340943.716250    3701 gl_context_egl.cc:85] Successfully initialized EGL. Major : 1 Minor: 5
I0000 00:00:1710340943.756812    3790 gl_context.cc:357] GL version: 3.2 (OpenGL ES 3.2 NVIDIA 35.3.1), renderer: NVIDIA Tegra Orin (nvgpu)/integrated
INFO: Created TensorFlow Lite XNNPACK delegate for CPU.

I noticed that the TensorFlow Lite delegate only gets created after the .task file has been read successfully, which suggests that the delegate information is embedded in the .task file itself, even though my pose-detection modules specify the GPU delegate for inference.

Is there any way to read the contents of the .task file? @kuaashish any support would be greatly appreciated.
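On reading the bundle's contents: a MediaPipe `.task` file is an ordinary zip archive, so the standard library can list and extract what it contains (typically the packed .tflite models). A minimal sketch, demonstrated on an in-memory stand-in archive since the entry names below are placeholders:

```python
import io
import zipfile

# Build a stand-in .task bundle in memory; for a real file, use
# zipfile.ZipFile("pose_landmarker_full.task") directly.
demo = io.BytesIO()
with zipfile.ZipFile(demo, "w") as z:
    z.writestr("pose_detection.tflite", b"\x00")           # placeholder entry
    z.writestr("pose_landmarks_detector.tflite", b"\x00")  # placeholder entry

# List the packed assets; ZipFile.extractall() would unpack them
# for inspection.
with zipfile.ZipFile(demo) as task:
    names = task.namelist()
print(names)
```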

bigger-py commented 7 months ago

Hi, I have made some progress. I converted MediaPipe's pose_detection.tflite and pose_landmark.tflite to ONNX using the PINTO0309/tflite2tensorflow project, then built the pipeline by referring to the pre- and post-processing in the geaxgx/openvino_blazepose project. Inference runs successfully on the GPU of the Jetson Orin NX. This method should be general, and you can refer to it to build your own pipeline.

Hi - I've started taking the approach you suggested now since there is no support from the Google team. I have the landmarker working - inference time is about 10-15ms on the landmarker. However, the detector seems to be outputting garbage and I'm not sure why. Could you share your .onnx file for the detector please? @724877239
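For reference, the pre-processing side of the pipeline referenced above (as done in projects like geaxgx/openvino_blazepose) is roughly: resize the frame or crop to the model's input resolution and scale pixels to [0, 1]. A minimal NumPy sketch, assuming a 256x256 input for the full landmark model:

```python
import numpy as np

def preprocess(frame_rgb: np.ndarray, size: int = 256) -> np.ndarray:
    """Nearest-neighbour resize to (size, size) and scale uint8 RGB to [0, 1]."""
    h, w, _ = frame_rgb.shape
    ys = np.arange(size) * h // size   # source row for each output row
    xs = np.arange(size) * w // size   # source column for each output column
    resized = frame_rgb[ys][:, xs]
    x = resized.astype(np.float32) / 255.0
    return x[np.newaxis]               # NHWC batch of 1

frame = np.zeros((480, 640, 3), dtype=np.uint8)
print(preprocess(frame).shape)  # (1, 256, 256, 3)
```

The resulting tensor would then be fed to the converted ONNX model; the post-processing (anchor decoding, landmark re-projection) is the part the referenced project implements.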

724877239 commented 7 months ago

Hi, I have made some progress. I converted MediaPipe's pose_detection.tflite and pose_landmark.tflite to ONNX using the PINTO0309/tflite2tensorflow project, then built the pipeline by referring to the pre- and post-processing in the geaxgx/openvino_blazepose project. Inference runs successfully on the GPU of the Jetson Orin NX. This method should be general, and you can refer to it to build your own pipeline.

Hi - I've started taking the approach you suggested now since there is no support from the Google team. I have the landmarker working - inference time is about 10-15ms on the landmarker. However, the detector seems to be outputting garbage and I'm not sure why. Could you share your .onnx file for the detector please? @724877239

Yes. If you give me your email address, I will send it to you by email.

bigger-py commented 7 months ago

Could you please attach it to a message here as a .zip file? e.g. this is mine... detector_model_float32.zip

I'm just not super comfortable sharing my personal email publicly and I don't think github supports private messages...

724877239 commented 7 months ago

Could you please attach it to a message here as a .zip file? e.g. this is mine... detector_model_float32.zip

I'm just not super comfortable sharing my personal email publicly and I don't think github supports private messages...

pose_detection.zip I used BlazePose for my project, including detection and pose estimation.

bigger-py commented 7 months ago

Could you please attach it to a message here as a .zip file? e.g. this is mine... detector_model_float32.zip I'm just not super comfortable sharing my personal email publicly and I don't think github supports private messages...

pose_detection.zip I used BlazePose for my project, including detection and pose estimation.

It's working correctly! Thank you very much :smile: Did you use the Docker environment for converting from .tflite to .onnx? I tried to do it locally on my machine and I guess I didn't have the right dependencies installed.

724877239 commented 7 months ago

tflite2tensorflow

https://github.com/PINTO0309/tflite2tensorflow I used the project above for the conversion.

zhenhao-huang commented 5 months ago

Could you please attach it to a message here as a .zip file? e.g. this is mine... detector_model_float32.zip I'm just not super comfortable sharing my personal email publicly and I don't think github supports private messages...

pose_detection.zip I used BlazePose for my project, including detection and pose estimation.

Are these two .onnx files converted from .tflite files extracted from .task files? The results are not very good.

bigger-py commented 5 months ago

Could you please attach it to a message here as a .zip file? e.g. this is mine... detector_model_float32.zip I'm just not super comfortable sharing my personal email publicly and I don't think github supports private messages...

pose_detection.zip I used BlazePose for my project, including detection and pose estimation.

Are these two .onnx files converted from .tflite files extracted from .task files? The results are not very good.

Hi,

The .tflite files I used were downloaded directly from Google's asset repo:

https://storage.googleapis.com/mediapipe-assets/

This hosts all of the MediaPipe models as .tflite files, which you can then convert to .onnx and beyond.

I tried setting up my own environment for converting .tflite but it didn't work well for the pose detector and I was getting garbage output. I presume the other commenter used the Docker container, which worked properly?

Hope that helps.
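Fetching the models from the asset bucket can be sketched as below; the exact object names are assumptions, so check the bucket index to confirm them:

```python
# The MediaPipe asset bucket serves the individual .tflite models directly.
# The object names below are assumptions -- browse the bucket to confirm.
BUCKET = "https://storage.googleapis.com/mediapipe-assets"
models = ["pose_detection.tflite", "pose_landmark_full.tflite"]
urls = [f"{BUCKET}/{m}" for m in models]
for u in urls:
    print(u)
    # urllib.request.urlretrieve(u, u.rsplit("/", 1)[-1])  # uncomment to download
```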

zhenhao-huang commented 5 months ago

My fault. The model is OK.