Previously, I tried to extend the Python API with the ability to keep the data on the GPU (https://github.com/stereolabs/zed-python-api/pull/230), and I ran into some weird behaviors (they seemed weird back then, but it's now obvious that they came from a lack of understanding of how the data is laid out in memory).
This PR, however, provides a fully functional extension.
NOTE: this change adds an extra dependency: cupy.
The targeted function is `get_data()`, and both modes of providing data (memory view / deep copy) were implemented for GPU as well.

This was tested on an Nvidia AGX Orin 32GB, with JetPack 5.1.2 and ZED_SDK_4.1.4.

Shoutout to @andreacelani for the discussion that led to figuring out how to implement this correctly (look into the closed PR #230 for details).
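
For readers unfamiliar with the memory layout issue and the memory view / deep copy distinction, here is a minimal sketch. It is not the code in this PR: it uses NumPy on the CPU so it runs anywhere (CuPy mirrors most of this API), and it assumes the buffer is row-padded (pitched), which is one common way a layout can differ from what a plain reshape expects. `view_pitched`, `pitch`, and the tiny 2x3 image are hypothetical names and values chosen for illustration.

```python
import numpy as np


def view_pitched(buffer, height, width, channels, pitch):
    """Return a zero-copy (H, W, C) view over a row-padded 1-D byte buffer.

    `pitch` is the number of bytes per row including padding (assumed layout).
    """
    if pitch < width * channels:
        raise ValueError("pitch must cover at least one full row of pixels")
    rows = buffer.reshape(height, pitch)        # one row per line, padding included
    payload = rows[:, : width * channels]       # slice the padding off (still a view)
    return payload.reshape(height, width, channels)


# Hypothetical 2x3 image with 4 channels: 12 payload bytes + 4 padding bytes per row.
height, width, channels, pitch = 2, 3, 4, 16
raw = np.arange(height * pitch, dtype=np.uint8)

view = view_pitched(raw, height, width, channels, pitch)  # memory view: shares storage
copy = view.copy()                                        # deep copy: independent, contiguous

# Ignoring the pitch reads padding bytes as pixels, so rows drift out of place.
naive = raw[: height * width * channels].reshape(height, width, channels)

view[0, 0, 0] = 255  # visible through `raw`, not through `copy`
```

Writing through `view` changes the underlying buffer, while `copy` is unaffected; `naive` starts its second row at byte 12 instead of byte 16, which is the kind of behavior that looks weird until the layout is taken into account.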
Benchmarking with an ML pipeline:
@andreacelani did some benchmarking with impressive results: https://github.com/stereolabs/zed-python-api/pull/230#issuecomment-2347310516
Additionally, I tested it myself using a real feed from a ZED Mini with a simple pipeline (see picture), and here are my findings:
TL;DR:
Details:
Notes:
- I used the generic YOLO (`from ultralytics import YOLO`), and a custom-trained PyTorch YOLOv8 model.
- I added the sleep because, when grabbing at HD2K, my pipeline wasn't saturating the 15 FPS rate, so grabbing appeared slower on the GPU (a faulty reading).
- The preprocessing includes 4-channel to 3-channel reduction, resizing (to meet the expected 640x640 input), and normalization.
- The PCL is there just to simulate real work (code details are here: https://github.com/stereolabs/zed-python-api/pull/230#issuecomment-1787067065).
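
The preprocessing steps listed in the notes can be sketched roughly as follows. This is not the benchmarked code (that is in the linked comment); it is a NumPy illustration under stated assumptions: the input is an 8-bit 4-channel image in BGRA order, the resize is a plain nearest-neighbor stretch without letterboxing, and normalization means scaling to [0, 1]. `preprocess` and its `size` parameter are hypothetical names.

```python
import numpy as np


def preprocess(frame_bgra, size=640):
    """uint8 (H, W, 4) BGRA image -> float32 (1, 3, size, size) RGB batch in [0, 1]."""
    if frame_bgra.ndim != 3 or frame_bgra.shape[2] != 4:
        raise ValueError("expected an (H, W, 4) image")
    h, w = frame_bgra.shape[:2]

    # 4 -> 3 channels: drop alpha and reorder BGR to RGB (channel order is an assumption).
    rgb = frame_bgra[..., [2, 1, 0]]

    # Nearest-neighbor resize: pick one source row/column index per output pixel.
    rows = np.arange(size) * h // size
    cols = np.arange(size) * w // size
    resized = rgb[rows[:, None], cols]          # shape (size, size, 3)

    # Normalize to [0, 1], then go from HWC to a contiguous NCHW batch of one.
    chw = (resized.astype(np.float32) / 255.0).transpose(2, 0, 1)
    return np.ascontiguousarray(chw)[None]


# Synthetic HD2K-sized frame (2208x1242), pure blue with opaque alpha.
frame = np.zeros((1242, 2208, 4), dtype=np.uint8)
frame[..., 0] = 255  # B
frame[..., 3] = 255  # A
batch = preprocess(frame)
```

Because only indexing and elementwise arithmetic are used, the same function body should also work on a CuPy array by swapping `np` for `cp`, which would keep the whole chain on the GPU; I have not verified that variant here.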