Open abysslover opened 1 day ago
With only the Face Detect model (single CUDA session on a GPU device):
[FaceDetect.PostProcess] Output classificator[0]: 896, Output_regrssor: 14336, Thr: 0.6
[FaceDetect.NonMaxSuppression] # Original Faces: 5, # Filtered Faces: 1
With Face Detect and Face Mesh (multiple CUDA sessions on a GPU device, sequential inference):
[FaceDetect.PostProcess] Output classificator[0]: 896, Output_regrssor: 14336, Thr: 0.6
[FaceDetect.NonMaxSuppression] # Original Faces: 0, # Filtered Faces: 0
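One possible reading of these logs, sketched in Python: the output shapes are identical in both runs (896 classifier values and 14336 regressor values, i.e. 16 per anchor), so only the values differ. The sketch below assumes the post-process applies a sigmoid to raw classifier logits before comparing against the 0.6 threshold; the issue does not confirm this. Under that assumption, an output buffer that was never written (all zeros) scores 0.5 for every anchor, so no candidate reaches NMS, which would match "# Original Faces: 0". The helper names `count_candidates` and `looks_unwritten` are hypothetical.

```python
import math

NUM_ANCHORS = 896          # from the log: Output classificator[0]: 896
VALUES_PER_ANCHOR = 16     # 14336 regressor values / 896 anchors
SCORE_THRESHOLD = 0.6      # from the log: Thr: 0.6


def sigmoid(x: float) -> float:
    """Logistic function; the input is clamped so math.exp cannot overflow."""
    x = max(-80.0, min(80.0, x))
    return 1.0 / (1.0 + math.exp(-x))


def count_candidates(raw_scores, threshold=SCORE_THRESHOLD):
    """Count anchors whose sigmoid score passes the threshold (before NMS).

    Assumption: raw_scores holds one raw logit per anchor.
    """
    return sum(1 for s in raw_scores if sigmoid(s) >= threshold)


def looks_unwritten(values):
    """True if the buffer is empty or every element is exactly zero."""
    return all(v == 0.0 for v in values)


# An all-zero classifier output: every score is sigmoid(0) = 0.5 < 0.6.
zero_output = [0.0] * NUM_ANCHORS

# A synthetic output with five confident anchors, for comparison.
some_hits = [-5.0] * NUM_ANCHORS
for i in (10, 11, 12, 13, 14):
    some_hits[i] = 3.0

print(count_candidates(zero_output), looks_unwritten(zero_output))
print(count_candidates(some_hits), looks_unwritten(some_hits))
```

Logging the equivalent of `looks_unwritten` on the raw output tensors in Unity would help distinguish "the session returned zeros" from "the session returned real scores that all fell below the threshold".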
Describe the issue
I encountered an issue with ONNX Runtime when running CUDA sessions in Unity. In Python, I am able to create three (multiple) CUDA sessions for my models on a single graphics card and run them sequentially for inference without any issues. The GPU is utilized correctly, and each model returns the expected predictions.
However, when attempting to replicate this setup in Unity:
If I create two CUDA sessions and run the models sequentially, inference runs without errors, but the output values are empty. The same models work perfectly in Python with multiple CUDA sessions, but in Unity only the first CUDA session seems to work as intended.
Additional context:
To reproduce
Python code equivalent (working):
Unity code (not working as expected):
Explanation of Behavior Change
When faceMesh is commented out: The code initializes and runs only the face detection model (faceDetect), so the application performs face detection without the more detailed face mesh analysis. With only one model loaded, ONNX Runtime manages a single CUDA session, and the outputs are as expected.
When faceMesh is not commented out: Both the face detection model (faceDetect) and the face mesh model (faceMesh) are initialized. This creates two CUDA sessions that use the same OrtCUDAProviderOptions. My guess is that initializing multiple sessions with the same CUDA provider settings leads to a conflict in ONNX Runtime's internal graph state, which would explain why the output values are empty when both models are used sequentially in Unity.
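For comparison, here is a minimal Python sketch of the multi-session pattern in which each session receives its own freshly built provider configuration instead of a shared one. It is an illustration of the hypothesis above, not a confirmed fix. The helper names and the model paths passed to `create_sessions` are hypothetical; `InferenceSession`, the `providers` argument, the `device_id` option, and `get_providers()` are part of the onnxruntime Python API.

```python
def cuda_providers(device_id: int = 0):
    """Build a fresh providers list (with its own options dict) for one session."""
    return [
        ("CUDAExecutionProvider", {"device_id": device_id}),
        "CPUExecutionProvider",  # fallback if the CUDA provider cannot be loaded
    ]


def create_sessions(model_paths, device_id: int = 0):
    """Create one InferenceSession per model, each with separate provider options.

    model_paths maps a label (e.g. "faceDetect") to an .onnx file path.
    """
    import onnxruntime as ort  # imported lazily; requires onnxruntime-gpu

    sessions = {}
    for name, path in model_paths.items():
        sessions[name] = ort.InferenceSession(
            path, providers=cuda_providers(device_id)
        )
    return sessions


def ran_on_cuda(session) -> bool:
    """Report whether the session actually registered the CUDA provider."""
    return "CUDAExecutionProvider" in session.get_providers()
```

The C# analogue would be to construct a separate OrtCUDAProviderOptions instance for each session; whether that changes the behavior reported here is untested. Checking the equivalent of `ran_on_cuda` for both sessions would also rule out a silent fallback to the CPU provider.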
Important logs:
Urgency
This issue is blocking a critical use case in our project. We need to run multiple models sequentially using CUDA sessions in Unity. Any delay in resolving this issue would impact our project timeline significantly.
Platform
Windows
OS Version
Windows 11 Pro
ONNX Runtime Installation
Built from Source
ONNX Runtime Version or Commit ID
291a5352b27ded5714e5748b381f2efb88f28fb9
ONNX Runtime API
C#
Architecture
X64
Execution Provider
CUDA
Execution Provider Library Version
CUDA 12.5.1, CUDNN 9.4, TensorRT 10.4.0.26