Open yeldarby opened 7 months ago
To resolve the issue where inference-gpu falls back to CPU execution on systems with CUDA 12 (due to onnxruntime compatibility), consider extending the platform.py script to detect the installed CUDA version. This can be done by invoking system commands to extract the CUDA version and including that value in the dictionary returned by the retrieve_platform_specifics function. Once CUDA version detection is implemented, consult it in benchmark_adapter.py before running benchmarks or initializing models that require GPU support: if CUDA 12 is detected, ensure that a version of onnxruntime-gpu which supports CUDA 12 is installed.
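A minimal sketch of the detection step described above, assuming the version can be parsed from `nvcc --version` or `nvidia-smi` output; the real `retrieve_platform_specifics` in platform.py returns more fields than shown here, so this is only the shape of the addition:

```python
import re
import subprocess


def detect_cuda_version():
    """Return the installed CUDA version as a string (e.g. "12.1"), or None.

    Tries `nvcc --version` first, then falls back to `nvidia-smi`.
    Returns None when neither tool is present (e.g. CPU-only machines).
    """
    for cmd, pattern in (
        (["nvcc", "--version"], r"release (\d+\.\d+)"),
        (["nvidia-smi"], r"CUDA Version:\s*(\d+\.\d+)"),
    ):
        try:
            output = subprocess.run(
                cmd, capture_output=True, text=True, timeout=10
            ).stdout
        except (FileNotFoundError, subprocess.TimeoutExpired):
            continue
        match = re.search(pattern, output)
        if match:
            return match.group(1)
    return None


def retrieve_platform_specifics():
    # Hypothetical, trimmed-down shape of the return dictionary;
    # the point is only that "cuda_version" rides along with the
    # other platform details already collected.
    return {"cuda_version": detect_cuda_version()}
```

Downstream code (e.g. benchmark_adapter.py) can then branch on `retrieve_platform_specifics()["cuda_version"]` starting with "12" before selecting an onnxruntime build.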
When are we getting CUDA 12 support?
Search before asking
Bug
If you install inference-gpu on a machine with CUDA 12, it will complain and fall back to CPU execution mode. There's a special build of onnxruntime needed for CUDA 12: https://onnxruntime.ai/docs/install/
Ideally we'd detect this automatically & make it "just work" with CUDA 12. But alternatively we could let the user know why they're not getting GPU acceleration.
Environment
pytorch/pytorch:2.2.0-cuda12.1-cudnn8-devel Docker image, after running pip install inference-gpu
Minimal Reproducible Example
Additional
No response
Are you willing to submit a PR?