Describe the issue
When calling any MXNet function (imported as mx in the screenshot), for example mx.nd.ones or mx.gpu(0), I get a warning about a cuDNN version mismatch, preceded by a cuBLAS error.
The cuBLAS error goes away when I execute mx.gpu() instead of mx.gpu(0); in both cases the GPU is detected.
I also run onnxruntime-gpu, but the warning appears to come from MXNet. However, when I run inference on the GPU (an onnxruntime.InferenceSession with the CUDAExecutionProvider), the GPU is underutilized: monitoring with nvidia-smi -lms 100 shows utilization of 0-6% most of the time, with occasional 100 ms peaks that never exceed 40-50%.
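For reference, this is roughly how I create the session and run inference while watching nvidia-smi (the model filename and input shape are placeholders; any ONNX Model Zoo classifier reproduces this):

```python
import numpy as np
import onnxruntime as ort

# Placeholder model path; substitute any of the models listed below.
sess = ort.InferenceSession(
    "resnet50-v1-12.onnx",
    providers=["CUDAExecutionProvider", "CPUExecutionProvider"],
)

# Dummy ImageNet-shaped batch; real runs feed preprocessed images.
x = np.random.rand(1, 3, 224, 224).astype(np.float32)
input_name = sess.get_inputs()[0].name
outputs = sess.run(None, {input_name: x})

# Meanwhile, in another terminal: nvidia-smi -lms 100
```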
According to https://onnxruntime.ai/docs/execution-providers/CUDA-ExecutionProvider.html#requirements, onnxruntime-gpu 1.12 is built against CUDA 11.4 and cuDNN 8.2.4.
I wonder whether this underutilization is caused by the version mismatch. If so, is there any way to eliminate the warning without installing a different version of cuDNN or CUDA?
Thanks! Izan.
To reproduce
I use ONNX models from the ONNX Model Zoo (https://github.com/onnx/models), specifically vgg16, resnet50, mobilenet, and densenet, each in both quantized (int8) and unquantized (float32) variants.
I use the MXNet framework (mxnet-cu112, version 1.9.1, built against CUDA 11.2) to load the ImageNet dataset and compute accuracies.
As the screenshots show, the CUDA and CPU providers are detected both before and after creating the InferenceSession, so the setup should be fine, yet GPU performance is lower than expected; a sketch of the provider check follows.
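This is approximately the check I mean (the model filename is again a placeholder):

```python
import onnxruntime as ort

# Before creating the session: which providers does this build support?
print(ort.get_available_providers())
# On my machine this includes CUDAExecutionProvider and CPUExecutionProvider.

sess = ort.InferenceSession(
    "vgg16-12-int8.onnx",  # placeholder; any of the models listed above
    providers=["CUDAExecutionProvider", "CPUExecutionProvider"],
)

# After creating the session: which providers is it actually using?
print(sess.get_providers())
```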
Urgency
No response
Platform
Linux
OS Version
Ubuntu 18.04
ONNX Runtime Installation
Built from Source
ONNX Runtime Version or Commit ID
onnxruntime-gpu 1.12.0
ONNX Runtime API
Python
Architecture
X64
Execution Provider
CUDA
Execution Provider Library Version
CUDA 11.2 with cuDNN 8.2.1
Model File
No response
Is this a quantized model?
Yes
Reply
Doesn't setting MXNET_CUDNN_LIB_CHECKING=0 (the suggestion in your screenshot) completely solve it? Note that these errors come not from onnxruntime but from the CUDA libraries.
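A minimal sketch of applying that variable, assuming it must be set before mxnet is first imported (it only silences the version check, it does not change which cuDNN is loaded):

```python
import os

# Must be set before mxnet is imported; equivalently, in the shell:
#   export MXNET_CUDNN_LIB_CHECKING=0
os.environ["MXNET_CUDNN_LIB_CHECKING"] = "0"

import mxnet as mx

# Any GPU call that previously triggered the warning, e.g.:
print(mx.nd.ones((2, 2), ctx=mx.gpu(0)))
```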