leoc70 closed this issue 1 year ago
I have the same version of CUDA installed, including the cudnn64_8.dll referenced above (version 6.14.11.6050).
I have made insignificant modifications to your program and am attaching it here. I targeted netcoreapp3.1 and am using ONNX Runtime version 1.13.1.
However, I am not able to reproduce the problem. This is what the program below produces:
D:\dev\data\gh_issue_13658$ D:\dev\data\gh_issue_13658\GPUCrash\bin\Debug\netcoreapp3.1\GPUCrash.exe D:\dev\data\gh_issue_13658\ONNXRuntime-model-debug\model.onnx
Program Finished
using Microsoft.ML.OnnxRuntime.Tensors;
using Microsoft.ML.OnnxRuntime;
using System;
using System.Collections.Generic;

namespace GPUCrash
{
    internal class Program
    {
        static void Main(string[] args)
        {
            var modelPath = args[0];
            bool useGPU = true;
            InferenceSession session = null;
            if (useGPU)
            {
                // Configure the CUDA execution provider through string key/value options.
                var cudaProviderOptions = new OrtCUDAProviderOptions();
                var providerOptionsDict = new Dictionary<string, string>();
                providerOptionsDict["device_id"] = "0";
                providerOptionsDict["gpu_mem_limit"] = "2147483648";
                providerOptionsDict["arena_extend_strategy"] = "kSameAsRequested";
                providerOptionsDict["cudnn_conv_algo_search"] = "DEFAULT";
                providerOptionsDict["do_copy_in_default_stream"] = "1";
                providerOptionsDict["cudnn_conv_use_max_workspace"] = "1";
                providerOptionsDict["cudnn_conv1d_pad_to_nc1d"] = "1";
                cudaProviderOptions.UpdateOptions(providerOptionsDict);
                using var options = SessionOptions.MakeSessionOptionWithCudaProvider(cudaProviderOptions);
                session = new InferenceSession(modelPath, options);
            }
            else
                session = new InferenceSession(modelPath);
            using var sess = session;

            // Fill a 1x3x128x128 input tensor with reproducible pseudo-random values.
            int w = 128;
            int h = 128;
            Tensor<float> input = new DenseTensor<float>(new int[] { 1, 3, h, w });
            Random random = new Random(42);
            for (int y = 0; y < h; y++)
            {
                for (int x = 0; x < w; x++)
                {
                    input[0, 0, y, x] = (float)(random.NextDouble() / 255);
                    input[0, 1, y, x] = (float)(random.NextDouble() / 255);
                    input[0, 2, y, x] = (float)(random.NextDouble() / 255);
                }
            }
            var inputs = new List<NamedOnnxValue> { NamedOnnxValue.CreateFromTensor<float>("modelInput", input) };
            using IDisposableReadOnlyCollection<DisposableNamedOnnxValue> results = session.Run(inputs); // The crash is when executing this line
            System.Console.WriteLine("Program Finished");
        }
    }
}
My recommendation would be to stop in the debugger right before Run() and examine where the native CUDA DLLs are being loaded from.
In my case, the CUDA libraries are loaded from here:
And, what is also very important, the onnxruntime libraries are loaded from the restored 1.13.1 NuGet package, and not from anywhere else.
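If attaching a debugger is inconvenient, the same check can be approximated from code. The following is only a sketch: the class name LoadedModuleReport and the list of name fragments are illustrative assumptions, not part of ONNX Runtime. It relies solely on System.Diagnostics.Process to list the modules already loaded into the current process and keeps those whose file name looks like an ONNX Runtime, CUDA, cuDNN, or zlib library.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.IO;
using System.Linq;

// Hypothetical helper (not an ONNX Runtime API): reports where native
// libraries of interest were loaded from in the current process.
internal static class LoadedModuleReport
{
    // Assumed name fragments; adjust to the libraries you care about.
    private static readonly string[] DefaultFragments =
        { "onnxruntime", "cudnn", "cublas", "cudart", "cufft", "zlib" };

    // Keeps only the paths whose file name contains one of the fragments
    // (case-insensitive). Null or empty paths are skipped.
    public static List<string> Filter(IEnumerable<string> modulePaths, IEnumerable<string> fragments)
    {
        var wanted = fragments.ToList();
        return modulePaths
            .Where(p => !string.IsNullOrEmpty(p))
            .Where(p => wanted.Any(f =>
                Path.GetFileName(p).Contains(f, StringComparison.OrdinalIgnoreCase)))
            .ToList();
    }

    // Full paths of the matching modules currently loaded in this process.
    public static List<string> Snapshot()
    {
        using var process = Process.GetCurrentProcess();
        var paths = process.Modules.Cast<ProcessModule>().Select(m => m.FileName);
        return Filter(paths, DefaultFragments);
    }
}
```

Calling LoadedModuleReport.Snapshot() after the InferenceSession has been created and printing each entry shows the full path of every matching DLL. Libraries that have not been loaded yet will not appear, so a module missing from the list is itself a hint about what failed to resolve.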
@yuslepukhin I did what you mentioned and found my mistake: I forgot to download the Zlib DLL and add it to my PATH. After that, everything ran fine.
I suggest you review the DLL search order on Windows (and on any other OS you might use). PATH is just one of the many things that affect it. In the dev environment that I demonstrated above, PATH has nothing to do with it.
Then we can spend more time discussing onnxruntime.
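One way to see what the loader can actually resolve, whatever combination of search locations is in effect, is to ask it directly before creating the session. This is a minimal sketch, assuming .NET Core 3.0 or later, where System.Runtime.InteropServices.NativeLibrary is available. The NativeProbe class is a hypothetical helper, and the file name zlibwapi.dll used in the usage note is an assumption: the thread only says "the Zlib dll".

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical helper: checks whether the OS loader can resolve a native
// library by name, using whatever search order currently applies.
internal static class NativeProbe
{
    // Returns true if the library loads; the handle is released immediately,
    // so this is only a probe and does not keep the library pinned.
    public static bool CanLoad(string libraryName)
    {
        if (NativeLibrary.TryLoad(libraryName, out IntPtr handle))
        {
            NativeLibrary.Free(handle);
            return true;
        }
        return false;
    }
}
```

For example, printing NativeProbe.CanLoad("cudnn64_8.dll") and NativeProbe.CanLoad("zlibwapi.dll") at startup would flag a missing dependency as a plain false instead of a silent process crash. A true result only shows that some copy of the library can be found, not that it is the intended version.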
Describe the issue
I trained a PyTorch image segmentation model in Python and converted it to an ONNX model. Inference in Python works on both CPU and GPU. In my C# application (.NET 6), inference on CPU is fine, but when I try to run it on GPU my application crashes without any exception.
The only trace is an event in the Windows 10 Event Viewer:
I installed CUDA v11.6, extracted cuDNN v8.5.0.96, and added the following system environment variables:
To reproduce
My models in PyTorch and ONNX format: https://github.com/leoc70/ONNXRuntime-model-debug
Here is my code in Python and C#
Python 3.10 64bit
C# .NET 6
Urgency
I am working on a project coded in C#, so inference has to be done in C#, although training can be done in Python. If I cannot run inference in C# with good performance (<400ms), I will have to find another solution.
Platform
Windows
OS Version
Windows 10 22H2 OS build 19045.2251
ONNX Runtime Installation
Released Package
ONNX Runtime Version or Commit ID
nuget Microsoft.ML.OnnxRuntime.Gpu version 1.13.1
ONNX Runtime API
C#
Architecture
X64
Execution Provider
CUDA
Execution Provider Library Version
CUDA 11.6