microsoft / onnxruntime-genai

Generative AI extensions for onnxruntime
MIT License

CSharp examples crash on attempt to load model in Cuda mode (Release or Debug) #716

Closed asmirnov82 closed 1 month ago

asmirnov82 commented 1 month ago

Describe the bug

Unhandled exception. System.DllNotFoundException: Unable to load DLL 'onnxruntime-genai' or one of its dependencies: The specified module could not be found. (0x8007007E)
   at Microsoft.ML.OnnxRuntimeGenAI.NativeMethods.OgaCreateModel(Byte[] configPath, IntPtr& model)
   at Microsoft.ML.OnnxRuntimeGenAI.Model..ctor(String modelPath)
   at Program.Run(String modelPath)
   at Program.Main(String[] args)

To Reproduce

Steps to reproduce the behavior:

  1. Build any of the CSharp examples (Genny, HelloPhi or HelloPhi3V) in the Release_Cuda or Debug_Cuda configuration.
  2. Run the example with a Phi-3 CUDA-optimized model (for example, from https://huggingface.co/microsoft/Phi-3-mini-4k-instruct-onnx/tree/main/cuda/cuda-int4-rtn-block-32).

baijumeswani commented 1 month ago

My guess is that the CUDA dependency was not found. Currently, the onnxruntime-genai stable release depends on CUDA 11.8, and its onnxruntime dependency requires CUDA 11.8 and cuDNN 8.x for CUDA 11.8.

If the CUDA Toolkit bin directory is not in your PATH, or if the installed CUDA Toolkit version differs from the expected one, that would explain the error you are seeing.
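
One way to check the PATH hypothesis is to scan each PATH directory for the CUDA and cuDNN runtime DLLs. The sketch below is only a rough diagnostic. The DLL file names in `EXPECTED_DLLS` are assumptions based on the usual Windows naming for CUDA 11.x and cuDNN 8.x, and `find_on_path` is a hypothetical helper, not part of onnxruntime-genai. Windows also searches the application directory when loading DLLs, so a missing entry here does not prove that loading will fail.

```python
import os

# Assumed file names for CUDA 11.8 / cuDNN 8.x on Windows; adjust to your install.
EXPECTED_DLLS = [
    "cudart64_110.dll",
    "cublas64_11.dll",
    "cublasLt64_11.dll",
    "cudnn64_8.dll",
]


def find_on_path(dll_names, path_value=None, sep=os.pathsep):
    """Map each DLL name to the first PATH directory containing it, or None."""
    if path_value is None:
        path_value = os.environ.get("PATH", "")
    # Skip empty entries produced by leading, trailing or doubled separators.
    dirs = [d for d in path_value.split(sep) if d]
    found = {}
    for name in dll_names:
        found[name] = next(
            (d for d in dirs if os.path.isfile(os.path.join(d, name))),
            None,
        )
    return found


if __name__ == "__main__":
    for name, location in find_on_path(EXPECTED_DLLS).items():
        print(f"{name}: {location or 'NOT FOUND on PATH'}")
```

If every entry reports a directory but the exception persists, the installed versions may still differ from what the packages expect, which this check cannot detect.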

skyline75489 commented 1 month ago

The crash is likely caused by the ORT NuGet package. The latest ORT NuGet has a packaging error: it ships with an incorrect CUDA version, which leads to the DLL loading failure.

See: https://github.com/microsoft/onnxruntime/issues/20916#issuecomment-2158872185

baijumeswani commented 1 month ago

I hope the above suggestions helped. Closing this issue now. Please let us know if you run into more problems.