Hi @lostmsu, can you confirm you're running this sample on your local machine with an NVIDIA GPU that supports CUDA?
The error message “OnnxRuntimeGenAIException: CUDA execution provider is not enabled in this build” typically occurs when the ONNX Runtime build being used was not compiled with CUDA support, or when it cannot find the components needed for GPU acceleration. Let’s troubleshoot this:
To check the CUDA version installed on your system, you can use one of the following methods:
Open your command prompt or terminal. Run the following command:
nvcc --version
The output will display the CUDA compiler version, which corresponds to the toolkit version. Example:

nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2021 NVIDIA Corporation
Built on Thu_Nov_18_09:45:30_PST_2021
Cuda compilation tools, release 11.5, V11.5.119
Build cuda_11.5.r11.5/compiler.30672275_0

See https://developer.nvidia.com/cuda-downloads for the available versions.
Open your command prompt or terminal. Run the following command:
nvidia-smi
Look for the “CUDA Version” in the top right corner of the output
Example:

nvidia-smi
Fri Jun 28 11:59:55 2024
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 555.42.02        Driver Version: 529.19        CUDA Version: 12.0            |
|-----------------------------------------+------------------------+----------------------+
Verify that cuDNN (CUDA Deep Neural Network library) is installed and compatible with your CUDA version. You can check the cuDNN version using:
cat /usr/local/cuda/include/cudnn_version.h
Make sure you have a working Python environment with the necessary dependencies. Import the torch library (even if you’re not using it directly) as it initializes some CUDA-related components.
pip install torch torchvision torchaudio
import torch
Now you need to reference the CUDA build of the ONNX Runtime from your C# project.
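For the C# labs this means referencing the CUDA-enabled NuGet package for the GenAI API (Microsoft.ML.OnnxRuntimeGenAI.Cuda) instead of the CPU-only Microsoft.ML.OnnxRuntimeGenAI package. Below is a minimal sketch of loading a CUDA model with the GenAI C# API; it is illustrative rather than the lab's exact code, and the model path is a placeholder:

// Sketch: assumes the project references Microsoft.ML.OnnxRuntimeGenAI.Cuda rather than
// the CPU-only Microsoft.ML.OnnxRuntimeGenAI package (consistent with the error above).
using Microsoft.ML.OnnxRuntimeGenAI;

// Placeholder path to the CUDA build of the Phi-3 vision model.
var modelPath = @"C:\models\Phi-3-vision-128k-instruct-onnx-cuda\cuda-int4-rtn-block-32";

// The C# API is the same for CPU and CUDA builds; the execution provider comes
// from the referenced GenAI package and the model's genai_config.json.
using var model = new Model(modelPath);
using var processor = new MultiModalProcessor(model);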
@lostmsu Did you resolve your issue?
@leestott I was able to use the CUDA version, but I did not follow any of the steps you provided. I believe the issue was that the labs explicitly reference CPU builds of OnnxRuntime.
Closing this issue as the initial query was focused on changing the CPU Phi-3 model to a CUDA model.
This issue is for a: (mark with an x)

Minimal steps to reproduce

Change modelPath to Phi-3-vision-128k-instruct-onnx-cuda\cuda-int4-rtn-block-32 in LabsPhi304 Program.cs (see the sketch below):
https://github.com/microsoft/Phi-3CookBook/blob/058e289ecb2aca1bc5ea330c2da1bced919c1b93/md/07.Labs/Csharp/src/LabsPhi304/Program.cs#L31

Any log messages given by the failure

OnnxRuntimeGenAIException: CUDA execution provider is not enabled in this build
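For reference, the change above amounts to pointing the modelPath variable in Program.cs at the locally downloaded CUDA model folder; only the model folder name comes from this issue, the parent directory below is a placeholder:

// In LabsPhi304 Program.cs, modelPath originally points at a CPU build of the model;
// the repro changes it to the locally downloaded CUDA int4 build.
var modelPath = @"C:\models\Phi-3-vision-128k-instruct-onnx-cuda\cuda-int4-rtn-block-32";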
Expected/desired behavior
Sample works
OS and Version?
azd version?
N/A
Versions
Mention any other details that might be useful