microsoft / onnxruntime-genai

Generative AI extensions for onnxruntime

Does Microsoft.ML.OnnxRuntimeGenAI.Cuda (version 0.4.0) support Phi-3.5 Vision Onnx format? #943

MaxAkbar commented 5 days ago

**Describe the bug**
After migrating Phi-3.5-vision-instruct to the ONNX format, I am not able to use the NuGet package Microsoft.ML.OnnxRuntimeGenAI.Cuda version 0.4.0 to load the ONNX model. When referencing the folder where the ONNX model is located, I get an error that the file is not found.

Microsoft.ML.OnnxRuntimeGenAI.OnnxRuntimeGenAIException: 
'Load model from C:\Users\***\source\repos\models\microsoft\Phi-3.5-vision-instruct-to-onnx\phi-3.5-v-128k-instruct-vision-onnx\ 
failed:Load model C:\Users\***\source\repos\models\microsoft\Phi-3.5-vision-instruct-to-onnx\phi-3.5-v-128k-instruct-vision-onnx\ failed. 
File doesn't exist'

**To Reproduce**
Steps to reproduce the behavior:

1. Follow the instructions for converting Phi-3.5-vision-instruct to the ONNX format.
2. Create a simple C# console application and load the model:

        using Microsoft.ML.OnnxRuntimeGenAI;

        string modelPath = @"C:\models\microsoft\Phi-3.5-vision-instruct-onnx";
        using Model model = new Model(modelPath);

3. Run the application; it throws a `Microsoft.ML.OnnxRuntimeGenAI.OnnxRuntimeGenAIException`.
4. See the error in the description above.

**Expected behavior**
The expected behavior is that the model loads successfully and inference can be run.

**Desktop (please complete the following information):**
 - OS: Windows 11 Pro 
 - Build: Version 10.0.26120 Build 26120
 - Browser: Edge

**Additional context**
I have converted the [Phi-3.5-mini-instruct](https://huggingface.co/microsoft/Phi-3.5-mini-instruct) model to [Phi-3.5-mini-instruct-cuda-fp32-onnx](https://huggingface.co/Maximum2000/Phi-3.5-mini-instruct-cuda-fp32-onnx) and am able to run it without any issues.
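For reference, a minimal sketch of the text-only generation loop that works with the 0.4.0 C# API (the model path, prompt, and search options below are placeholders; method names may differ in other package versions):

```csharp
using System;
using Microsoft.ML.OnnxRuntimeGenAI;

// Placeholder path to the converted text-only model folder and a placeholder prompt.
string modelPath = @"C:\models\Phi-3.5-mini-instruct-cuda-fp32-onnx";
string prompt = "<|user|>\nWhy is the sky blue?<|end|>\n<|assistant|>\n";

using Model model = new Model(modelPath);
using Tokenizer tokenizer = new Tokenizer(model);
var sequences = tokenizer.Encode(prompt);

using GeneratorParams generatorParams = new GeneratorParams(model);
generatorParams.SetSearchOption("max_length", 512);
generatorParams.SetInputSequences(sequences);

using Generator generator = new Generator(model, generatorParams);
while (!generator.IsDone())
{
    generator.ComputeLogits();
    generator.GenerateNextToken();
}
Console.WriteLine(tokenizer.Decode(generator.GetSequence(0)));
```
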
kunal-vaishnavi commented 2 days ago

The Phi-3 vision and Phi-3.5 vision models are split into three separate ONNX models: a vision component, an embedding component, and a text component. The build.py file in the instructions you linked should create all three components for you.
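For context, here is a rough sketch of how those three components are consumed together through the multimodal C# API, loosely based on the repo's Phi-3 vision example. The model/image paths and prompt are placeholders, and the exact signatures (e.g. `Images.Load`, the generation-loop methods) may differ between package versions:

```csharp
using System;
using Microsoft.ML.OnnxRuntimeGenAI;

// Placeholder paths and prompt, for illustration only.
string modelPath = @"C:\models\microsoft\Phi-3.5-vision-instruct-onnx";
string imagePath = @"C:\images\example.png";
string prompt = "<|user|>\n<|image_1|>\nWhat is shown in this image?<|end|>\n<|assistant|>\n";

// The model folder's config points at the vision, embedding, and text ONNX files.
using Model model = new Model(modelPath);
using MultiModalProcessor processor = new MultiModalProcessor(model);
using var tokenizerStream = processor.CreateStream();

Images images = Images.Load(imagePath);               // signature may vary by version
var inputs = processor.ProcessImages(prompt, images);

using GeneratorParams generatorParams = new GeneratorParams(model);
generatorParams.SetSearchOption("max_length", 3072);
generatorParams.SetInputs(inputs);

using Generator generator = new Generator(model, generatorParams);
while (!generator.IsDone())
{
    generator.ComputeLogits();
    generator.GenerateNextToken();
    Console.Write(tokenizerStream.Decode(generator.GetSequence(0)[^1]));
}
```
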

According to your error, the vision component cannot be found. Can you check your modelPath folder to see if you have any subfolders named vision_init_export, vision_after_export, or vision_after_opt? It's possible that something failed during the export --> optimize --> quantize process for creating the vision component. If the process failed at any point, then the latest vision component is temporarily saved in one of those subfolders before it is finally saved in modelPath. You may need to delete the modelPath folder and then re-run the build.py file with the latest ONNX Runtime version installed so that the process does not fail.
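In case it helps with the diagnosis, a quick way to see which ONNX components actually landed in `modelPath` and whether any of those intermediate folders were left behind (plain .NET file-system calls, nothing specific to onnxruntime-genai; the path is a placeholder):

```csharp
using System;
using System.IO;

// Placeholder path to the converted model folder.
string modelPath = @"C:\models\microsoft\Phi-3.5-vision-instruct-onnx";

// List whichever ONNX components the build actually produced.
foreach (string file in Directory.EnumerateFiles(modelPath, "*.onnx"))
    Console.WriteLine($"found component: {Path.GetFileName(file)}");

// Leftover intermediate folders suggest the vision export/optimize/quantize step failed partway.
string[] intermediates = { "vision_init_export", "vision_after_export", "vision_after_opt" };
foreach (string dir in intermediates)
    if (Directory.Exists(Path.Combine(modelPath, dir)))
        Console.WriteLine($"incomplete build detected: {dir}");
```
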

Please note that re-designed ONNX models for Phi-3 vision and Phi-3.5 vision will be published to enable multi-image support.