microsoft / onnxruntime-genai

Generative AI extensions for onnxruntime
MIT License

Help needed to export Phi3v in ONNX #685

Status: Closed (ladanisavan closed this issue 3 months ago)

ladanisavan commented 3 months ago

Hi there,

I'm seeking guidance on exporting a custom fine-tuned Phi-3 Vision model to ONNX. I've followed the ONNX build model guide from this link.

The build command I used was: python3 -m onnxruntime_genai.models.builder -i ep_2_grad_32_lr_3e-5/ -o onnx_output/ -p int4 -e cuda --extra_options int4_block_size=32 int4_accuracy_level=4

The build process was successful and generated the following files:

However, the number of generated files doesn't match the file count in the official Hugging Face ONNX repo, microsoft/Phi-3-vision-128k-instruct-onnx-cuda.

Files highlighted in red below are missing:

[Image: Screenshot 2024-07-05 at 5 56 00 PM]
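
Since the screenshot is not readable as text, the same comparison can be reproduced locally. The following is a minimal sketch, assuming a copy of the official repo has been downloaded to a local folder. The function name `missing_files` and both directory arguments are hypothetical and do not come from the thread.

```python
from pathlib import Path


def missing_files(built_dir, reference_dir):
    """Return names of files found in reference_dir but not in built_dir.

    Only the top level of each directory is compared, and only by file name.
    """
    built = {p.name for p in Path(built_dir).iterdir() if p.is_file()}
    reference = {p.name for p in Path(reference_dir).iterdir() if p.is_file()}
    return sorted(reference - built)
```

For example, `missing_files("onnx_output", "path/to/downloaded/official/model")` would list the files that the builder output lacks, where the second path is a placeholder for wherever the official files were saved.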

Additionally, while loading the model using ONNX Runtime, the following error occurs: OrtException: Load model from onnx_output failed: Protobuf parsing failed.

I have also noticed that the "embedding" and "vision" sections are missing from genai_config.json.

Can someone help me identify if I'm missing anything? Thanks
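
The missing-section check described above can be scripted. This is a rough sketch, assuming the "embedding" and "vision" sections sit under a top-level "model" object in genai_config.json. That layout is an assumption, and the helper name `missing_sections` is hypothetical.

```python
import json
from pathlib import Path


def missing_sections(model_dir, required=("embedding", "vision")):
    """Return the required section names absent from genai_config.json."""
    config_path = Path(model_dir) / "genai_config.json"
    config = json.loads(config_path.read_text(encoding="utf-8"))
    # Assumption: the sections live under the top-level "model" object.
    model = config.get("model", {})
    return [name for name in required if name not in model]
```

Calling `missing_sections("onnx_output")` on the builder output from this post would be expected to report both sections, which matches the observation above.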

kunal-vaishnavi commented 3 months ago

The Phi-3 vision ONNX models are created as follows.

I can open-source the scripts used to create these ONNX models and run them with ONNX Runtime GenAI.

2U1 commented 3 months ago

@kunal-vaishnavi If you open-source it, I would really appreciate it!

ladanisavan commented 3 months ago

@kunal-vaishnavi open-source scripts would be really helpful to the Phi community.

kunal-vaishnavi commented 3 months ago

I have uploaded the necessary files in each of the Hugging Face repos and created this PR to show how to use them.

tgalery commented 1 month ago

Quick question: would the same guide work for the Phi-3.5 vision model family?

kunal-vaishnavi commented 1 month ago

Yes, but the num_crops value in processor_config.json here needs to be set to 4 for Phi-3.5 vision, as the value has changed.
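
The num_crops change can be made by hand or with a small script. This is a minimal sketch that does not assume where num_crops sits inside processor_config.json: it walks the whole JSON structure and updates every existing num_crops key. The helper names `set_key_everywhere` and `set_num_crops` are hypothetical.

```python
import json
from pathlib import Path


def set_key_everywhere(node, key, value):
    """Set key to value wherever it already exists; return the update count."""
    changed = 0
    if isinstance(node, dict):
        for k in node:
            if k == key:
                node[k] = value
                changed += 1
            else:
                changed += set_key_everywhere(node[k], key, value)
    elif isinstance(node, list):
        for item in node:
            changed += set_key_everywhere(item, key, value)
    return changed


def set_num_crops(config_path, num_crops=4):
    """Rewrite the config file with num_crops updated; fail if it is absent."""
    path = Path(config_path)
    config = json.loads(path.read_text(encoding="utf-8"))
    changed = set_key_everywhere(config, "num_crops", num_crops)
    if changed == 0:
        raise KeyError(f"num_crops not found in {path}")
    path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return changed
```

For example, `set_num_crops("onnx_output/processor_config.json")` would set the value to 4, where the path is a placeholder for the model folder in use. Keeping a backup of the original file first is a reasonable precaution.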

Please note that this guide only works for a single image, however. We have re-designed the ONNX models so that there is multi-image support for both Phi-3 vision and Phi-3.5 vision. As mentioned here, the new ONNX models are undergoing Microsoft's Responsible AI evaluations before they can be published officially.

The changes needed within ONNX Runtime GenAI for the new ONNX models have already been merged in this PR. A revised guide as well as a new ONNX Runtime GenAI stable release will be published together to support this work.