Help needed to export Phi3v in ONNX

ladanisavan commented 3 months ago

Hi there,

I'm seeking guidance on exporting a custom fine-tuned Phi-3 Vision model to ONNX. I've followed the ONNX build model guide from this link.

The build command I used was:

python3 -m onnxruntime_genai.models.builder -i ep_2_grad_32_lr_3e-5/ -o onnx_output/ -p int4 -e cuda --extra_options int4_block_size=32 int4_accuracy_level=4

The build process was successful and generated the following files:

genai_config.json
model.onnx
model.onnx.data
special_tokens_map.json
tokenizer.json
tokenizer_config.json

However, the generated files don't match the files in the official ONNX repo on Hugging Face, microsoft/Phi-3-vision-128k-instruct-onnx-cuda.

The files highlighted in red in the attached screenshot are missing from my output.

Additionally, while loading the model using ONNX Runtime, the following error occurs: OrtException: Load model from onnx_output failed: Protobuf parsing failed.

I have also noticed that the "embedding" and "vision" sections are missing from genai_config.json.

Can someone help me identify if I'm missing anything? Thanks
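
A quick way to see what an export is missing is to compare the output folder against the pieces a Phi-3 vision ONNX package is described as containing. The following is only a minimal sketch: the file names are the ones quoted in the maintainer's reply in this thread, and the assumption that the "embedding" and "vision" sections sit either at the top level of genai_config.json or under a "model" key is mine and may not match every release. The helper name check_export is hypothetical.

```python
import json
from pathlib import Path

# File names as quoted in the maintainer's reply in this thread; adjust if
# your target repo uses different names.
EXPECTED_FILES = [
    "phi-3-v-128k-instruct-vision.onnx",
    "phi-3-v-128k-instruct-text-embedding.onnx",
    "phi-3-v-128k-instruct-text.onnx",
    "genai_config.json",
    "processor_config.json",
]
EXPECTED_SECTIONS = ["embedding", "vision"]


def check_export(output_dir):
    """Return (missing_files, missing_config_sections) for an export folder."""
    out = Path(output_dir)
    missing_files = [name for name in EXPECTED_FILES if not (out / name).is_file()]

    # If the config cannot be read, every expected section counts as missing.
    missing_sections = list(EXPECTED_SECTIONS)
    config_path = out / "genai_config.json"
    if config_path.is_file():
        config = json.loads(config_path.read_text(encoding="utf-8"))
        # Assumption: sections live at the top level or under "model".
        model = config.get("model", {})
        scopes = [config, model if isinstance(model, dict) else {}]
        missing_sections = [
            section
            for section in EXPECTED_SECTIONS
            if not any(section in scope for scope in scopes)
        ]
    return missing_files, missing_sections
```

Calling check_export("onnx_output/") on the folder produced by the build command above would then list which files and config sections still have to be created separately.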

kunal-vaishnavi commented 3 months ago

The Phi-3 vision ONNX models are created as follows.

The vision component (phi-3-v-128k-instruct-vision.onnx) is created using torch.onnx.export with some modifications to the original PyTorch source code.
The text embedding component (phi-3-v-128k-instruct-text-embedding.onnx) is created using the ONNX helper APIs.
The text component (phi-3-v-128k-instruct-text.onnx) is created using the model builder with --extra_options exclude_embeds=true enabled. The model builder prints a warning that only the text component is created. https://github.com/microsoft/onnxruntime-genai/blob/00ceb80e1984c408459dfabe92a5b4eb97318578/src/python/py/models/builder.py#L2387-L2388
The genai_config.json and processor_config.json are created manually.
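
For the text component, the model builder invocation can be assembled roughly as follows. This is a sketch rather than the script actually used for the published models: it only reuses the flags that appear in this thread (-i, -o, -p, -e, --extra_options, exclude_embeds=true and the int4 options), and the output folder name onnx_text_output/ and the helper name builder_command are hypothetical. Whether the int4 options should be combined with exclude_embeds=true for a given model is an assumption to verify.

```python
import shlex


def builder_command(input_dir, output_dir, precision="int4",
                    execution_provider="cuda", extra_options=None):
    """Build the argument list for the ONNX Runtime GenAI model builder."""
    # exclude_embeds=true asks the builder to create only the text component.
    options = {"exclude_embeds": "true"}
    options.update(extra_options or {})
    cmd = [
        "python3", "-m", "onnxruntime_genai.models.builder",
        "-i", input_dir,
        "-o", output_dir,
        "-p", precision,
        "-e", execution_provider,
        "--extra_options",
    ]
    # Each extra option is passed as a separate key=value token.
    cmd += [f"{key}={value}" for key, value in options.items()]
    return cmd


if __name__ == "__main__":
    cmd = builder_command(
        "ep_2_grad_32_lr_3e-5/",
        "onnx_text_output/",  # hypothetical output folder
        extra_options={"int4_block_size": 32, "int4_accuracy_level": 4},
    )
    # Print the command for review; run it with subprocess.run(cmd, check=True).
    print(shlex.join(cmd))
```

The vision and text embedding components still have to be exported separately, as described in the list above.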

I can open-source the scripts used to create these ONNX models and run them with ONNX Runtime GenAI.

2U1 commented 3 months ago

@kunal-vaishnavi If you open-source it, I would really appreciate it!

ladanisavan commented 3 months ago

@kunal-vaishnavi Open-sourcing the scripts would be really helpful to the Phi community.

kunal-vaishnavi commented 3 months ago

I have uploaded the necessary files in each of the Hugging Face repos and created this PR to show how to use them.

tgalery commented 1 month ago

Quick question: would the same guide work for the Phi-3.5 vision model family?

kunal-vaishnavi commented 1 month ago

Yes, but the num_crops value in processor_config.json here needs to be set to 4 for Phi-3.5 vision as the value has changed.
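
The num_crops change can be scripted instead of edited by hand. The sketch below assumes nothing about the layout of processor_config.json beyond it being JSON with a num_crops key somewhere inside, so it walks the whole document and updates every occurrence. The helper names set_num_crops and update_processor_config are hypothetical.

```python
import json
from pathlib import Path


def set_num_crops(node, value):
    """Recursively set every "num_crops" key in place; return how many were set."""
    changed = 0
    if isinstance(node, dict):
        for key, child in list(node.items()):
            if key == "num_crops":
                node[key] = value
                changed += 1
            else:
                changed += set_num_crops(child, value)
    elif isinstance(node, list):
        for child in node:
            changed += set_num_crops(child, value)
    return changed


def update_processor_config(path, num_crops=4):
    """Rewrite processor_config.json with the given num_crops value."""
    path = Path(path)
    config = json.loads(path.read_text(encoding="utf-8"))
    changed = set_num_crops(config, num_crops)
    if changed == 0:
        # Fail loudly rather than silently writing an unchanged file.
        raise KeyError(f"no num_crops key found in {path}")
    path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return changed
```

For Phi-3.5 vision this would be called as update_processor_config("processor_config.json", 4) on the file in the exported model folder.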

Please note that this guide only works for a single image, however. We have re-designed the ONNX models so that there is multi-image support for both Phi-3 vision and Phi-3.5 vision. As mentioned here, the new ONNX models are undergoing Microsoft's Responsible AI evaluations before they can be published officially.

The changes needed within ONNX Runtime GenAI for the new ONNX models have already been merged in this PR. A revised guide as well as a new ONNX Runtime GenAI stable release will be published together to support this work.

microsoft / onnxruntime-genai

Help needed to export Phi3v in ONNX #685