I'm seeking guidance on exporting a custom fine-tuned Phi-3 Vision model to ONNX. I've followed the ONNX build model guide from this link.
The build command I used was:
python3 -m onnxruntime_genai.models.builder -i ep_2_grad_32_lr_3e-5/ -o onnx_output/ -p int4 -e cuda --extra_options int4_block_size=32 int4_accuracy_level=4
The build process was successful and generated the following files:
genai_config.json
model.onnx
model.onnx.data
special_tokens_map.json
tokenizer.json
tokenizer_config.json
However, the generated files don't match the set of files in the official Hugging Face ONNX repo, microsoft/Phi-3-vision-128k-instruct-onnx-cuda.
Files highlighted in red below are missing:
Additionally, while loading the model using ONNX Runtime, the following error occurs:
OrtException: Load model from onnx_output failed: Protobuf parsing failed.
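One thing that could be ruled out here is the path being loaded: the message names the output directory (onnx_output) rather than a file inside it. The sketch below is only an assumption about a possible cause, not a confirmed diagnosis. It uses a hypothetical helper, resolve_model_file, to turn a directory into the path of the model.onnx file listed above, so that a file path rather than a folder can be handed to the loader.

```python
from pathlib import Path


def resolve_model_file(path, model_name="model.onnx"):
    """Return the path of an ONNX model file.

    If `path` is a directory, look for `model_name` inside it.
    `resolve_model_file` is a hypothetical helper for illustration only;
    the default name matches the model.onnx file produced by the build.
    """
    p = Path(path)
    if p.is_dir():
        candidate = p / model_name
        if not candidate.is_file():
            raise FileNotFoundError(f"{model_name} not found in directory {p}")
        return candidate
    if not p.is_file():
        raise FileNotFoundError(f"No such file: {p}")
    return p


# Example use (not executed here; assumes onnxruntime is installed):
#   import onnxruntime as ort
#   model_file = resolve_model_file("onnx_output")
#   session = ort.InferenceSession(str(model_file),
#                                  providers=["CUDAExecutionProvider"])
```

If the error persists with an explicit file path, the path can be excluded as the cause and the exported files themselves become the more likely suspect.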
I have also noticed that the "embedding" and "vision" sections are missing from genai_config.json.
Can someone help me identify whether I'm missing a step? Thanks.
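A minimal sketch of how the missing sections can be checked programmatically. It makes no assumption about where in genai_config.json the "embedding" and "vision" keys are expected to live and simply searches the whole JSON tree for them. The function names find_sections and missing_sections are hypothetical, chosen for illustration.

```python
import json
from pathlib import Path


def find_sections(config, names):
    """Report, for each name, whether it appears as a key anywhere in `config`."""
    found = {name: False for name in names}

    def walk(node):
        # Recurse through nested dicts and lists, marking any matching key.
        if isinstance(node, dict):
            for key, value in node.items():
                if key in found:
                    found[key] = True
                walk(value)
        elif isinstance(node, list):
            for item in node:
                walk(item)

    walk(config)
    return found


def missing_sections(config_path, names=("embedding", "vision")):
    """Return the names that do not appear as keys in the JSON file."""
    with open(config_path, encoding="utf-8") as f:
        config = json.load(f)
    found = find_sections(config, names)
    return [name for name in names if not found[name]]


if __name__ == "__main__":
    # Path taken from the build command above; adjust as needed.
    config_file = Path("onnx_output") / "genai_config.json"
    if config_file.is_file():
        print("Missing sections:", missing_sections(config_file))
```

Running the same check against the genai_config.json from the official repo would show which sections that export contains and this one lacks.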
To reproduce
follow the steps provided above
Urgency
No response
Platform
Linux
OS Version
Ubuntu 24.04
ONNX Runtime Installation
Released Package
ONNX Runtime Version or Commit ID
0.3.0
ONNX Runtime API
Python
Architecture
X86
Execution Provider
CUDA
Execution Provider Library Version
CUDA 12.1