microsoft / Phi-3CookBook

This is a Phi-3 book for getting started with Phi-3. Phi-3, a family of open AI models developed by Microsoft. Phi-3 models are the most capable and cost-effective small language models (SLMs) available, outperforming models of the same size and next size up across a variety of language, reasoning, coding, and math benchmarks.
MIT License
2.26k stars 226 forks source link

Help needed to export in ONNX #83

Closed ladanisavan closed 2 months ago

ladanisavan commented 2 months ago

This issue is for a: (mark with an x)

- [ ] bug report -> please search issues before submitting
- [ ] feature request
- [ x] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Hi there,

I'm seeking guidance on exporting a custom fine-tuned Phi-3 Vision model to ONNX. I've followed the ONNX build model guide from this link.

The build command I used was: python3 -m onnxruntime_genai.models.builder -i ep_2_grad_32_lr_3e-5/ -o onnx_output/ -p int4 -e cuda --extra_options int4_block_size=32 int4_accuracy_level=4

The build process was successful and generated the following files:

However, the number of files generated doesn't match the file count in the official HF repo for ONNX microsoft/Phi-3-vision-128k-instruct-onnx-cuda

Files highlighted in red below are missing:

Screenshot 2024-07-05 at 5 56 00 PM

Additionally, while loading the model using ONNX Runtime, the following error occurs: OrtException: Load model from onnx_output failed: Protobuf parsing failed.

I have also noticed that sections for "embedding" and "vision" are missing from the genai_config.json

Can someone help me identify if I'm missing anything? Thanks

leestott commented 2 months ago

Hi @ladanisavan please open a issue on the onnxruntime github https://github.com/microsoft/onnxruntime and this issue is related to the instructions provided at https://onnxruntime.ai/docs/genai/howto/build-model.html

leestott commented 2 months ago

Please open a discussion issue on ONNX repo or the actual model repo.