escon1004 closed this issue 1 month ago.
I've encountered the exact same issue. Tested on linux/amd64, CPU.
Just running the tutorial (no conversion, using the pre-converted model) works: https://github.com/microsoft/onnxruntime-genai?tab=readme-ov-file#sample-code-for-phi-3-in-python
But after converting the model in my own environment (https://github.com/microsoft/onnxruntime-genai/blob/main/examples/python/generate-e2e-example.sh), it fails with the same error messages.
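For reference, a minimal sketch of the loading code from that tutorial (the folder path is a placeholder; point it at the pre-converted or self-converted model directory):

```python
import onnxruntime_genai as og

# Placeholder path; substitute the folder holding the model and tokenizer files.
model = og.Model("./cpu-int4-rtn-block-32")
tokenizer = og.Tokenizer(model)  # with my self-converted model, the error is raised here
```

With the pre-converted model both calls succeed; with the self-converted one, the second call fails.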
Hi @escon1004 and @anencore94, there was a change in the transformers code that caused this incompatibility with onnxruntime-genai. This will be resolved in the next release (0.5.0), coming at the end of October. In the meantime, there are two alternative workarounds that you can employ:
@natke Thanks for the reply :). Would you mind telling me the specific change from v4.45.0 (the version in which the above change was introduced)?
In this PR, https://github.com/huggingface/transformers/pull/32535, they upgraded the tokenizer to the latest version, which introduced a new schema for the tokenizer merge ranks.
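A sketch of the kind of schema change involved, assuming the merge ranks in tokenizer.json moved from space-joined strings to explicit pairs (the entries below are illustrative, not taken from a real model):

```python
# Illustrative entries only; real tokenizer.json files contain thousands of merges.
old_schema = {"model": {"merges": ["Ġ t", "h e"]}}            # older: space-joined strings
new_schema = {"model": {"merges": [["Ġ", "t"], ["h", "e"]]}}  # newer: explicit pairs

# A parser written for the old format splits each entry on a space and fails
# (or misparses) when handed a list instead of a string.
for merge in new_schema["model"]["merges"]:
    left, right = merge if isinstance(merge, list) else merge.split(" ", 1)
    print(left, right)
```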
Closing this issue. Please re-open or let us know if you experience any further issues.
Hello everyone,
I'm excited to be using ONNX Runtime GenAI. It's an amazing library for anyone looking to run models on their device. I've been learning how to use ONNX Runtime GenAI by following various tutorials:
https://onnxruntime.ai/docs/genai/tutorials/phi2-python.html
I've tried building two models: Gemma2-2B and Phi3.5-Mini-Instruct.
Both builds complete successfully, and loading the resulting models didn't show any issues.
However, when I tried to load the tokenizer from either model, an error occurred.
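Here is a minimal sketch of what I'm running (the model folder is a placeholder for the builder's output directory):

```python
import onnxruntime_genai as og

model_dir = "models/gemma2-2b"  # placeholder; output folder from the model builder
model = og.Model(model_dir)     # the model itself loads without complaint

try:
    tokenizer = og.Tokenizer(model)  # the failure happens here
except Exception as err:
    print(f"Tokenizer load failed: {err}")
```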
Here are my own config files (genai_config.json):
genai_config.json of gemma2-2b
genai_config.json of phi3.5-mini-instruct
I'm not exactly sure about the difference, but I noticed that the file list for the tokenizer is slightly different from that of the pre-converted model uploaded on Hugging Face.
I also noticed that my own tokenizer files are identical to the original files from Hugging Face: https://huggingface.co/microsoft/Phi-3.5-mini-instruct/tree/main https://huggingface.co/google/gemma-2-2b/tree/main
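To compare the file lists, a small sketch I used (local path and repo id are placeholders; requires the huggingface_hub package):

```python
import os
from huggingface_hub import list_repo_files

local_dir = "models/phi3.5-mini-instruct"    # placeholder: my builder output folder
repo_id = "microsoft/Phi-3.5-mini-instruct"  # the original Hugging Face repo

local_files = set(os.listdir(local_dir))
hub_files = set(list_repo_files(repo_id))

# Show where each tokenizer-related file is present.
for name in sorted(f for f in local_files | hub_files if "token" in f.lower()):
    print(f"{name:35s} local={name in local_files}  hub={name in hub_files}")
```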
Are there any specific requirements before building the ONNX model file? Should I convert the tokenizer format before starting?