microsoft / onnxruntime-inference-examples

Examples for using ONNX Runtime for machine learning inferencing.

Android - App crash when using Phi-3 #444

Open XantaKross opened 4 months ago

XantaKross commented 4 months ago

After compiling the example Android app for Phi-3, I downloaded the model file from Hugging Face separately and placed it in /data/data/.../files. However, the app always crashes when I try to run it: it shows something like "downloading model for app...", then "model exists, skipping download", and then immediately crashes. Using logcat in Android Studio, I've narrowed the error down to this line in the log:

Deserialize tensor model.layers.31.mlp.up_proj.MatMul.weight_Q4 failed.tensorprotoutils.cc:904 GetExtDataFromTensorProto External initializer: model.layers.31.mlp.up_proj.MatMul.weight_Q4 offset: 2628268032 size to read: 12582912 given file_length: 2354700288 are out of bounds or can not be read in full.

Any ideas on how I can fix it?
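
The error indicates that the .onnx graph expects to read external weights at byte offset 2628268032 (plus 12582912 bytes), but the .onnx.data file on the device is only 2354700288 bytes long, i.e. the file on the device is shorter than what the model references and is most likely truncated. A minimal Kotlin sketch, assuming the file name and location described in this thread, to confirm the on-device length before creating a session:

```kotlin
import java.io.File

// Assumed file name (from this thread); pass the app's files directory,
// e.g. context.filesDir, which maps to /data/data/<package>/files.
fun checkExternalDataFile(filesDir: File) {
    val dataFile = File(filesDir, "phi3-mini-4k-instruct-cpu-int4-rtn-block-32-acc-level-4.onnx.data")
    if (!dataFile.exists()) {
        println("External data file is missing: ${dataFile.absolutePath}")
        return
    }
    // Numbers taken from the error message: offset 2628268032 + size 12582912.
    val requiredBytes = 2_628_268_032L + 12_582_912L
    println("On-device length: ${dataFile.length()} bytes (graph needs at least $requiredBytes)")
    if (dataFile.length() < requiredBytes) {
        println("File is truncated; re-download or re-push the .onnx.data file.")
    }
}
```

If the reported length is smaller than the required byte range, the copy on the device is incomplete regardless of how it got there (interrupted download, interrupted adb push, etc.).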

edgchen1 commented 2 months ago

Can you verify that the file phi3-mini-4k-instruct-cpu-int4-rtn-block-32-acc-level-4.onnx.data is valid?

offset: 2628268032 size to read: 12582912 given file_length: 2354700288 are out of bounds or can not be read in full.

This looks suspicious. The file length is smaller than that of the one I just downloaded from https://huggingface.co/microsoft/Phi-3-mini-4k-instruct-onnx/tree/main/cpu_and_mobile/cpu-int4-rtn-block-32-acc-level-4.

Mode                 LastWriteTime         Length Name
----                 -------------         ------ ----
-a----         8/13/2024   2:10 PM     2722861056 phi3-mini-4k-instruct-cpu-int4-rtn-block-32-acc-level-4.onnx.data

Alternatively, it shouldn't be necessary to download the model files separately with the current version of that example app. You might try just letting the app download the files itself.
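
If you do keep a manually downloaded copy, here is a minimal Kotlin sketch that removes a truncated file so the app's own download step can run again on the next launch. The expected length is taken from the directory listing above (2,722,861,056 bytes for the copy downloaded on 2024-08-13) and may change if the model on Hugging Face is updated, so treat it as an illustration rather than a constant to hard-code:

```kotlin
import java.io.File

// Expected length of the .onnx.data file, from the listing earlier in this thread.
const val EXPECTED_DATA_BYTES = 2_722_861_056L

// Hypothetical helper: delete a stale or truncated copy from the app's files
// directory so the example app re-downloads it. Returns true if a file was deleted.
fun deleteIfTruncated(filesDir: File): Boolean {
    val dataFile = File(filesDir, "phi3-mini-4k-instruct-cpu-int4-rtn-block-32-acc-level-4.onnx.data")
    if (dataFile.exists() && dataFile.length() != EXPECTED_DATA_BYTES) {
        println("Length ${dataFile.length()} != expected $EXPECTED_DATA_BYTES; deleting ${dataFile.name}")
        return dataFile.delete()
    }
    return false
}
```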