Closed by YUNQIUGUO 5 months ago
I have the same situation with my iPhone 14 Pro. It takes at least 20 minutes even though my prompt is just "hi".
It took ~5 minutes on an iPad Pro (M1).
Hi, for this to work you need to be using at least an iPhone 14 or a device with an A16 chipset.
Updated the prerequisites to state a minimum of an iPhone 14 with an A16 chipset: https://github.com/Azure-Samples/Phi-3MiniSamples/pull/10
This works best on an A17 iPhone 15 Pro or Pro Max.
Closing this issue. @stleon @scovin1109 @YUNQIUGUO The issue is due to the age of the device: Apple iPhones only provide the necessary AI processor support with A16 or A17 chipsets (iPhone 14 or iPhone 15 Pro models).
FYI, the issue may not be caused solely by not using the latest iOS devices:
Upon further investigation on the ORT side, it looks like we were previously not correctly detecting "HasDotProductInstructions" support in our MLAS platform CPU info (https://github.com/microsoft/onnxruntime/blob/e81c8676e3001c0c148b2d5495f90d048b2c9480/onnxruntime/core/mlas/lib/platform.cpp#L517), which in turn prevented the MlasSQNBitGemm optimization from being used at all.
We are working on an official fix, which will be out soon.
With an updated local build, we are able to achieve results similar to Android on an iPhone 12, at about 9~11 tokens/second; newer devices should see even better performance.
This is our fix branch: Add CPUIDInfo::ArmAppleInit() to detect CPU features on Apple platforms. · microsoft/onnxruntime@e651436 (github.com). Hopefully it will go out soon with a patch ORT release.
Hi!
Thanks for contributing all this work on the Phi-3 model samples!
I tried the iOS sample and followed the instructions here:
https://github.com/Azure-Samples/Phi-3MiniSamples/tree/main/ios
However, in my local testing on an iPhone 12, it seems to take very long to generate a simple sentence (the default prompt question in your app can take more than 10 minutes).
Not sure if this is expected behavior, or whether you see similar results on your end?
Thanks!