microsoft / onnxruntime-genai

Generative AI extensions for onnxruntime
MIT License
157 stars 41 forks

Phi-3 on mobile #316

Open cvb941 opened 3 weeks ago

cvb941 commented 3 weeks ago

Hi, is there a guide on how to use the Phi-3 ONNX models (https://huggingface.co/microsoft/Phi-3-mini-4k-instruct-onnx) on mobile (iOS + Android)?

Do I understand correctly that the raw model needs to be preprocessed first? The guide mentions using model-qa.py, which seems to do the preprocessing behind the scenes via the new Generate() API, but the script only outputs the inference results. Is there a way to also export the preprocessed model, for use with ONNX Runtime on mobile?

natke commented 2 weeks ago

@cvb941 We have an Android sample here that uses the C API temporarily while we work on Java and Objective-C bindings. The PR should be merged very soon: https://github.com/microsoft/onnxruntime-inference-examples/pull/420

natke commented 1 week ago

And to answer your question about preprocessing: yes, the generate() API takes care of preprocessing. You do not need to add code for this in your application.
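For reference, the flow that model-qa.py wraps (tokenization, the generation loop, and detokenization, all handled inside the library) looks roughly like the sketch below. The package name onnxruntime_genai is real, but the exact method names have varied between releases, and the model directory is a placeholder, so treat this as an illustration rather than a definitive implementation.

```python
def generate_reply(model_dir: str, prompt: str, max_length: int = 256) -> str:
    """Rough sketch of the generate() flow, modeled on model-qa.py.

    `model_dir` is a placeholder: it should contain the exported Phi-3
    ONNX model files (e.g. from the Hugging Face repo linked above).
    Method names are approximate and may differ across library versions.
    """
    import onnxruntime_genai as og  # pip install onnxruntime-genai

    model = og.Model(model_dir)        # loads the model + its genai config
    tokenizer = og.Tokenizer(model)    # preprocessing is handled here
    params = og.GeneratorParams(model)
    params.set_search_options(max_length=max_length)

    generator = og.Generator(model, params)
    generator.append_tokens(tokenizer.encode(prompt))

    # the library runs the decode loop; no manual pre/post-processing needed
    while not generator.is_done():
        generator.generate_next_token()

    return tokenizer.decode(generator.get_sequence(0))
```

The point for mobile is the same: there is no separate "preprocessed model" to export. The ONNX files plus the tokenizer/config shipped alongside them are what the C API (and the upcoming Java and Objective-C bindings) consume directly.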