microsoft / onnxruntime-genai

Generative AI extensions for onnxruntime
MIT License

Does onnxruntime-genai use CPU or GPU for inference on Android? #961

Closed lengjing606 closed 1 month ago

lengjing606 commented 1 month ago

Does onnxruntime-genai use CPU or GPU for inference on Android?

kunal-vaishnavi commented 1 month ago

Currently, ONNX Runtime GenAI uses the CPU for inference on Android. Work is in progress to support GPU on Android, and the next release will support NPU on Android via the QNN execution provider (EP).
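
For readers who want to check which execution provider a given model folder is configured for, here is a minimal sketch. It assumes the model's `genai_config.json` lists providers under `model.decoder.session_options.provider_options` as one-key mappings, and that an empty list means CPU-only. The helper name `configured_providers` and the `"qnn"` key in the sample are hypothetical, chosen for illustration. The sketch only reads the config and does not call the onnxruntime-genai API.

```python
import json


def configured_providers(config: dict) -> list[str]:
    """Return the execution provider names listed in a genai_config-style dict.

    Assumption: providers live under
    model.decoder.session_options.provider_options as one-key mappings.
    """
    session_options = (
        config.get("model", {}).get("decoder", {}).get("session_options", {})
    )
    names = []
    for entry in session_options.get("provider_options", []):
        # Each entry is assumed to look like {"<provider name>": {...options...}}.
        names.extend(entry.keys())
    # Assumption of this sketch: no listed provider means CPU-only inference.
    return names or ["cpu"]


# Sample configs, trimmed to the one field this sketch reads.
cpu_config = json.loads(
    '{"model": {"decoder": {"session_options": {"provider_options": []}}}}'
)
# Hypothetical entry for an NPU-backed provider; the key name is an assumption.
qnn_config = json.loads(
    '{"model": {"decoder": {"session_options": {"provider_options": [{"qnn": {}}]}}}}'
)

print(configured_providers(cpu_config))
print(configured_providers(qnn_config))
```

In practice you would load the real file with `json.load(open(path))` and pass the result to the helper; verify the key layout against the config shipped with your model before relying on it.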

lengjing606 commented 1 month ago

@kunal-vaishnavi Thanks for your reply.