lengjing606: Does onnxruntime-genai use CPU or GPU for inference on Android?
Currently, ONNX Runtime GenAI uses the CPU for inference on Android. Work to support GPU on Android is in progress, and the next release will add NPU support on Android via the QNN EP.
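For context, a minimal sketch of running onnxruntime-genai on Android through its Java bindings (which execute on the CPU EP today, per the answer above) might look like the following. This is illustrative, not from the thread: the class and method names (`SimpleGenAI`, `createGeneratorParams`, `generate`, `setSearchOption`) are based on the `ai.onnxruntime.genai` Java API as I understand it and may differ between releases, and the model folder path is hypothetical.

```java
// Minimal sketch: text generation with the onnxruntime-genai Java bindings.
// With no execution provider configured, the CPU EP is used, matching the
// current behavior on Android described above. API names are based on the
// ai.onnxruntime.genai package and may differ between releases.
import ai.onnxruntime.genai.GenAIException;
import ai.onnxruntime.genai.GeneratorParams;
import ai.onnxruntime.genai.SimpleGenAI;

public class GenAiCpuDemo {
    public static void main(String[] args) throws GenAIException {
        // Folder containing the exported ONNX model and its genai_config.json.
        String modelFolder = "/data/local/tmp/example-model";  // hypothetical path

        SimpleGenAI genAI = new SimpleGenAI(modelFolder);
        GeneratorParams params = genAI.createGeneratorParams("What is ONNX Runtime?");
        params.setSearchOption("max_length", 128);

        // The listener receives each decoded token as it is produced.
        String output = genAI.generate(params, token -> System.out.print(token));
        System.out.println("\nFull output: " + output);
    }
}
```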
@kunal-vaishnavi Thanks for your reply