pkbullock opened 2 months ago
I wonder if this is related to onnxruntime-genai still awaiting QNN support.
This is listed in the docs as supporting Copilot+ PCs, but it doesn't: my NPU activity is 0%. So how do I use this?
I don't see any reference yet to Copilot+ PCs in the AI Toolkit docs, at least not here. Because it relies on onnxruntime-genai, I believe QNN support must land there first before AI Toolkit can take full advantage of it. You might be able to take some advantage of the NPU now, indirectly, by using DirectML with a model like Phi-3-mini-4k-directml-int4-awq-block-128-onnx, which is optimized for that. I have been using DirectML on my non-Copilot+ Qualcomm-based WDK23 to speed up training.
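If anyone wants to try the DirectML route outside the extension, here is a minimal sketch using the onnxruntime-genai Python bindings. This assumes the `onnxruntime-genai-directml` package is installed and the model folder above has already been downloaded; the DirectML execution provider is selected by the `genai_config.json` shipped with the model, not in code. The API has shifted between releases, so treat this as 0.3-era usage and check the current docs:

```python
# Minimal sketch: run a DirectML-optimized Phi-3 ONNX model with onnxruntime-genai.
# Assumes: pip install onnxruntime-genai-directml, and the model directory
# downloaded locally (e.g. the directml variant of microsoft/Phi-3-mini-4k-instruct-onnx).
import onnxruntime_genai as og

# Path to the downloaded model directory; its genai_config.json tells the
# runtime which execution provider (here DirectML) to use.
model = og.Model("Phi-3-mini-4k-directml-int4-awq-block-128-onnx")
tokenizer = og.Tokenizer(model)

# Phi-3 chat template for a single user turn.
prompt = "<|user|>\nWhat hardware are you running on?<|end|>\n<|assistant|>\n"

params = og.GeneratorParams(model)
params.set_search_options(max_length=256)
params.input_ids = tokenizer.encode(prompt)

# generate() returns the full token sequence, prompt included.
output_tokens = model.generate(params)
print(tokenizer.decode(output_tokens[0]))
```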
Hi @sirredbeard - I saw it in the release notes when installing the VS Code extension, which mention support. But I agree, it seems many frameworks are dependent on the QNN runtimes/SDKs being released.
It seems like DirectML models don't show up in the model catalog on my PC, which has a Qualcomm NPU.
Same here - what is the course of action to enable these models to show up on Snapdragon machines?
It would be great to see AI Toolkit leverage the NPU in Copilot+ PCs. Currently it uses the CPU; it's nice and quick on the Snapdragon processors, but it isn't using the AI processor (NPU) when running models.