Open dhl899 opened 1 month ago
Referring to https://github.com/microsoft/onnxruntime-genai/issues/961, will it address the memory aspect of running big models on the NPU? Thanks.
Hi, we're still working on QNN support. The NPU has its own limitations when it comes to memory usage, which depend on the specific device specification. It will likely not fully address the memory issue of running big models.
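For context, a minimal sketch of how a model is currently targeted at the NPU through the QNN execution provider in plain ONNX Runtime (not onnxruntime-genai, whose QNN support is still in progress per the reply above). The model filename and backend library path are placeholders, and whether a given graph fits on the NPU depends on the device:

```python
# Minimal sketch, assuming an ONNX Runtime build that includes the QNN execution provider.
# "model.onnx" and the backend library path below are hypothetical placeholders.
import onnxruntime as ort

session = ort.InferenceSession(
    "model.onnx",
    providers=["QNNExecutionProvider"],
    provider_options=[{
        # Path to the Qualcomm HTP backend; the exact filename depends on the device/SDK.
        "backend_path": "QnnHtp.dll",
    }],
)

# Nodes the QNN EP cannot place on the NPU (e.g. because of device memory limits)
# fall back to other providers such as CPU, so large models may only partially
# benefit from the NPU.
print(session.get_providers())
```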