microsoft / onnxruntime

ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator
https://onnxruntime.ai
MIT License

OpenCL and Mali GPU support left out of all execution providers #20896

Open federicoparra opened 1 month ago

federicoparra commented 1 month ago

I was trying to migrate from MLC-LLM to onnxruntime to run Phi-3 on an Orange Pi 5, but I realized that among ALL your execution providers there isn't a single one that takes advantage of this board's GPU or NPU!

Such a shame. You specifically develop support for ARM ACL and ARMNN, both of which support the Mali GPU, but you only provide support for the CPU (???).

Then you support the RKNN framework... but only for a single model of the chip.

Then you support TVM in preview, but only Relay, not Relax, so it can't be used for LLMs.

Please! The adoption of onnxruntime for LLMs is on the line.

EmmaNingMS commented 3 weeks ago

Hi, the ARM ACL and ARMNN EPs were an external contribution from ARM; we need to check with them about further maintenance and improvements. Regarding NPUs, ORT with the QNN EP supports the Qualcomm NPU and has been verified with Qualcomm AI Hub models. You can compile it from source for mobile usage (https://onnxruntime.ai/docs/execution-providers/QNN-ExecutionProvider.html), and we're seeking customer signals before releasing a prebuilt package. Additionally, we are investigating GPU acceleration in ORT Mobile. Stay tuned. Sharing your scenarios and expectations with ORT would help us plan the work.

alawasoft commented 4 days ago

+1 for GPU / NPU provider