microsoft / onnxruntime

ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator
https://onnxruntime.ai
MIT License

[Feature Request] MPS provider #21271

Open barakugav opened 3 months ago

barakugav commented 3 months ago

Describe the feature request

Currently there is no MPS (Apple Metal Performance Shaders, the GPU framework) execution provider. There is a CoreML provider, but from testing on my MacBook (arm64, M3) it does not use MPS. Is there any plan to support MPS, either as a separate provider or within CoreML? I think there is a lot of value in such support, as a lot of development is done on MacBooks.

Describe scenario use case

skottmckay commented 2 months ago

Did you check which nodes were actually assigned to CoreML? And were you using NeuralNetwork or ML Program?

If not, set the log level to VERBOSE and look for 'Node placements' in the output.
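From Python, the VERBOSE level can be enabled by setting `SessionOptions.log_severity_level = 0` before creating the session. Below is a minimal sketch of tallying node placements from that output; the log excerpt is illustrative only, as the exact format varies across onnxruntime versions:

```python
import re
from collections import Counter

# To produce the real output (not run here):
#   import onnxruntime as ort
#   so = ort.SessionOptions()
#   so.log_severity_level = 0  # 0 = VERBOSE
#   ort.InferenceSession("model.onnx", so, providers=["CoreMLExecutionProvider"])

# Hypothetical excerpt of the 'Node placements' section of the VERBOSE log.
log = """\
Node placements
 Node(s) placed on [CoreMLExecutionProvider]: Conv (12), Relu (12), MaxPool (4)
 Node(s) placed on [CPUExecutionProvider]: Erf (8), Reshape (6), InstanceNormalization (4)
"""

# Count how many nodes each execution provider received.
placements = Counter()
for ep, ops in re.findall(r"\[(\w+)\]: (.+)", log):
    for _name, count in re.findall(r"(\w+) \((\d+)\)", ops):
        placements[ep] += int(count)

print(dict(placements))  # nodes per execution provider
```

A large CPUExecutionProvider count relative to CoreML is the signal that most of the model fell back to CPU.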

Obviously resources aren't infinite and creating a new execution provider is a huge undertaking. CoreML is currently preferred as it offers access to GPU and NPU and we're increasing operator coverage for ML Program (as NeuralNetwork is deprecated).

A general-purpose GPU EP based on Vulkan (with Metal support via MoltenVK) may be feasible as it would run on multiple platforms. But the same condition will apply - the EP needs to support enough operators in the model to be effective.

Rikyf3 commented 2 months ago

@skottmckay could you provide more details on the roadmap for CoreML EP?

Currently, the limited support for operators significantly slows down inference for most modern models. For example, operators like LayerNorm, GroupNorm, GeLU, and non-nearest neighbor resizing are not supported.

Additionally, many GitHub issues related to CoreML remain unanswered. While I understand the constraints on resources, it’s crucial for us developers to know whether ONNX Runtime with CoreML EP is a viable option or if we should consider alternatives. In my specific case, a model that takes one minute to run on DirectML EP takes 50 minutes on CoreML.

skottmckay commented 2 months ago

We typically add operator support based on production use cases. If you have such a use case please provide details and the operators required.

e.g. we recently added ML Program support for quite a few operators, including Resize (although there are caveats with Resize as we're not able to map all the ONNX options to equivalent CoreML parameters, and CoreML seems to have inconsistent implementations of some operators): https://github.com/microsoft/onnxruntime/pulls?q=is%3Apr+coreml+is%3Aclosed

We would most likely limit new operator support to ML Program though as NeuralNetwork is deprecated. ML Program requires iOS 15+ or macOS 12+.
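For reference, a sketch of how the ML Program format might be requested from Python. The `ModelFormat` provider option is what recent onnxruntime releases document for the CoreML EP; older versions used a different flag mechanism, so treat the option name as an assumption to check against your installed version:

```python
# Hypothetical CoreML EP configuration; option names may differ by version.
providers = [
    ("CoreMLExecutionProvider", {"ModelFormat": "MLProgram"}),
    "CPUExecutionProvider",  # fallback for nodes CoreML cannot take
]

# Session creation (requires macOS + onnxruntime, not run here):
#   import onnxruntime as ort
#   sess = ort.InferenceSession("model.onnx", providers=providers)

print(providers[0][1]["ModelFormat"])
```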

There may also be differences in what CoreML operators are available and what options they support vs. ONNX.

GeLU is supported, as is LayerNorm. GroupNorm may map to CoreML batch_norm.

Rikyf3 commented 2 months ago

@skottmckay, thank you for your prompt response.

I’m currently using onnxruntime==1.18.1 on macOS, so the improvements you mentioned may only be available in the upcoming version. I have implemented a U-Net model in PyTorch, and from the verbose log I noticed that GeLU is decomposed into an Erf along with multiplications and additions, but Erf is not supported. Similarly, GroupNorm decomposes into Reshape + InstanceNorm + Reshape, and none of these operators are supported (Reshape fails because of a dynamic-shape issue).
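For context on why Erf appears: exact GELU is defined through erf, so exporters decompose it into Mul/Erf/Add nodes. One possible workaround (my own suggestion, not verified against the CoreML EP) is exporting with `torch.nn.GELU(approximate='tanh')`, whose tanh-based form avoids Erf at a small numeric cost. A quick check of that cost using only the standard library:

```python
import math

def gelu_exact(x):
    # exact GELU: 0.5 * x * (1 + erf(x / sqrt(2))); exporters emit Erf for this
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))

def gelu_tanh(x):
    # tanh approximation (torch.nn.GELU(approximate='tanh')); exports without Erf
    c = math.sqrt(2.0 / math.pi)
    return 0.5 * x * (1.0 + math.tanh(c * (x + 0.044715 * x ** 3)))

# maximum absolute deviation over [-5, 5]
max_err = max(abs(gelu_exact(x) - gelu_tanh(x))
              for x in (i / 100.0 for i in range(-500, 501)))
print(max_err)
```

The deviation is small relative to typical activation magnitudes, which is why the tanh form is a common drop-in when the exact decomposition is not supported by a backend.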

I will attach my U-Net implementation in case you need it: unet.py.zip

divideconcept commented 2 weeks ago

It'd be great if there were another Mac alternative to CoreML, which unfortunately cannot run a lot of models and is still limited to tensor dimensions <= 16384. MPS would be my preferred choice, as it does not have the 16384 limitation, and PyTorch runs all the models I need with its MPS backend. CoreML seems to be a very low priority at Apple; they don't appear to have cared about improving it for years.