dimitry12 opened this issue 1 year ago
Good question! Hugging Face has a library, Optimum, that can do optimizations such as pruning and quantization. It seems to me that these kinds of optimizations that require "model surgery" really belong in Optimum, but I'm not aware of any plans to add these particular optimizations. It's definitely something worth considering, though (and it could be prototyped in Exporters).
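For concreteness, this is roughly what one of those existing Optimum optimizations looks like: dynamic INT8 quantization through the ONNX Runtime backend. Treat it as a sketch; the exact argument names (e.g. `export=True`, `avx512_vnni`) may differ between Optimum versions.

```python
from optimum.onnxruntime import ORTModelForSequenceClassification, ORTQuantizer
from optimum.onnxruntime.configuration import AutoQuantizationConfig

# Export a Transformers checkpoint to ONNX through Optimum.
model = ORTModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased-finetuned-sst-2-english", export=True
)

# Apply dynamic INT8 quantization -- one of the optimizations Optimum
# already ships for its ONNX Runtime backend.
quantizer = ORTQuantizer.from_pretrained(model)
qconfig = AutoQuantizationConfig.avx512_vnni(is_static=False, per_channel=False)
quantizer.quantize(save_dir="distilbert_sst2_int8", quantization_config=qconfig)
```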
> these kinds of optimizations that require "model surgery" really belong in Optimum, but I'm not aware of any plans to add these particular optimizations. It's definitely something worth considering, though (and it could be prototyped in Exporters)
Makes sense! I can't commit to a particular timeline but I definitely plan to work in this direction. If that's ok, I will keep this issue open to post updates.
ane_transformers (https://github.com/apple/ml-ane-transformers and https://machinelearning.apple.com/research/neural-engine-transformers) suggests weight-compatible changes to transformers that allow the ops to map better onto the ANE, resulting in significant performance improvements. @hollance, do you think these optimizations "belong" in 🤗 Exporters? If yes, how do you envision their implementation: within the CoreMLConfig abstraction or somewhere else?
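For context on what those weight-compatible changes are: the core of the ane_transformers recipe is to keep activations in a channels-first (B, C, 1, S) layout and to replace every nn.Linear with an equivalent 1x1 nn.Conv2d, so the existing checkpoint weights can be reused without retraining. A minimal PyTorch sketch of that swap (not the library's own implementation; a real pass would also have to adapt the surrounding attention and layer-norm code to the 4D layout):

```python
import torch
import torch.nn as nn


def linear_to_conv2d(linear: nn.Linear) -> nn.Conv2d:
    """Build a 1x1 Conv2d that is weight-compatible with the given Linear.

    ane_transformers keeps activations in a (B, C, 1, S) channels-first
    layout so the ops map well onto the Neural Engine; the Linear weight of
    shape (out, in) is reused as a Conv2d kernel of shape (out, in, 1, 1).
    """
    conv = nn.Conv2d(
        in_channels=linear.in_features,
        out_channels=linear.out_features,
        kernel_size=1,
        bias=linear.bias is not None,
    )
    with torch.no_grad():
        conv.weight.copy_(
            linear.weight.view(linear.out_features, linear.in_features, 1, 1)
        )
        if linear.bias is not None:
            conv.bias.copy_(linear.bias)
    return conv


def swap_linears(module: nn.Module) -> None:
    """Recursively replace every nn.Linear in a model with its 1x1 Conv2d twin."""
    for name, child in module.named_children():
        if isinstance(child, nn.Linear):
            setattr(module, name, linear_to_conv2d(child))
        else:
            swap_linears(child)
```

A pass like `swap_linears(model)` would run right before the Core ML conversion step, which is what makes me wonder whether CoreMLConfig is the right place to hook it in.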