hollance / YOLO-CoreML-MPSNNGraph

Tiny YOLO for iOS implemented using CoreML but also using the new MPS graph API.
MIT License
929 stars 251 forks

re: Neural Engine vs. GPU #42

Open roozmahdavian opened 5 years ago

roozmahdavian commented 5 years ago

Huge fan of machinethink.net. Sorry to ask you this here, but I wasn't sure of the best way to do so. You recently mentioned that you noticed CoreML inference falls back to the GPU if there's a custom layer in the model; how did you know it was deployed to the Neural Engine in the first place?

Also, do you know if Metal Performance Shaders can access the Neural Engine? I ask because the MPS graph API, which now supports both training and inference in a very TensorFlow-like fashion, seems to be the most low-level, robust abstraction Apple offers for custom ML, and it's quite bizarre that it appears so strictly bound to the GPU. That makes sense in that it lives inside Metal, but you'd really think Apple would offer a computational graph API that sits above BNNS (for the CPU), MPS (for the GPU), and whatever specific instructions the Neural Engine supports, allowing you to express a model, train it, and run it across all three architectures depending on power and speed targets. CoreML now appears to support this (which is why I'm wondering how you knew the model was specifically running on the Neural Engine), but I find it bizarre that if you actually decide to express and train a complex model in MPS, which is the only way to do it at the moment (and what CreateML is built on), you're then stuck, in a sense, with the GPU. I feel like I'm missing something here.

hollance commented 5 years ago

If you run the app and then pause it in the debugger, you'll see that CoreML is using something called ANE, which stands for Apple Neural Engine.

I think the reason MPS supports training is mostly so you can train on the Mac, and that iOS is less of a target for this.

And no, there is no public API for the Neural Engine right now. It wouldn’t be through MPS since that is just for the GPU.
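For what it's worth, while there's no direct Neural Engine API, CoreML does let you constrain which compute units it may pick from via `MLModelConfiguration.computeUnits` (available since iOS 12 / macOS 10.14). A minimal Swift sketch, assuming a hypothetical auto-generated model class named `TinyYOLO` (substitute your own):

```swift
import CoreML

let config = MLModelConfiguration()

// Let CoreML choose freely among CPU, GPU, and Neural Engine:
config.computeUnits = .all

// Or exclude the Neural Engine entirely:
// config.computeUnits = .cpuAndGPU

// Or force CPU-only inference (useful for debugging numeric differences):
// config.computeUnits = .cpuOnly

// `TinyYOLO` is a placeholder for your compiled model's generated class.
let model = try TinyYOLO(configuration: config)
```

Comparing inference latency under `.all` versus `.cpuAndGPU` is one indirect way to tell whether a model was actually being scheduled on the Neural Engine.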

roozmahdavian commented 5 years ago

Ah, I see. I suppose what I don't understand is why Apple chose to create a fully functional computational graph and then restrict that abstraction (by embedding it in MPS) to the GPU. It just feels strange. I use TensorFlow on the CPU often 🤷‍♂️

hollance commented 5 years ago

I think it's more the other way around: they first had MPS kernels for doing convolutions etc. Then they realized they could add a graph API on top. And even later they added a backward pass to this API. But it's nowhere near the same scope as TensorFlow.