This is not a feature request for coremltools. This is a feature request for the Core ML framework.
Please submit your feature request here: https://feedbackassistant.apple.com/
There's no option there for Core ML, only Create ML. I'm also not confident I'd get any replies there, because I never have. It's a really important feature, but I'll just drop it and do it manually rather than keep trying to get updates from Apple.
Thank you.
🌱 Describe your Feature Request
Since deep learning is concerned with [very] deep neural networks, we want to be able to run them on edge devices like the iPhone. Speed is not an issue (for example, inference with a large model of ~350 convolution layers takes only about 1.5 seconds), but memory is (unlike Macs, which can use the SSD as extra memory).
Solution ATM
The OOM crash can be eliminated by breaking up a neural network (layer0 -> layer1 -> layer2 -> ...) into groups of a few consecutive layers and creating a separate Core ML model for each group. Each model is then loaded and inferred from inside its own Swift function, so its scope is limited and its memory is released as soon as the function call finishes. Essentially, we release memory allocations aggressively (and manually).

In the second case, where a layer (closer to the output) in a DAG takes inputs from the outputs of multiple earlier layers (closer to the input), we simply keep those output variables alive (not deallocated) until the dependent layer has consumed them. In short: keep each layer's outputs in scope until every dependent layer in the DAG has used them. A sketch of both cases follows.
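Here's a minimal Swift sketch of this workaround, assuming the network has already been split into compiled sub-models `Part1.mlmodelc`, `Part2.mlmodelc`, and `Part3.mlmodelc` (hypothetical names), where Part3 also consumes Part1's output through a skip connection; the feature names are likewise placeholders.

```swift
import CoreML

/// Loads one sub-model, runs inference, and returns the result.
/// The MLModel instance is local to this function, so its memory is
/// released as soon as the call returns.
func runGroup(_ name: String, input: MLFeatureProvider) throws -> MLFeatureProvider {
    let url = Bundle.main.url(forResource: name, withExtension: "mlmodelc")!
    let model = try MLModel(contentsOf: url)   // loaded in local scope
    return try model.prediction(from: input)   // model deallocated on return
}

func runPipeline(input: MLFeatureProvider) throws -> MLFeatureProvider {
    // Kept alive past the Part2 call because Part3 also depends on it (the DAG case).
    let skip = try runGroup("Part1", input: input)
    let mid  = try runGroup("Part2", input: skip)

    // Merge the two inputs Part3 expects; "mid", "skip", and "output"
    // are hypothetical feature names.
    let merged = try MLDictionaryFeatureProvider(dictionary: [
        "mid":  mid.featureValue(for: "output")!,
        "skip": skip.featureValue(for: "output")!,
    ])
    return try runGroup("Part3", input: merged)
}
```

Because `skip` stays in scope until the Part3 call, its buffer is deallocated only after all of its consumers have run, which is exactly the keep-alive rule described above.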
It'd be cool if the Apple Core ML team could make this the default behaviour behind the scenes.
How can this feature be used?
This feature can be used to run inference with large deep neural networks like StyleGAN variants, transformers, etc.
Describe alternatives you've considered
We simply split the network into groups of layers, as described above.
Additional context
Do you have anything else to say? No.