apple / coremltools

Core ML tools contain supporting tools for Core ML model conversion, editing, and validation.
https://coremltools.readme.io
BSD 3-Clause "New" or "Revised" License

AnimeGAN v2 performance, ML Program conversion and performance tuning #1352

Closed · JacopoMangiavacchi closed this issue 2 years ago

JacopoMangiavacchi commented 2 years ago

I was able to convert the AnimeGANv2 model to Core ML and, targeting NeuralNetwork, I even get good performance on M1 hosts.

Since I still have some performance issues using this Core ML model for live video on A-series devices, I tried converting the model to the new ML Program format (target.iOS15), testing both with and without the FP16ComputePrecision conversion.
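For reference, a minimal sketch of the two conversion paths with coremltools; the `source_model` object, the input name "input", and the 512x512 shape are placeholder assumptions, not the exact values from the original conversion:

```python
import coremltools as ct

# Placeholder for the traced PyTorch module or TF model used as the source.
source_model = ...

# NeuralNetwork backend:
nn_model = ct.convert(
    source_model,
    inputs=[ct.ImageType(name="input", shape=(1, 3, 512, 512))],
    convert_to="neuralnetwork",
)

# ML Program backend (iOS 15+), with or without reduced compute precision:
prog_model = ct.convert(
    source_model,
    inputs=[ct.ImageType(name="input", shape=(1, 3, 512, 512))],
    convert_to="mlprogram",
    minimum_deployment_target=ct.target.iOS15,
    compute_precision=ct.precision.FLOAT16,  # or ct.precision.FLOAT32
)
```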

Unfortunately, I found that with ML Program I get even worse performance, including on M1.

I wonder if you have any suggestions on how to instrument and tune the performance of a Core ML model in general. My goal is to understand bottlenecks in the model's layers/ops and eventually see how to change and optimize the model architecture for better Core ML performance.
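For reference, one coarse way to compare compute units from Python is to time predictions directly; a rough sketch, where the "AnimeGANv2.mlpackage" path and the input name "input" are placeholder assumptions:

```python
import time

import numpy as np
from PIL import Image
import coremltools as ct

# Rough prediction-latency comparison (runs on macOS). The model path
# and the input name are placeholders for the converted AnimeGANv2 model.
img = Image.fromarray(np.random.randint(0, 256, (512, 512, 3), dtype=np.uint8))

for units in (ct.ComputeUnit.CPU_ONLY, ct.ComputeUnit.ALL):
    model = ct.models.MLModel("AnimeGANv2.mlpackage", compute_units=units)
    model.predict({"input": img})  # warm-up
    start = time.perf_counter()
    for _ in range(10):
        model.predict({"input": img})
    print(units, f"{(time.perf_counter() - start) / 10:.3f} s per prediction")
```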

Thank you very much for any suggestions.

anilkatti commented 2 years ago

Hello, since this seems to be related to the Core ML framework, could you please file a bug report at http://feedbackassistant.apple.com? Thanks!

JacopoMangiavacchi commented 2 years ago

@anilkatti The feedback ID is FB9775592.

chinsyo commented 2 years ago

@JacopoMangiavacchi I also tried converting AnimeGANv2 to a Core ML model, using the NeuralNetwork model type and modifying the input and output to ImageType.
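For the output-to-ImageType step, one common approach with NeuralNetwork models is to edit the spec directly; a rough sketch, where the model path, the output index, and the 512x512 size are assumptions:

```python
import coremltools as ct
from coremltools.proto import FeatureTypes_pb2 as ft

# Sketch: mark the first output of a NeuralNetwork model as an RGB image.
# The model path and the 512x512 size are assumptions; the network must
# already produce a 3-channel output in the 0-255 range for this to
# behave as expected.
model = ct.models.MLModel("AnimeGANv2.mlmodel")
spec = model.get_spec()

output = spec.description.output[0]
output.type.imageType.colorSpace = ft.ImageFeatureType.RGB
output.type.imageType.width = 512
output.type.imageType.height = 512

ct.models.MLModel(spec).save("AnimeGANv2_image_output.mlmodel")
```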

In my case (Xcode 13.1, deployment target 14.0, iOS 15.1, iPhone 11 Pro Max), the numbers I measured for the AnimeGANv2 MLModel (Model Type: NeuralNetwork, Compute: FP16, Storage: FP32) are as follows:

  1. With MLComputeUnits.cpuOnly, loading the model takes about 0.1 seconds, memory usage increases by about 300 MB, and running a 512x512 test image takes about 4 seconds.
  2. With MLComputeUnits.all, loading the model takes about 4 seconds, memory usage increases by about 800 MB, and running a 512x512 test image takes about 4 seconds.

Do my memory usage and load-time numbers meet expectations? I would be very grateful for your guidance.
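For reference, a rough sketch of checking the load-time part of this from Python; ct.ComputeUnit.CPU_ONLY and .ALL correspond to MLComputeUnits.cpuOnly and .all, and the model path is a placeholder:

```python
import time

import coremltools as ct

# Compare model load time under the two compute-unit settings above.
# The model path is a placeholder.
for units in (ct.ComputeUnit.CPU_ONLY, ct.ComputeUnit.ALL):
    start = time.perf_counter()
    _ = ct.models.MLModel("AnimeGANv2.mlmodel", compute_units=units)
    print(units, f"loaded in {time.perf_counter() - start:.2f} s")
```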

JacopoMangiavacchi commented 2 years ago

@chinsyo Yes, those are the overall numbers I see on an iPhone 11. I'm looking for more detailed Core ML profiling info in order to understand layer bottlenecks and eventually distill a simplified network architecture.
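One cheap first step (not a real profiler) is to inventory the layer types in the NeuralNetwork spec to see which ops dominate the architecture; a sketch, with the model path as a placeholder:

```python
from collections import Counter

import coremltools as ct

# Count layer types in a NeuralNetwork spec to see which ops dominate.
# This is only an architectural inventory, not a timing profile.
# The model path is a placeholder.
spec = ct.models.MLModel("AnimeGANv2.mlmodel").get_spec()
layer_kinds = Counter(layer.WhichOneof("layer") for layer in spec.neuralNetwork.layers)
print(layer_kinds.most_common())
```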

chinsyo commented 2 years ago

@JacopoMangiavacchi Thank you very much! Is there any way for me to stay informed about the progress of your work on this issue?

TobyRoseman commented 2 years ago

I'm going to close this GitHub issue, since it's not related to the coremltools python package and we already have an internal issue tracking it.