huggingface / swift-coreml-diffusers

Swift app demonstrating Core ML Stable Diffusion
Apache License 2.0
2.52k stars 211 forks

Some models show roughly a 20x performance degradation on iPhone 14 Pro Max compared to an M1 Pro MacBook, while standard Stable Diffusion 2.1 shows only ~1.5x degradation on iPhone #27

Open jo32 opened 1 year ago

jo32 commented 1 year ago

I have tested the following models on my iPhone 14 Pro Max:

1: coreml-stable-diffusion-2-1-base https://huggingface.co/pcuenq/coreml-stable-diffusion-2-1-base https://huggingface.co/pcuenq/coreml-stable-diffusion-2-1-base/blob/main/coreml-stable-diffusion-2-1-base_split_einsum_compiled.zip

took ~15s on M1 Pro MacBook, ~20s on iPhone 14 Pro Max

2: coreml-8528-diffusion https://huggingface.co/coreml/coreml-8528-diffusion https://huggingface.co/coreml/coreml-8528-diffusion/blob/main/split_einsum/8528-diffusion_split-einsum_compiled.zip

took ~15s on M1 Pro MacBook, ~5min on iPhone 14 Pro Max

and the memory usage is a lot higher than with the first model.

here is my configuration:


```swift
#if targetEnvironment(macCatalyst)
let runningOnMac = true
#else
let runningOnMac = false
#endif

let configuration = MLModelConfiguration()
configuration.computeUnits = runningOnMac ? .cpuAndGPU : .cpuAndNeuralEngine
let pipeline = try StableDiffusionPipeline(resourcesAt: url,
                                           configuration: configuration,
                                           disableSafety: false,
                                           reduceMemory: !runningOnMac)

var config = StableDiffusionPipeline.Configuration(
    prompt: "string1"
)
config.negativePrompt = "string2"
config.stepCount = numInferenceSteps // 15
config.seed = UInt32(seed) // 32
config.guidanceScale = Float(guidanceScale) // 7.5
config.disableSafety = disableSafety // true
config.schedulerType = .dpmSolverMultistepScheduler
```
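One small caveat about the configuration above, unrelated to the slowdown: `UInt32(seed)` traps at runtime if `seed` is negative or larger than `UInt32.max`. A minimal sketch of a safer conversion (the `clampedSeed` helper is my own suggestion, not code from this repo):

```swift
// Hypothetical helper (not from the repo): convert an Int seed to UInt32
// without risking a runtime trap. UInt32(clamping:) pins out-of-range
// values to the nearest representable bound instead of crashing.
func clampedSeed(_ seed: Int) -> UInt32 {
    UInt32(clamping: seed)
}
```

With this, `config.seed = clampedSeed(seed)` behaves identically for in-range seeds and degrades gracefully otherwise.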
pcuenca commented 1 year ago

Hi @jo32, sorry for the delay in getting back to you. Are those results consistent across runs? I've seen some erratic behaviour on iPhone myself – sometimes it's very fast and sometimes it's slow, with no changes in model or configuration whatsoever. When generation is slow, the phone heats up a lot. I believe in those cases the ANE is not engaged and all the work is done on CPU. It could be because of additional system load, memory pressure or other reasons.

This is my go-to list when that happens:

  • Detach from Xcode. I think I've observed slowness more frequently when running the app from Xcode, but I'm not positive that could be a factor.
  • Wait a few minutes for the device to come back to normal temperatures.

I'll take a look at that particular model you mention to see if I can replicate the issue.
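Since the checklist above points at heat as a likely factor, one way to act on it programmatically is to consult the system's reported thermal state before kicking off a run. This is my own sketch, not code from this repo, and the deferral policy is an assumption:

```swift
import Foundation

// Sketch (assumption, not repo code): defer a generation run when the
// device reports thermal pressure, since the ANE appears to be bypassed
// when the phone runs hot and everything falls back to the CPU.
func shouldDeferGeneration(at state: ProcessInfo.ThermalState) -> Bool {
    switch state {
    case .serious, .critical:
        return true // device is throttling; wait for it to cool down
    default:
        return false // .nominal / .fair: proceed
    }
}
```

In an app you would call it as `shouldDeferGeneration(at: ProcessInfo.processInfo.thermalState)`, and possibly observe `ProcessInfo.thermalStateDidChangeNotification` to resume once the device cools.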

jo32 commented 1 year ago

> Hi @jo32, sorry for the delay in getting back to you. Are those results consistent across runs? I've seen some erratic behaviour on iPhone myself – sometimes it's very fast and sometimes it's slow, with no changes in model or configuration whatsoever. When generation is slow, the phone heats up a lot. I believe in those cases the ANE is not engaged and all the work is done on CPU. It could be because of additional system load, memory pressure or other reasons.
>
> This is my go-to list when that happens:
>
>   • Detach from Xcode. I think I've observed slowness more frequently when running the app from Xcode, but I'm not positive that could be a factor.
>   • Wait a few minutes for the device to come back to normal temperatures.
>
> I'll take a look at that particular model you mention to see if I can replicate the issue.

@pcuenca

Thank you for your reply. Yes, the problem persists; in fact, this model consistently crashes on my iPhone 14 Pro Max no matter whether I detach from Xcode or let the phone cool down: https://huggingface.co/coreml/coreml-8528-diffusion/blob/main/split_einsum/8528-diffusion_split-einsum_compiled.zip

icer1 commented 1 year ago

https://liuliu.me/eyes/stretch-iphone-to-its-limit-a-2gib-model-that-can-draw-everything-in-your-pocket/

@liuliu

Liu Liu appears to have already solved this problem; Draw Things does not have this issue.

btsend commented 1 year ago


@jo32 Could you share your iOS app project? How are you compiling the app with this configuration and running it? This repo has no released iOS app and no documentation on how to build and run it on an iPhone.