jo32 opened this issue 1 year ago
Hi @jo32, sorry for the delay in getting back to you. Are those results consistent across runs? I've seen some erratic behaviour on iPhone myself – sometimes it's very fast and sometimes it's slow, with no changes in model or configuration whatsoever. When generation is slow, the phone heats up a lot. I believe in those cases the ANE is not engaged and all the work is done on CPU. It could be because of additional system load, memory pressure or other reasons.
This is my go-to list when that happens:
- Detach from Xcode. I think I've observed slowness more frequently when running the app from Xcode, but I'm not positive that could be a factor.
- Wait a few minutes for the device to come back to normal temperatures.
I'll take a look at that particular model you mention to see if I can replicate the issue.
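As a side note, one way to tell whether the slowness is consistent across runs (rather than thermal or load related) is to time several back-to-back generations. The helper below is a minimal sketch, not part of this repo: the `generate` closure is a hypothetical stand-in for one full pipeline run.

```swift
import Foundation

/// Times `count` consecutive runs of `generate` and returns the wall-clock
/// duration of each run, so erratic ANE/CPU behaviour shows up as outliers.
func timeRuns(count: Int, generate: () throws -> Void) rethrows -> [TimeInterval] {
    var timings: [TimeInterval] = []
    for _ in 0..<count {
        let start = DispatchTime.now()
        try generate()
        let end = DispatchTime.now()
        // Convert nanoseconds of uptime to seconds.
        timings.append(Double(end.uptimeNanoseconds - start.uptimeNanoseconds) / 1_000_000_000)
    }
    return timings
}
```

If only the first run is slow, model loading/compilation is the likely cause; if timings drift upward run after run, thermal throttling is the more plausible culprit.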
@pcuenca
Thank you for your reply, and yes, this problem persists. This model consistently crashes on my iPhone 14 Pro Max no matter whether I detach the iPhone from Xcode or cool it down: https://huggingface.co/coreml/coreml-8528-diffusion/blob/main/split_einsum/8528-diffusion_split-einsum_compiled.zip
@liuliu
Liu Liu should have already solved this problem; his app "Draw Things" does not have this issue.
I have tested the following models on my iPhone 14 Pro Max:
1. coreml-stable-diffusion-2-1-base: https://huggingface.co/pcuenq/coreml-stable-diffusion-2-1-base (compiled split-einsum: https://huggingface.co/pcuenq/coreml-stable-diffusion-2-1-base/blob/main/coreml-stable-diffusion-2-1-base_split_einsum_compiled.zip)
took ~15s on MacBook M1 Pro, ~20s on iPhone 14 Pro Max
2. coreml-8528-diffusion: https://huggingface.co/coreml/coreml-8528-diffusion (compiled split-einsum: https://huggingface.co/coreml/coreml-8528-diffusion/blob/main/split_einsum/8528-diffusion_split-einsum_compiled.zip)
took ~15s on MacBook M1 Pro, ~5min on iPhone 14 Pro Max, and memory usage is a lot higher than with the first model.
here is my configuration:
```swift
#if targetEnvironment(macCatalyst)
let runningOnMac = true
#else
let runningOnMac = false
#endif

let configuration = MLModelConfiguration()
configuration.computeUnits = runningOnMac ? .cpuAndGPU : .cpuAndNeuralEngine

let pipeline = try StableDiffusionPipeline(resourcesAt: url,
                                           configuration: configuration,
                                           disableSafety: false,
                                           reduceMemory: !runningOnMac)

var config = StableDiffusionPipeline.Configuration(
    prompt: "string1"
)
config.negativePrompt = "string2"
config.stepCount = numInferenceSteps // 15
config.seed = UInt32(seed) // 32
config.guidanceScale = Float(guidanceScale) // 7.5
config.disableSafety = disableSafety // true
config.schedulerType = .dpmSolverMultistepScheduler
```
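For completeness, this is roughly how a pipeline configured like the above would be invoked — a minimal sketch assuming the `Configuration`-based `generateImages` API from apple/ml-stable-diffusion (argument and property names may differ between package versions):

```swift
// Sketch only: `pipeline` and `config` are the values built in the
// configuration snippet above.
let images = try pipeline.generateImages(configuration: config) { progress in
    print("Step \(progress.step) of \(progress.stepCount)")
    return true // return false here to cancel generation early
}

// `generateImages` returns `[CGImage?]`; a nil entry means the safety
// checker flagged that image (safety is disabled in the config above).
if let image = images.compactMap({ $0 }).first {
    // Use the CGImage, e.g. wrap it in UIImage(cgImage:) on iOS.
}
```

Timing this call on both targets (Mac Catalyst vs. device) is what produced the ~20s vs. ~5min discrepancy reported above.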
@jo32 Could you share your iOS app project? How are you compiling your app with this config and running it? There is no released iOS app or documentation on how to compile the iOS app and run on iPhones in this repo.