Open nkpkg23 opened 1 year ago
Hi!
Thank you for the clarifications! I also wanted to check whether I'm using the right approach for the inpainting functionality on iOS 17, since I'm currently seeing a memory leak whenever cgImage.planarRGBShapedArray() is called on the controlNetInputs.
The application's Documents and Data size grows every time the app is used. The intermediate generation images show that inpainting does start inside the masked region of the original image, but generation crashes after a few steps every time. (I have no issues when simply generating images with the palettized Stable Diffusion model; this only happens when I add inpainting with ControlNet.) Are there any other steps I'm missing for inpainting to work, or is this the wrong format for the ControlNet inputs? Any tips would be helpful, thanks!
The details for inpainting: I am using palettized runwayml/stable-diffusion-v1-5 with ControlNet (the lllyasviel/control_v11p_sd15_inpaint model) on iOS 17. For the ControlNet input, I use a 512x512 image and set the pixels I want to inpaint in the original image to transparent. When instantiating the StableDiffusionPipeline, I used the following configuration and controlNet parameter:
let configuration = MLModelConfiguration()
configuration.computeUnits = .cpuAndNeuralEngine
do {
    pipeline = try StableDiffusionPipeline(resourcesAt: url!,
                                           controlNet: ["LllyasvielControlV11PSd15Inpaint"],
                                           configuration: configuration,
                                           disableSafety: false,
                                           reduceMemory: true)
} catch {
    print(error.localizedDescription)
}
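For reference, here is a minimal sketch of how I build a control image like that; the makeInpaintControlImage helper and the maskRect placeholder are my own names, not from the repo. It just resizes the source to 512x512 and clears the region to be inpainted so those pixels end up fully transparent:

import UIKit

// Resize the source image to 512x512 and clear the area to be inpainted so
// its pixels become fully transparent. maskRect is a placeholder for that area.
func makeInpaintControlImage(from source: UIImage, maskRect: CGRect) -> CGImage? {
    let size = CGSize(width: 512, height: 512)
    let format = UIGraphicsImageRendererFormat()
    format.opaque = false   // keep the alpha channel
    format.scale = 1        // render at exactly 512x512 pixels
    let renderer = UIGraphicsImageRenderer(size: size, format: format)
    let rendered = renderer.image { context in
        source.draw(in: CGRect(origin: .zero, size: size))
        context.cgContext.clear(maskRect)   // punch the transparent hole
    }
    return rendered.cgImage
}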
Then, inside a generateImage() function, this is how I add the ControlNet inputs:
var pipelineConfig = StableDiffusionPipeline.Configuration(prompt: prompt)
if let controlNetInput = UIImage(named: "images/sample.png")?.cgImage {
    pipelineConfig.controlNetInputs = [controlNetInput]
} else {
    print("Error loading control net input image")
}
pipelineConfig.schedulerType = StableDiffusionScheduler.dpmSolverMultistepScheduler
images = try pipeline?.generateImages(configuration: pipelineConfig, progressHandler: {......
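In case it helps narrow down where the memory goes, a rough way to exercise just that conversion outside the pipeline might look like the sketch below. probeConversionLeak is my own name, and the 0...1 value range is my assumption of what ControlNet conditioning images use:

import CoreGraphics
import StableDiffusion

// Repeatedly run the CGImage -> MLShapedArray conversion that the ControlNet
// inputs go through, to see whether memory grows even without the pipeline.
func probeConversionLeak(_ image: CGImage, iterations: Int = 20) throws {
    for step in 0..<iterations {
        try autoreleasepool {
            // Assumed 0...1 range for the RGB planes of a ControlNet input.
            let array = try image.planarRGBShapedArray(minValue: 0.0, maxValue: 1.0)
            print("step \(step): shape \(array.shape)")
        }
    }
}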
Hi, I'm new to running Stable Diffusion on iOS, and I have two clarification questions about using this repository:
1) Am I able to run palettized Stable Diffusion models on iOS 16? This article from Hugging Face (https://huggingface.co/blog/fast-diffusers-coreml) notes that "In order to use 6-bit models, you need the development versions of iOS/iPadOS 17". However, this WWDC video (https://developer.apple.com/videos/play/wwdc2023/10047/) has a slide saying that iOS 16 supports compressed models (sparse weights, quantized weights, palettized weights). What kinds of compressed models can I run on iOS 16 without needing the development version of iOS 17?
2) I am also testing some Stable Diffusion variants with compressed architectures (e.g. nota-ai/bk-sdm-tiny) that remove several residual and attention blocks from the U-Net. When I ran the torch2coreml script, I got an error on the assertion mid_block_type == "UNetMidBlock2DCrossAttn". It appears that I'll need to modify the python_coreml_stable_diffusion/unet.py code to be compatible with the new architecture. Are there any other files I'll need to modify for the script to work with such Stable Diffusion variants?