apple / ml-stable-diffusion

Stable Diffusion with Core ML on Apple Silicon

ControlNet image generation command fails #195

Closed dapurv5 closed 1 year ago

dapurv5 commented 1 year ago

I converted the ControlNet MLSD model and then ran the generation command as

python -m python_coreml_stable_diffusion.pipeline --prompt "a cartoon rendition of a book" -i /Users/_ImageSynthesis/coreml_stable_diffusion_out_ml_packages_sd-controlnet-mlsd -o /Users/verapurv/Pictures/coreml_stable_diffusion_out_ml_packages --compute-unit ALL --seed 1729 --model-version runwayml/stable-diffusion-v1-5 --controlnet lllyasviel/sd-controlnet-mlsd --controlnet-inputs ~/Pictures/CompressJPEG.online_512x512_image.png

but I see the following error

eml_stable_diffusion_out_ml_packages_sd-controlnet-mlsd/Stable_Diffusion_version_runwayml_stable-diffusion-v1-5_safety_checker.mlpackage
INFO:python_coreml_stable_diffusion.coreml_model:Done. Took 1.1 seconds.
INFO:__main__:Done.
INFO:__main__:Initializing Core ML pipe for image generation
INFO:__main__:Stable Diffusion configured to generate 512x512 images
INFO:__main__:Done.
INFO:__main__:Beginning image generation.
  0%|                                                                     | 0/51 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/Users/verapurv/anaconda3/envs/coreml_stable_diffusion/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/Users/verapurv/anaconda3/envs/coreml_stable_diffusion/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/Users/verapurv/ProgramFiles/ml-stable-diffusion/python_coreml_stable_diffusion/pipeline.py", line 657, in <module>
    main(args)
  File "/Users/verapurv/ProgramFiles/ml-stable-diffusion/python_coreml_stable_diffusion/pipeline.py", line 577, in main
    image = coreml_pipe(
  File "/Users/verapurv/ProgramFiles/ml-stable-diffusion/python_coreml_stable_diffusion/pipeline.py", line 395, in __call__
    noise_pred = self.unet(
  File "/Users/verapurv/ProgramFiles/ml-stable-diffusion/python_coreml_stable_diffusion/coreml_model.py", line 79, in __call__
    return self.model.predict(kwargs)
  File "/Users/verapurv/anaconda3/envs/coreml_stable_diffusion/lib/python3.8/site-packages/coremltools/models/model.py", line 569, in predict
    return self.__proxy__.predict(data)
RuntimeError: {
    NSLocalizedDescription = "Error computing NN outputs.";
}

Any idea what might be going wrong?

atiorh commented 1 year ago

Hey @dapurv5, I recommend reviewing Q2 from our FAQ for this issue. There could be many reasons for it, but the most likely one in this project's context is that your system runs out of memory because other processes are taking most of the RAM, especially on lower-RAM devices. Please let me know if reducing system load fixes your issue. Finally, if you are not already doing so, I recommend applying compression (--quantize-nbits 6) to the models that you generate, or using the compressed models from the Hugging Face Hub, which will reduce the amount of memory required by this model by 63%.
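
For reference, a conversion invocation with 6-bit quantization enabled might look roughly like the following (the output path here is just a placeholder, adjust it to your setup):

python -m python_coreml_stable_diffusion.torch2coreml --convert-unet --convert-text-encoder --convert-vae-decoder --convert-safety-checker --model-version runwayml/stable-diffusion-v1-5 --quantize-nbits 6 -o <output_dir>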

atiorh commented 1 year ago

One more note: I recommend using the Swift CLI for higher memory efficiency.

jrittvo commented 1 year ago

I have been playing with the compressed models from HF and some self-converted ones as well. Can you tell us if the full set of advantages in using these models is supported in macOS 13 with apps built with Xcode 14, or do some pieces require macOS 14 / Xcode 15? (As a side note, a fresh conversion environment built off of ml-stable-diffusion 1.0.0 that pulls in diffusers 0.17.1 is working perfectly. Bravo!)

atiorh commented 1 year ago

Thanks for the note @jrittvo! The runtime memory savings depend on the target device's software version. macOS 13 will still give you the disk space and download size savings for quantized models, but macOS 14 is required for the runtime RAM savings and the reduced-latency benefit.

jrittvo commented 1 year ago

The space savings with a 6-bit ControlNet model are huge: 272 MB vs 723 MB for an uncompressed model. The 6-bit ControlNets seem to work equally well with a 6-bit base model or an FP16 base model. And the results look essentially the same to my eye when compared to using an uncompressed ControlNet model.

dapurv5 commented 1 year ago

I tried the --quantize-nbits 6 option but I still see the same error. I have an M1 Pro with 16 GB of memory. Anything else I can try out?

atiorh commented 1 year ago

@dapurv5 Do you mind trying the Swift CLI command? Were you able to check your system memory usage for any outliers from other apps? (For example through Activity Monitor's Memory tab)

dapurv5 commented 1 year ago

I made sure to close all other applications on my computer, including the ones running in the background that I knew about. Could you kindly provide me with the Swift CLI command that corresponds to the Python command I shared earlier?

dapurv5 commented 1 year ago

The max memory usage doesn't go over 13 GB, so I still have 3 GB of free memory available.

(Screenshot: Activity Monitor memory usage, 2023-06-16)

jrittvo commented 1 year ago

Review your -i and --controlnet arguments in your generation command. I think -i should point to a base model that has already been converted, that includes a ControlledUnet.mlmodelc component, and that is saved locally on your machine. I think --controlnet should point to an already-converted ControlNet model that is present on your machine. It is hard to tell just from the command you pasted in, but I think both of those arguments are not naming the correct pieces. The suggestion to try the Swift CLI is also a very good one. It is much simpler in Swift: just a single long command. In Python, you need a multi-line script. Are you using a Python script or just trying to use that command?

jrittvo commented 1 year ago

Without using a ControlNet, this is my command:

python gen-sd21.py --prompt "photograph of a cat" --seed "12" --save "images/cat12.jpg"

And this is the script (gen-sd21.py) that my command calls:

import torch
import tomesd
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--prompt")
parser.add_argument("--save")
parser.add_argument("--seed")
args = parser.parse_args()

from diffusers import StableDiffusionPipeline
pipe = StableDiffusionPipeline.from_pretrained("models/sd21")  ## path to model

pipe = pipe.to("mps")
pipe.enable_attention_slicing()
tomesd.apply_patch(pipe, ratio=0.4)  ## tomesd.remove_patch(pipe)

from diffusers import EulerDiscreteScheduler  ## DPMSolverMultistepScheduler ## PNDM if not specified
scheduler = EulerDiscreteScheduler.from_config(pipe.scheduler.config)
pipe.scheduler = scheduler

seed = int(args.seed)
generator = [torch.Generator(device="mps").manual_seed(seed)]

image = pipe(args.prompt, generator=generator, num_inference_steps=24, guidance_scale=7.5, height=768, width=768).images[0]

file_path = args.save
image.save(file_path)  ## image.save("images/image.jpg")

jrittvo commented 1 year ago

The same thing for me in Swift, and with a ControlNet:

swift run StableDiffusionSample "a photo of a cat" --seed 12 --guidance-scale 8.0 --step-count 24 --image-count 1 --scheduler dpmpp --compute-units cpuAndGPU --resource-path ../models/SD15-5x7 --controlnet Canny-5x7 --controlnet-inputs ../input/dog-5x7.png --output-path ../images

jrittvo commented 1 year ago

A number of the arguments in my Swift example can be left off and defaults will get used instead.
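
For example, a stripped-down version of that same command, keeping only the prompt, the model paths, and the ControlNet inputs, and letting everything else fall back to defaults, might look roughly like:

swift run StableDiffusionSample "a photo of a cat" --resource-path ../models/SD15-5x7 --controlnet Canny-5x7 --controlnet-inputs ../input/dog-5x7.png --output-path ../images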

jrittvo commented 1 year ago

Maybe your python_coreml_stable_diffusion.pipeline does the same as my calling my own script, but I still believe you need to get those 2 arguments straightened out. I hope I am not making things more complicated for you. I am not a Python person.

And my pipeline in Python is using diffusers, while yours is Core ML, so I am probably confusing you on the Python side. I'm sorry. But do try it with Swift.

dapurv5 commented 1 year ago

Thank you for providing me with clear instructions. However, I am currently having trouble locating the .mlmodelc files from the conversion process. I only see .mlmodel files but not the .mlmodelc files.

(devbox) ☁  coreml_stable_diffusion_out_ml_packages_sd-controlnet-mlsd  find . | grep "mlmodelc"
(devbox) ☁  coreml_stable_diffusion_out_ml_packages_sd-controlnet-mlsd  find . | grep "mlmodel"
./Stable_Diffusion_version_runwayml_stable-diffusion-v1-5_vae_decoder.mlpackage/Data/com.apple.CoreML/model.mlmodel
./Stable_Diffusion_version_runwayml_stable-diffusion-v1-5_text_encoder.mlpackage/Data/com.apple.CoreML/model.mlmodel
./ControlNet_lllyasviel_sd-controlnet-mlsd.mlpackage/Data/com.apple.CoreML/model.mlmodel
./Stable_Diffusion_version_runwayml_stable-diffusion-v1-5_control-unet.mlpackage/Data/com.apple.CoreML/model.mlmodel
./Stable_Diffusion_version_runwayml_stable-diffusion-v1-5_safety_checker.mlpackage/Data/com.apple.CoreML/model.mlmodel

This is the command I used for conversion.

python -m python_coreml_stable_diffusion.torch2coreml --convert-unet --convert-text-encoder --convert-vae-decoder -o /Users/coreml_stable_diffusion_out_ml_packages_sd-controlnet-canny --model-version runwayml/stable-diffusion-v1-5 --convert-safety-checker --convert-controlnet lllyasviel/sd-controlnet-canny --unet-support-controlnet --quantize-nbits 6

jrittvo commented 1 year ago

Yes. Your conversion command mixes the arguments for converting a base model (for use with a ControlNet) together with those for converting a ControlNet model itself. You need two conversions to get two models.

Let's start with the ControlNet conversion first. It is simpler.

These are my instructions to self, and they produce a bundle for use with the Swift CLI. You should really use Swift for this; you'll get better help in an Apple repo if you use the Swift CLI.


python -m python_coreml_stable_diffusion.torch2coreml --convert-controlnet lllyasviel/control_v11p_sd15_softedge --model-version "runwayml/stable-diffusion-v1-5" --bundle-resources-for-swift-cli --attention-implementation ORIGINAL --latent-w 64 --latent-h 64 --compute-unit CPU_AND_GPU -o "./SoftEdge"

Adjust --attention-implementation, --compute-unit, --latent-h, --latent-w, and -o to fit the type of model you want to end up with. These need to match the main model you plan to use them with. You can't mix sizes, but you can sometimes mix attention implementations.

The --convert-controlnet argument needs to point to the ControlNet model's specific repo. There is an index of these repos at: https://huggingface.co/lllyasviel. The name for the argument needs to be in the form of: lllyasviel/control_Version_Info_name

The final converted model will be the .mlmodelc file in the Model_Name/Resources/controlnet folder.


Give this first conversion a shot and we can go on to the other one after.
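
(For reference, the second conversion, the base model with the ControlNet-ready Unet, will look a lot like your original command, minus the --convert-controlnet argument and plus --bundle-resources-for-swift-cli, roughly along these lines, with the output path as a placeholder:

python -m python_coreml_stable_diffusion.torch2coreml --convert-unet --convert-text-encoder --convert-vae-decoder --convert-safety-checker --unet-support-controlnet --model-version "runwayml/stable-diffusion-v1-5" --bundle-resources-for-swift-cli --attention-implementation ORIGINAL --latent-w 64 --latent-h 64 --compute-unit CPU_AND_GPU -o "./SD15-Base"

But treat that as a sketch for later; let's get the ControlNet conversion working first.)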

jrittvo commented 1 year ago

By the way, I have a small public repo at Hugging Face with some base models for use with ControlNets, already converted to Core ML, and a bunch of ControlNet models also already converted to Core ML. If you just want to do generation, you can grab the models you want and skip the whole conversion process. But the models, again, are for Swift and Core ML. I never tried them with Python and Core ML.

https://huggingface.co/jrrjrr/CoreML-Models-For-ControlNet

I will be adding some 6-bit versions in the next week or two, but you should be fine memory-wise once you get the right generation command and models in place, even with the standard FP16 models.

jrittvo commented 1 year ago

I need to update the repo to say everything also works with ml-stable-diffusion 1.0.0. It says 0.4.0 at present.

jrittvo commented 1 year ago

The .mlpackage files you got from your conversion are for use with Python and Core ML. The --bundle-resources-for-swift-cli argument causes them to also be saved as .mlmodelc, for use with Swift. You'll get both sets with my conversion command. The Swift set will be in a Resources sub-folder, and the files alongside the Resources folder are the other type. That is all for the base model conversion. There are fewer files either way for the ControlNet model conversion, if you are doing that one first. I'm getting ahead of things.
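
If you want to double-check where things landed after a --bundle-resources-for-swift-cli conversion, something like the find command you ran earlier should now show hits, roughly:

find <output_dir> -name "*.mlmodelc"

The matches for the Swift set should be under the Resources sub-folder, with the .mlpackage files for Python sitting alongside it.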

dapurv5 commented 1 year ago

Thank you for your help. I will make use of the models that you have pre-converted.