apple / ml-stable-diffusion

Stable Diffusion with Core ML on Apple Silicon
MIT License

Converting DreamShaper_8 results in invalid model #282

Open MenKuch opened 9 months ago

MenKuch commented 9 months ago

Hi!

I am trying to convert digiplay/DreamShaper_8 to CoreML. I am using…

python -m python_coreml_stable_diffusion.torch2coreml --convert-unet --convert-text-encoder --convert-vae-decoder --convert-safety-checker --model-version digiplay/DreamShaper_8 -o <OUTPUTNAME> --bundle-resources-for-swift-cli

…to convert the model. The command succeeds, but running inference with the (unmodified) sample CLI project like…

./StableDiffusionSample --resource-path <PATHTOMODEL> --compute-units cpuAndGPU --scheduler dpmpp "car"

…always results in…

Loading resources and creating pipeline (Note: This can take a while the first time using these resources)
Step 19 of 20 [mean: 0.91, median: 0.98, last 0.97] step/sec
Error: MultiArray shape (1 x 512 x 512 x 3) does not match the shape (1 x 768 x 768 x 3) specified in the model description

The only thing I am noticing during convert is the following output:

scikit-learn version 1.3.1 is not supported. Minimum required version: 0.17. Maximum required version: 1.1.2. Disabling scikit-learn conversion API.

What am I doing wrong? Or is this a bug?

atiorh commented 9 months ago

Looks like there is an inconsistent latent dimension that results in an inconsistent output resolution. Can you try re-exporting with --latent-h 64 --latent-w 64 to make sure the HF Hub config files are overridden? (This will enforce 64x64 latents and a 512x512 output space.)
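As background (not stated in the thread itself): Stable Diffusion's VAE downsamples by a factor of 8 in each dimension, so a 512x512 output corresponds to 64x64 latents, while the 768x768 shape in the error message corresponds to 96x96 latents. A minimal sketch of that relationship, assuming the standard factor-8 VAE:

```python
# Stable Diffusion's VAE scales resolution by a factor of 8, so the
# latent grid is 1/8 of the output resolution in each dimension.
VAE_SCALE = 8

def latent_dims(output_h: int, output_w: int, scale: int = VAE_SCALE):
    """Return the (latent_h, latent_w) values to pass to torch2coreml."""
    assert output_h % scale == 0 and output_w % scale == 0
    return output_h // scale, output_w // scale

print(latent_dims(512, 512))  # (64, 64) -> --latent-h 64 --latent-w 64
print(latent_dims(768, 768))  # (96, 96) -> the shape the mis-converted model expected
```

This is why the mismatch surfaces only at inference time: the converted model's description bakes in one output shape while the pipeline produces another.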

MenKuch commented 9 months ago

Thanks for your answer! I tried converting the model using

python -m python_coreml_stable_diffusion.torch2coreml --convert-unet --convert-text-encoder --convert-vae-decoder --convert-safety-checker --model-version digiplay/DreamShaper_8 -o <OUTPUTNAME> --bundle-resources-for-swift-cli --latent-h 64 --latent-w 64

and inference with

./StableDiffusionSample --resource-path <PATHTOMODEL> --compute-units cpuAndGPU --scheduler dpmpp "car"

and it now WORKS! THANKS!

But: omitting the --compute-units cpuAndGPU option so the model runs on the ANE sadly results in:

Error: Unable to compute the asynchronous prediction using ML Program. It can be an invalid input data or broken/unsupported model.

Tested this on a MacBook Pro M1 Max with macOS Sonoma 14.0 (23A344). StableDiffusionSample has been compiled with Xcode 15.

atiorh commented 9 months ago

Great! For ANE inference, you will need --compute-units cpuAndNeuralEngine.

MenKuch commented 8 months ago

Awesome, that works great. Thanks for your support. My models now run on CPU+GPU and ANE (if enough RAM is present).

But: Is it normal that on every app relaunch, ANECompilerService takes approx. 1 to 3 minutes to compile my model for ANE?