apple / ml-stable-diffusion

Stable Diffusion with Core ML on Apple Silicon

Swift generation produces different style/quality images compared to other SD tools #90


kasima commented 1 year ago

I've been experimenting with image generation in Swift using the converted Core ML models. It seems to produce different-style (and noticeably worse?) images than other Stable Diffusion tools for a given model version and set of generation parameters. The Python CLI generation with the converted Core ML models seems to produce images in the same vicinity as those from the other tools.

I'm new to the AI image generation space and would much appreciate any help with a few questions.

Here's what I've been looking at:

Parameters

Common settings across tools: the prompt below, 30 inference steps, guidance scale 10 (per the generation commands that follow).

| Tool | Sample outputs (image attachments omitted) |
| --- | --- |
| DreamStudio | 4 images (filename seeds 2572197474, 2981633817, 1465467626, 1985145463) |
| Google Colab | 4 images |
| DiffusionBee (local) | 4 images |
| InvokeAI (local) | 4 images |
| Python CLI (local Core ML) | 4 images (filenames: random seeds 93–96, compute unit ALL, runwayml/stable-diffusion-v1-5) |
| Swift CLI (local Core ML) | 4 images (filenames: seed 93, guidance scale 10, 30 steps) |
Python CLI generation command

Pre-converted model from Hugging Face:

```bash
python -m python_coreml_stable_diffusion.pipeline \
  --prompt "personification of Halloween holiday in the form of a cute girl with short hair and a villain's smile, cute hats, cute cheeks, unreal engine, highly detailed, artgerm digital illustration, woo tooth, studio ghibli, deviantart, sharp focus, artstation, by Alexei Vinogradov bakery, sweets, emerald eyes" \
  -i /Users/kasima/src/huggingface/apple/coreml-stable-diffusion-v1-5/original/packages \
  -o /Users/kasima/scratch \
  --compute-unit ALL \
  --model-version "runwayml/stable-diffusion-v1-5" \
  --num-inference-steps 30 \
  --guidance-scale 10
```

Swift CLI generation command

Pre-converted model from Hugging Face:

```bash
swift run StableDiffusionSample "personification of Halloween holiday in the form of cute girl with short hair and a villain's smile, cute hats, cute cheeks, unreal engine, highly detailed, artgerm digital illustration, woo tooth, studio ghibli, deviantart, sharp focus, artstation, by Alexei Vinogradov bakery, sweets, emerald eyes" \
  --negative-prompt "" \
  --resource-path /Users/kasima/src/huggingface/apple/coreml-stable-diffusion-v1-5/split_einsum/compiled/ \
  --output-path /Users/kasima/scratch/swiftcli/comparison \
  --step-count 30 \
  --guidance-scale 10 \
  --image-count 4
```

GuiyeC commented 1 year ago

@kasima Maybe this is a silly question, but I see no mention of the seeds used to generate these images. Did you use the same seeds to generate each column in these examples? And if you did, could you share them so we can try to replicate these results?

H1p3ri0n commented 1 year ago

The issue is that this Core ML implementation currently supports only two samplers. The usual SD tools all support other samplers, such as Euler, which I've found produce great results.

We'll have to wait until Apple implements other samplers in this codebase.
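
For context, most of the other tools are diffusers-based, where the sampler is a swappable scheduler object. A minimal sketch of that swap, assuming the standard diffusers Python API (this is not something the Core ML pipelines in this repo expose):

```python
# Sketch, assuming the standard diffusers API: swap the default PNDM
# scheduler for Euler, the kind of sampler choice other SD tools expose.
from diffusers import StableDiffusionPipeline, EulerDiscreteScheduler

pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")
pipe.scheduler = EulerDiscreteScheduler.from_config(pipe.scheduler.config)
# Subsequent pipe(...) calls now sample with Euler instead of PNDM.
```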

kasima commented 1 year ago

@GuiyeC – All the images were generated with random seeds (updated in the original post). The images in the columns aren't necessarily related to each other; the columns were just used for formatting. However, it's an interesting idea to keep the seeds the same. I'll try that when I get a chance.

@timevision – So the samplers/schedulers might have something to do with it? I believe at least a few of these tools use the default PNDM scheduler (Google Colab, the Python CLI, the Swift CLI, and probably DiffusionBee as well). I'll confirm and try regenerating with the same scheduler.
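
For the diffusers-based tools in the comparison (Google Colab, InvokeAI, etc.), the seed can be pinned with a generator so a run is directly comparable to a Core ML CLI run that used the same seed. A minimal sketch, assuming the standard diffusers API; the truncated prompt and output filename are placeholders:

```python
# Sketch, assuming the standard diffusers API: fix the RNG seed so the
# output can be compared against a Core ML CLI run with the same seed
# (e.g. seed 93, which appears in the Swift CLI filenames above).
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")
generator = torch.Generator(device="cpu").manual_seed(93)  # fixed seed

image = pipe(
    "personification of Halloween holiday ...",  # full prompt as in the commands above
    num_inference_steps=30,
    guidance_scale=10,
    generator=generator,
).images[0]
image.save("diffusers_seed93.png")  # placeholder output name
```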