apple / ml-stable-diffusion

Stable Diffusion with Core ML on Apple Silicon
MIT License

Support more aspect ratios #105

Open sindresorhus opened 1 year ago

sindresorhus commented 1 year ago

Other Stable Diffusion interfaces like https://dreamlike.art/create support aspect ratios like 4:3, 3:4, etc.

pj4533 commented 1 year ago

+1, I'd love to output 9:16 for mobile use.

Zabriskije commented 1 year ago

Yup, single-model multi-resolution should be high on the list

pj4533 commented 1 year ago

> Yup, single-model multi-resolution should be high on the list

Are other resolutions possible using other models with this repo? I was looking on HuggingFace for a 9:16 model but didn't find anything (kinda new to SD, would love any help getting a 9:16 image)

Zabriskije commented 1 year ago

> Are other resolutions possible using other models with this repo? I was looking on HuggingFace for a 9:16 model but didn't find anything (kinda new to SD, would love any help getting a 9:16 image)

Right now it's not possible to select the preferred resolution via the GUI; you have to hard-code the resolution when converting a model, and it only works with the original attention implementation (CPU+GPU). But check out the Core ML Models community on Hugging Face, where we're uploading some models in 512x768. Also check out Mochi Diffusion, the best GUI right now imho.
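To make "hard-coding the resolution" a bit more concrete, here is a minimal sketch of how a pixel resolution maps to the latent tensor shape the converted model ends up fixed to. It assumes a Stable Diffusion v1.x style VAE (8x spatial downsampling, 4 latent channels); the function name `latent_shape` and its parameters are made up for illustration and are not part of this repo.

```python
def latent_shape(width, height, batch_size=1, latent_channels=4, vae_scale=8):
    """Return the (batch, channels, latent_h, latent_w) shape for an image size.

    Assumption: the VAE downsamples each spatial dimension by `vae_scale`
    (8 for Stable Diffusion v1.x) and uses `latent_channels` channels (4).
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    if width % vae_scale or height % vae_scale:
        raise ValueError(
            f"width and height must be multiples of {vae_scale}, "
            f"got {width}x{height}"
        )
    # Tensor layout is height before width.
    return (batch_size, latent_channels, height // vae_scale, width // vae_scale)


if __name__ == "__main__":
    print(latent_shape(512, 512))  # square default
    print(latent_shape(512, 768))  # portrait 2:3, as in the 512x768 uploads
```

Because the shape is baked in at conversion time, a 512x768 model and a 512x512 model are separate conversions under this assumption.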

mwmeyer commented 1 year ago

I was able to generate 512x768 images for Diffusion Wallpaper by modifying torch2coreml.py as described here: https://github.com/apple/ml-stable-diffusion/issues/64#issuecomment-1375013357
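For anyone picking a size to hard-code for another aspect ratio, a rough sketch of one way to choose it: keep the short side at 512 and snap the long side to the nearest multiple of 64. The multiple of 64 is a common convention and an assumption here, not something this repo requires, and `size_for_aspect` is a hypothetical helper name.

```python
def size_for_aspect(ratio_w, ratio_h, short_side=512, multiple=64):
    """Return (width, height) approximating ratio_w:ratio_h.

    The short side is kept at `short_side`; the long side is rounded to the
    nearest multiple of `multiple` (assumed 64), so the result is only an
    approximation of the requested ratio.
    """
    if ratio_w <= 0 or ratio_h <= 0:
        raise ValueError("aspect ratio terms must be positive")
    exact_long = short_side * max(ratio_w, ratio_h) / min(ratio_w, ratio_h)
    long_side = max(multiple, round(exact_long / multiple) * multiple)
    if ratio_w >= ratio_h:
        return (long_side, short_side)  # landscape or square
    return (short_side, long_side)      # portrait


if __name__ == "__main__":
    for ratio in [(2, 3), (3, 4), (4, 3), (9, 16)]:
        print(ratio, size_for_aspect(*ratio))
```

With these defaults 2:3 gives exactly 512x768, while 9:16 lands on 512x896, which is close to but not exactly 9:16.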

pj4533 commented 1 year ago

> Are other resolutions possible using other models with this repo? I was looking on HuggingFace for a 9:16 model but didn't find anything (kinda new to SD, would love any help getting a 9:16 image)

> Right now it's not possible to select the preferred resolution via the GUI; you have to hard-code the resolution when converting a model, and it only works with the original attention implementation (CPU+GPU). But check out the Core ML Models community on Hugging Face, where we're uploading some models in 512x768. Also check out Mochi Diffusion, the best GUI right now imho.

Hmmmmm, I have mostly been using the image2image PR, with my own custom CLI. So I think I'd need to recompile the model using that torch2coreml patch, hard coding the image sizes? (otherwise I wouldn't have the VAEEncoder?) Not really sure tho...prob should just wait till that PR is merged.