liuliu / swift-diffusion

BSD 3-Clause "New" or "Revised" License

256x256 SD models #13

ClashSAN closed this issue 1 year ago

ClashSAN commented 1 year ago

Hi, I was wondering if you've seen this model, which may be useful for devices with less memory: https://huggingface.co/justinpinkney/miniSD My 3 GB iPad crashes at the lowest size.

Also, thank you for your lengthy readme, it was interesting to read about your experience. I am already interested in model conversion to various formats (OpenVINO IR, JAX, ONNX), but specifically I want to see what I can do to improve the CPU-only experience. So far the only thing I've seen give OK results at 8-10 steps is the openvino besdev repo. Someone apparently enabled DDIM through the diffusers library, but something is wrong, and most pictures come out with visible whitish outlines. Do you have any recommendations on how to fix this? If the DDIM sampler works OK, perhaps I'll eventually figure out how to use this conversion script for the IR format to convert cooler models.

liuliu commented 1 year ago

TBH, none of these samplers are hard. I would just implement them myself to see if there is any weirdness with OpenVINO. I have some sample code lying around. For PLMS, just check here: https://github.com/liuliu/swift-diffusion/blob/liu/nhwc/examples/txt2img/main.swift#L146, for DDIM, check this: https://github.com/liuliu/swift-diffusion/blob/liu/unet/examples/txt2img/main.swift#L206 and also see the commented-out code for DPM++ Karras as well as Euler A.
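To give a sense of how small such a sampler is, here is a minimal sketch of one deterministic DDIM update (eta = 0) in plain Python, which could serve as a reference to compare a pipeline's per-step output against. It is not the code from the linked main.swift files. The function names are hypothetical, the latent is treated as a flat list of floats, and the beta schedule values (0.00085 to 0.012, "scaled linear", 1000 training steps) are assumed to be the ones commonly used with Stable Diffusion v1.

```python
import math


def scaled_linear_alphas_cumprod(num_train_steps=1000,
                                 beta_start=0.00085, beta_end=0.012):
    """Cumulative product of (1 - beta_t) for a 'scaled linear' schedule.

    The default values are an assumption (common Stable Diffusion v1 settings).
    """
    lo, hi = math.sqrt(beta_start), math.sqrt(beta_end)
    alphas_cumprod, running = [], 1.0
    for i in range(num_train_steps):
        # Betas are linear in sqrt space, then squared.
        beta = (lo + (hi - lo) * i / (num_train_steps - 1)) ** 2
        running *= 1.0 - beta
        alphas_cumprod.append(running)
    return alphas_cumprod


def ddim_timesteps(num_inference_steps, num_train_steps=1000):
    """Evenly strided timesteps, from noisiest to cleanest."""
    stride = num_train_steps // num_inference_steps
    return [i * stride for i in range(num_inference_steps)][::-1]


def ddim_step(x_t, eps, alpha_t, alpha_prev):
    """One deterministic DDIM update (eta = 0).

    x_t:        current noisy latent (flat list of floats)
    eps:        noise predicted by the model for x_t
    alpha_t:    cumulative alpha at the current timestep
    alpha_prev: cumulative alpha at the next (less noisy) timestep
    """
    sqrt_a_t = math.sqrt(alpha_t)
    sqrt_one_minus_a_t = math.sqrt(1.0 - alpha_t)
    sqrt_a_prev = math.sqrt(alpha_prev)
    sqrt_one_minus_a_prev = math.sqrt(1.0 - alpha_prev)
    x_prev = []
    for x, e in zip(x_t, eps):
        # Estimate of the clean latent implied by the noise prediction.
        x0 = (x - sqrt_one_minus_a_t * e) / sqrt_a_t
        # Re-noise that estimate to the noise level of the previous timestep.
        x_prev.append(sqrt_a_prev * x0 + sqrt_one_minus_a_prev * e)
    return x_prev
```

In a sampling loop, `alpha_t` and `alpha_prev` would be looked up from the cumulative schedule at consecutive entries of `ddim_timesteps(...)`. What to use for `alpha_prev` on the very last step is a convention that differs between implementations, so it is worth checking when outputs look off.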

For miniSD, yeah, it is interesting. I will integrate it next week, but I doubt it will help on 3GiB devices, since the app seems to crash when loading the weights (1.6GiB). We can give it a try after integration, though (mainly interested in this as it can go really fast, probably 5~10s on iPhone).

ClashSAN commented 1 year ago

Thank you very much!

> mainly interested in this as it can go really fast, probably 5~10s on iPhone

Really nice.

I see. Currently Draw Things goes no lower than 384x384, so I thought generating at a smaller size might help, but that's not it: the problem is the model being loaded.
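A rough back-of-the-envelope sketch of why a smaller output size does not change the picture. It assumes the usual Stable Diffusion v1 latent layout (8x spatial downscale, 4 channels) and 2 bytes per value (FP16); the function names are hypothetical, and the 1.6GiB weight figure is the one quoted above. Only the latent tensor itself is counted here. Intermediate activations inside the network are far larger, but they also shrink with the pixel count, while the weights do not.

```python
GIB = 1024 ** 3


def latent_bytes(width, height, channels=4, downscale=8, bytes_per_value=2):
    """Size of the latent tensor for a given output resolution (assumed layout)."""
    return (width // downscale) * (height // downscale) * channels * bytes_per_value


def resolution_scale(width, height, base_width, base_height):
    """How per-pixel memory (latents, activations) scales between two resolutions."""
    return (width * height) / (base_width * base_height)


weights_bytes = 1.6 * GIB  # figure quoted in the thread; independent of resolution

for size in (384, 256):
    print(f"{size}x{size}: latent = {latent_bytes(size, size)} bytes, "
          f"weights = {weights_bytes / GIB:.1f} GiB")

print(f"256x256 uses {resolution_scale(256, 256, 384, 384):.2f}x "
      f"the per-pixel memory of 384x384")
```

Going from 384x384 to 256x256 cuts the resolution-dependent memory to a bit under half, but the weight term stays the same at every size, which matches the observation that the crash happens at load time.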