Feature request: support image inputs

apple / ml-stable-diffusion

Stable Diffusion with Core ML on Apple Silicon

MIT License


Feature request: support image inputs #27

Open darknoon opened 1 year ago

darknoon commented 1 year ago

I think we would need to convert the Encoder as well as the Decoder. I'm looking into it, but I'm curious whether anyone is already working on this.

asadm commented 1 year ago

I wonder if this PR on Maple Diffusion is helpful: https://github.com/mortenjust/native-diffusion/pull/7

I am highly interested in this but lack the right skills here.

joogps commented 1 year ago

That would be so great!

littleowl commented 1 year ago

I finally got image2image working! It took a good deal of time, and I'll need to clean it up and test on other devices before submitting a PR or forking. Essentially, I needed to convert the Encoder as well, and also bake in the DiagonalGaussianDistribution operation, some of the other code from the pipeline, and the step that adds noise for the right timeStep. This way, all the operations needed to generate the latents are in one model.
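For anyone following along, here is a rough sketch of the math those baked-in steps usually amount to. It is not the code from the conversion itself. It assumes the encoder outputs a per-element mean and log-variance, the Stable Diffusion v1 latent scale of 0.18215, and a "scaled linear" beta schedule with 1000 training steps. All function names are hypothetical, and latents are flat Python lists to keep the example dependency-free.

```python
import math
import random

SCALE = 0.18215  # assumed latent scaling factor (Stable Diffusion v1)


def alphas_cumprod(num_steps=1000, beta_start=0.00085, beta_end=0.012):
    """Cumulative product of (1 - beta) for an assumed 'scaled linear' schedule."""
    lo, hi = math.sqrt(beta_start), math.sqrt(beta_end)
    out, prod = [], 1.0
    for i in range(num_steps):
        beta = (lo + (hi - lo) * i / (num_steps - 1)) ** 2
        prod *= 1.0 - beta
        out.append(prod)
    return out


def sample_latents(mean, logvar, rng):
    """Diagonal Gaussian sample (mean + std * eps), then apply the latent scale."""
    latents = []
    for m, lv in zip(mean, logvar):
        lv = min(max(lv, -30.0), 20.0)  # clamp log-variance for stability
        std = math.exp(0.5 * lv)
        latents.append((m + std * rng.gauss(0.0, 1.0)) * SCALE)
    return latents


def add_noise(latents, timestep, acp, rng):
    """Forward-diffuse clean latents to the given timestep."""
    a = math.sqrt(acp[timestep])        # weight on the clean latents
    b = math.sqrt(1.0 - acp[timestep])  # weight on the fresh noise
    return [a * x + b * rng.gauss(0.0, 1.0) for x in latents]


def encode_to_noisy_latents(mean, logvar, timestep, seed=0):
    """Hypothetical end-to-end helper: encoder output -> noisy start latents."""
    rng = random.Random(seed)
    acp = alphas_cumprod()
    return add_noise(sample_latents(mean, logvar, rng), timestep, acp, rng)
```

Folding all of this into the converted model means the Swift side only has to supply the image, the noise source, and the start timestep, which is the "all in one model" idea described above.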

UXDart commented 1 year ago

Is this possible? Also, is it possible to do inpainting?

littleowl commented 1 year ago

Image2Image PR: https://github.com/apple/ml-stable-diffusion/pull/73