CompVis / stable-diffusion

A latent text-to-image diffusion model
https://ommer-lab.com/research/latent-diffusion-models/

Support image variation? #42

Open heurainbow opened 1 year ago

heurainbow commented 1 year ago

Can this diffusion model support image variation without a text prompt, as DALL-E 2 does?

gregturk commented 1 year ago

You can get something similar to Dall-E 2's image variations if you play around with img2img.py. Feed in the original image that you want to get variations for, use the same text prompt that you used to create your original image, and then play around with the "strength" parameter. Higher strength means more of your original image is overwritten with noise, and so the results will be further from your original.
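
To make the effect of "strength" concrete, here is a minimal sketch. It assumes the script derives the number of noising steps as int(strength * ddim_steps), which is how I recall scripts/img2img.py doing it. The function name encode_steps is made up for illustration and is not part of the repo.

```python
def encode_steps(strength: float, ddim_steps: int = 50) -> int:
    """Return how many sampler steps of noise are applied to the init image.

    Assumption (not taken from this thread): steps = int(strength * ddim_steps).
    """
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must be in [0.0, 1.0]")
    return int(strength * ddim_steps)


if __name__ == "__main__":
    # Higher strength -> more noising steps -> output drifts further from the input.
    for s in (0.25, 0.5, 0.75):
        print(f"strength={s}: {encode_steps(s)} of 50 steps")
```

Under that assumption, a strength near 0 leaves the input almost untouched, and a strength of 1.0 discards nearly all of it. The repo's README runs the script along the lines of `python scripts/img2img.py --prompt "..." --init-img <path-to-img.jpg> --strength 0.8`.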

heurainbow commented 1 year ago

This works only when I generate the init image first. What if I use a real image instead of a generated one?

atarashansky commented 1 year ago

You don't need to generate the initial image. You can use a real image just fine. Just resize it to the same dimensions you wish to output.
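
A rough sketch of picking the resize target, assuming the model wants width and height to be a multiple of some block size. The default of 32 is my assumption from memory of the script's image loader, and snap_size is a hypothetical helper, not something in the repo.

```python
from typing import Tuple


def snap_size(width: int, height: int, multiple: int = 32) -> Tuple[int, int]:
    """Round each dimension down to the nearest multiple of `multiple`.

    Assumption: the block size is 32; adjust it if your checkpoint differs.
    """
    if multiple <= 0:
        raise ValueError("multiple must be positive")
    if width < multiple or height < multiple:
        raise ValueError("image is smaller than one block")
    return width - width % multiple, height - height % multiple


if __name__ == "__main__":
    # A 500x375 photo would be resized to 480x352 before being passed in.
    print(snap_size(500, 375))
```

You would then resize the real image to that size with whatever image library you use (for example Pillow's Image.resize) before feeding it to img2img.py.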

heurainbow commented 1 year ago

What I need is image translation without conditioning on a text prompt, and the generated image must maintain the general semantics of the original image. As in DALL-E 2, the top-right image changes into the bottom-right one without using a text prompt.

cunicode commented 1 year ago

CLIP image embeddings

ZiboZ commented 1 year ago

@cunicode same issue here! Have you found an appropriate way to do this? 😭

pribadihcr commented 1 year ago

Versatile Diffusion has an image variation task