Closed by wadie999 2 weeks ago
Hey.
Hey Denk, thank you for the insights. I was wondering: is it possible to mix images without prompts? I see that even when you take two images as inputs, the CoCa model creates captions that serve as prompts to guide the generation. Do you think the diffusion process could rely on image embeddings in latent space instead, using a vision encoder?
Hi @wadie999 :) If I understand you correctly: there is a Jupyter notebook example that does not use the CoCa model, so you can pass an empty string as the prompt. Also, if you are interested in mixing images, you could try the Kandinsky model for this purpose.
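To make the "mix in embedding space" idea a bit more concrete, here is a minimal sketch, not this project's actual code. It assumes each image has already been turned into a fixed-length embedding by some vision encoder (the short lists below are toy stand-ins), and `mix_embeddings` is a hypothetical helper name. The blended vector would then be used as conditioning for the diffusion model in place of a text prompt.

```python
def mix_embeddings(embeddings, weights):
    """Blend several embedding vectors with a weighted average.

    embeddings: list of equal-length lists of floats (one per image).
    weights:    one non-negative number per embedding; they are
                normalized here so they sum to 1.
    """
    if len(embeddings) != len(weights):
        raise ValueError("need exactly one weight per embedding")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    dim = len(embeddings[0])
    if any(len(e) != dim for e in embeddings):
        raise ValueError("all embeddings must have the same length")

    mixed = [0.0] * dim
    for emb, w in zip(embeddings, weights):
        share = w / total  # normalized contribution of this image
        for i, value in enumerate(emb):
            mixed[i] += share * value
    return mixed


# Toy stand-ins for two image embeddings (real ones have hundreds of dims).
image_a = [1.0, 0.0, 2.0]
image_b = [0.0, 1.0, 0.0]

# Equal blend of both images, then a blend that favors image_a 3:1.
equal_mix = mix_embeddings([image_a, image_b], [1, 1])
mostly_a = mix_embeddings([image_a, image_b], [3, 1])
```

Changing the weights shifts the result toward one image or the other, with no caption involved at any step. Whether a given model accepts such a blended embedding as conditioning depends on how it was trained.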
Hello, very interesting project. Is there a paper or a more detailed explanation of the mixing process? Is there a way to mix two images while preserving information from one of them? For example, mixing a QR code with an image while keeping the QR code modules intact.