EricHuiK opened this issue 2 years ago
You are probably inputting an image with an alpha channel, like a PNG. Try inputting only RGB.
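A minimal sketch of that workaround: strip the alpha channel before handing the image to the model. Here an in-memory 4x4 RGBA image stands in for a loaded PNG (the size and color are placeholders, not from the thread):

```python
from PIL import Image

# Create a stand-in for an RGBA PNG; in practice this would be
# Image.open("your_input.png").
img = Image.new("RGBA", (4, 4), (255, 0, 0, 128))

if img.mode != "RGB":
    # Drops the alpha channel, leaving the 3 RGB channels the model expects.
    img = img.convert("RGB")

assert img.mode == "RGB"
```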
Hi @samedii ! I'm trying to write an inference script for running this txt2image model (https://ommer-lab.com/files/latent-diffusion/text2img.zip). I replaced the corresponding config yaml file and model checkpoint. When I input a text prompt, the system gives me the error that @EricHuiK mentioned. In particular, the error arises in the call at line 137 of the inference script, `samples_ddim, _ = sampler.sample(S=opt.ddim_steps, ...)`.
Do you happen to know where to update the code? :)
I encountered the same error when trying to use this config (models/ldm/text2img256/config.yaml).
I tried to fix it by changing L136 from `shape = [4, opt.H//8, opt.W//8]`
to `shape = [3, opt.H//8, opt.W//8]`.
After this change, the code ran smoothly, with no more shape incompatibility.
(However, the resulting image doesn't look good with this setting. Not sure if there is more to change.)
me too
Has anyone managed to find a solution to this? I am facing the same issue
Hello, indeed the problem is in the shape. If you set the variable `shape = [3, 64, 64]`, the model runs normally; this variable indicates the size of the latent space that the diffusion model generates. Although, as said before, the model does not give good results if you are using the text2img weights; the problem then is not the shape but the weights.
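For anyone hitting this: the latent shape is typically `[latent_channels, H // f, W // f]`, where `f` is the autoencoder's downsampling factor. A minimal sketch of how the working value `[3, 64, 64]` reported above can arise; the image size 256 and factor `f = 4` are assumptions, so check the `first_stage_config` in your yaml:

```python
# Sketch only: these values are assumptions consistent with the shape
# reported to work in this thread, not verified against the repository.
H = W = 256          # assumed output image size for the text2img256 model
latent_channels = 3  # this thread reports 3 works here (the script had 4)
f = 4                # assumed autoencoder downsampling factor
shape = [latent_channels, H // f, W // f]
print(shape)  # [3, 64, 64]
```

Note that the script line quoted earlier divides by 8 (`opt.H//8`), which yields `[3, 64, 64]` only if `opt.H` is 512; with `opt.H = 256` it gives `[3, 32, 32]`, so make sure the divisor matches your autoencoder's actual downsampling factor.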
Hiya! Have you been lucky in solving this issue? :)