A prompting enhancement library for transformers-type text embedding systems
MIT License
ValueError: `prompt_embeds` and `negative_prompt_embeds` must have the same shape when passed directly, but got: `prompt_embeds` torch.Size([1, 154, 2048]) != `negative_prompt_embeds` torch.Size([1, 77, 2048]). #75
Hey, thanks again for the framework. The following code works fine on my local computer, but it fails on a Replicate server (remote GPU) with:
ValueError: `prompt_embeds` and `negative_prompt_embeds` must have the same shape when passed directly, but got: `prompt_embeds` torch.Size([1, 154, 2048]) != `negative_prompt_embeds` torch.Size([1, 77, 2048]).
What did I miss? Thanks!
self._background_image_pipe = StableDiffusionXLControlNetImg2ImgPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    controlnet=[self.depth_control_net.controlnet],
    torch_dtype=torch.float16,
    variant="fp16",
    cache_dir=transformers_utils.get_ml_model_catch_path()
).to(transformers_utils.get_device_type())

compel = Compel(tokenizer=[self._background_image_pipe.tokenizer, self._background_image_pipe.tokenizer_2],
                text_encoder=[self._background_image_pipe.text_encoder, self._background_image_pipe.text_encoder_2],
                returned_embeddings_type=ReturnedEmbeddingsType.PENULTIMATE_HIDDEN_STATES_NON_NORMALIZED,
                requires_pooled=[False, True],
                truncate_long_prompts=False)

positive_conditioning, positive_pooled = compel(lora_style.style_prompt(text_prompt))
if negative_prompt is None:
    negative_prompt = ("out of frame, text, error, cropped, jpeg artifacts,nout of frame, extra fingers, "
                       "mutated hands, poorly drawn hands, poorly drawn face, blurry, bad anatomy, malformed "
                       "limbs, missing arms, missing legs, extra arms, extra legs, fused fingers, too many "
                       "fingers, long neck, username, watermark, signature.")
negative_conditioning, negative_pooled = compel(negative_prompt)

images = self._background_image_pipe(
    prompt_embeds=positive_conditioning,
    pooled_prompt_embeds=positive_pooled,
    negative_prompt_embeds=negative_conditioning,
    negative_pooled_prompt_embeds=negative_pooled,
    control_image=[depth_image],
    image=reference_image,
    generator=generator,
    num_inference_steps=num_inference_steps,  # steps between 15 and 30 work well for us
    strength=denoising_strength,  # make sure to use `strength` below 1.0 (SDXL has issues when strength is 1.0)
    guidance_scale=guidance_scale,  # how closely to follow the prompt (aka classifier-free guidance scale)
).images
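One possible reading of the error, offered as a sketch and not a confirmed diagnosis: with `truncate_long_prompts=False`, a prompt longer than one CLIP context window is encoded in several 77-token chunks, so the positive embedding here appears to span two chunks (154 = 2 × 77) while the negative one spans a single chunk. The pipeline requires both tensors to have the same shape. The helper names below (`chunk_count`, `padded_shape`) are hypothetical and operate on plain shape tuples only; they illustrate the arithmetic, not Compel's internals.

```python
CLIP_CHUNK_TOKENS = 77  # CLIP context length per chunk, including start/end tokens


def chunk_count(embeds_shape):
    """Return how many 77-token chunks a (batch, tokens, dim) shape contains."""
    _batch, tokens, _dim = embeds_shape
    if tokens % CLIP_CHUNK_TOKENS != 0:
        raise ValueError(f"token axis {tokens} is not a multiple of {CLIP_CHUNK_TOKENS}")
    return tokens // CLIP_CHUNK_TOKENS


def padded_shape(*shapes):
    """Return the common shape after padding every token axis to the longest one."""
    batches = {s[0] for s in shapes}
    dims = {s[2] for s in shapes}
    if len(batches) != 1 or len(dims) != 1:
        raise ValueError("batch and embedding dimensions must already agree")
    return (shapes[0][0], max(s[1] for s in shapes), shapes[0][2])


positive_shape = (1, 154, 2048)  # taken from the error message
negative_shape = (1, 77, 2048)   # taken from the error message

print(chunk_count(positive_shape), chunk_count(negative_shape))  # prints: 2 1
print(padded_shape(positive_shape, negative_shape))              # prints: (1, 154, 2048)

# With Compel itself, the README documents this call for the same purpose,
# to be run before the embeddings are passed to the pipeline:
# positive_conditioning, negative_conditioning = compel.pad_conditioning_tensors_to_same_length(
#     [positive_conditioning, negative_conditioning])
```

If this reading is right, padding both conditioning tensors to the same length before calling the pipeline should satisfy the shape check. Why the same code behaves differently locally and on the remote server is not explained by the shapes alone; a longer styled prompt or different library versions are possibilities, not established causes.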