openvinotoolkit / stable-diffusion-webui

Stable Diffusion web UI
GNU Affero General Public License v3.0

Prompt Size not Equal to Each Other #95

Open Truebob opened 5 months ago

Truebob commented 5 months ago

Is there an existing issue for this?

What happened?

Generation fails when the positive prompt's token count does not land in the same chunk denominator as the negative prompt's (the fraction shown in the UI, e.g. 100/150 vs 0/75). There is no hard maximum on prompt length; the problem is how the two prompts' token counts are arranged into chunks.

Steps to reproduce the problem

Using the automatic1111 UI, go over 75 tokens in the positive prompt and leave the negative prompt blank. I then get 100/150 for the positive prompt and 0/75 for the negative prompt, and generation fails. It is not a maximum token size for each prompt; rather, the two denominators must be equal. For example, 1/75 with 33/75 works, and 100/150 with 81/150 works, but combinations with different denominators fail, e.g. 1/75 with 100/150, or 100/150 with 1/75. The chunk arithmetic behind those denominators is sketched below.
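
For context, a rough sketch of the arithmetic (illustrative only, not actual webui or OpenVINO code): prompts are encoded in 75-token chunks, and each chunk becomes 77 CLIP embeddings (75 tokens plus start/end tokens), which is where the 154 vs 77 shapes in the error come from.

```python
# Illustrative arithmetic only, not webui code: prompts are split into
# 75-token chunks and each chunk is encoded as 77 CLIP embeddings
# (75 tokens plus BOS/EOS), so the embedding length is a multiple of 77.
def embedding_length(token_count: int, chunk_size: int = 75) -> int:
    chunks = max(1, -(-token_count // chunk_size))  # ceil division, minimum one chunk
    return chunks * (chunk_size + 2)

print(embedding_length(100))  # 154 -> prompt_embeds torch.Size([1, 154, 768])
print(embedding_length(0))    # 77  -> negative_prompt_embeds torch.Size([1, 77, 768])
```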

What should have happened?

It should work no matter what size of prompt for positive or negative.

Sysinfo

ValueError: prompt_embeds and negative_prompt_embeds must have the same shape when passed directly, but got: prompt_embeds torch.Size([1, 154, 768]) != negative_prompt_embeds torch.Size([1, 77, 768]).

What browsers do you use to access the UI?

No response

Console logs

*** Error completing request
*** Arguments: ('task(1rvficmfnyi143s)', '(RAW photo:1.2),(photorealistic:1.4),(masterpiece:1.3),(best quality:1.4),ultra high res, HDR,8k resolution,\ndreamlike, check commentary, commentary request, scenery,((no text)),\n1girl,  (cleavage:1.5), (large breasts:1.5), (pubic hair:1.5),(lifting shirt), (detailed laced underpants:1.4), (full body), (looking down:1.5), (close up), look at the viewer, naughty face, (touching self hair:1.3), (tattoo:1.3), topless, arm, (close up:1.5), (focus on breasts),\n(detailed eyes),(detailed facial features), (detailed clothes features), (breast blush)', '', [], 20, 'DPM++ 2M Karras', 1, 1, 7, 512, 512, False, 0.7, 2, 'Latent', 0, 0, 0, 'Use same checkpoint', 'Use same sampler', '', '', [], <gradio.routes.Request object at 0x000002260E765600>, 1, False, '', 0.8, 2668558297, False, -1, 0, 0, 0, False, False, {'ad_model': 'face_yolov8n.pt', 'ad_prompt': '', 'ad_negative_prompt': '', 'ad_confidence': 0.3, 'ad_mask_k_largest': 0, 'ad_mask_min_ratio': 0, 'ad_mask_max_ratio': 1, 'ad_x_offset': 0, 'ad_y_offset': 0, 'ad_dilate_erode': 4, 'ad_mask_merge_invert': 'None', 'ad_mask_blur': 4, 'ad_denoising_strength': 0.4, 'ad_inpaint_only_masked': True, 'ad_inpaint_only_masked_padding': 32, 'ad_use_inpaint_width_height': False, 'ad_inpaint_width': 512, 'ad_inpaint_height': 512, 'ad_use_steps': False, 'ad_steps': 28, 'ad_use_cfg_scale': False, 'ad_cfg_scale': 7, 'ad_use_checkpoint': False, 'ad_checkpoint': 'Use same checkpoint', 'ad_use_vae': False, 'ad_vae': 'Use same VAE', 'ad_use_sampler': False, 'ad_sampler': 'DPM++ 2M Karras', 'ad_use_noise_multiplier': False, 'ad_noise_multiplier': 1, 'ad_use_clip_skip': False, 'ad_clip_skip': 1, 'ad_restore_face': False, 'ad_controlnet_model': 'None', 'ad_controlnet_module': 'None', 'ad_controlnet_weight': 1, 'ad_controlnet_guidance_start': 0, 'ad_controlnet_guidance_end': 1, 'is_api': ()}, {'ad_model': 'None', 'ad_prompt': '', 'ad_negative_prompt': '', 'ad_confidence': 0.3, 'ad_mask_k_largest': 0, 'ad_mask_min_ratio': 0, 'ad_mask_max_ratio': 1, 'ad_x_offset': 0, 'ad_y_offset': 0, 'ad_dilate_erode': 4, 'ad_mask_merge_invert': 'None', 'ad_mask_blur': 4, 'ad_denoising_strength': 0.4, 'ad_inpaint_only_masked': True, 'ad_inpaint_only_masked_padding': 32, 'ad_use_inpaint_width_height': False, 'ad_inpaint_width': 512, 'ad_inpaint_height': 512, 'ad_use_steps': False, 'ad_steps': 28, 'ad_use_cfg_scale': False, 'ad_cfg_scale': 7, 'ad_use_checkpoint': False, 'ad_checkpoint': 'Use same checkpoint', 'ad_use_vae': False, 'ad_vae': 'Use same VAE', 'ad_use_sampler': False, 'ad_sampler': 'DPM++ 2M Karras', 'ad_use_noise_multiplier': False, 'ad_noise_multiplier': 1, 'ad_use_clip_skip': False, 'ad_clip_skip': 1, 'ad_restore_face': False, 'ad_controlnet_model': 'None', 'ad_controlnet_module': 'None', 'ad_controlnet_weight': 1, 'ad_controlnet_guidance_start': 0, 'ad_controlnet_guidance_end': 1, 'is_api': ()}, UiControlNetUnit(enabled=False, module='none', model='None', weight=1, image=None, resize_mode='Crop and Resize', low_vram=False, processor_res=-1, threshold_a=-1, threshold_b=-1, guidance_start=0, guidance_end=1, pixel_perfect=False, control_mode='Balanced', inpaint_crop_input_image=False, hr_option='Both', save_detected_map=True, advanced_weighting=None), UiControlNetUnit(enabled=False, module='none', model='None', weight=1, image=None, resize_mode='Crop and Resize', low_vram=False, processor_res=-1, threshold_a=-1, threshold_b=-1, guidance_start=0, guidance_end=1, pixel_perfect=False, control_mode='Balanced', inpaint_crop_input_image=False, 
hr_option='Both', save_detected_map=True, advanced_weighting=None), UiControlNetUnit(enabled=False, module='none', model='None', weight=1, image=None, resize_mode='Crop and Resize', low_vram=False, processor_res=-1, threshold_a=-1, threshold_b=-1, guidance_start=0, guidance_end=1, pixel_perfect=False, control_mode='Balanced', inpaint_crop_input_image=False, hr_option='Both', save_detected_map=True, advanced_weighting=None), 'None', 'None', 'GPU', True, 'Euler a', True, False, 'None', 0.8, False, False, 'positive', 'comma', 0, False, False, '', 1, '', [], 0, '', [], 0, '', [], True, False, False, False, 0, False, None, None, False, None, None, False, None, None, False, 50) {}
    Traceback (most recent call last):
      File "D:\Mark10\stable-diffusion-webui\modules\call_queue.py", line 57, in f
        res = list(func(*args, **kwargs))
      File "D:\Mark10\stable-diffusion-webui\modules\call_queue.py", line 36, in f
        res = func(*args, **kwargs)
      File "D:\Mark10\stable-diffusion-webui\modules\txt2img.py", line 52, in txt2img
        processed = modules.scripts.scripts_txt2img.run(p, *args)
      File "D:\Mark10\stable-diffusion-webui\modules\scripts.py", line 601, in run
        processed = script.run(p, *script_args)
      File "D:\Mark10\stable-diffusion-webui\scripts\openvino_accelerate.py", line 1228, in run
        processed = process_images_openvino(p, model_config, vae_ckpt, p.sampler_name, enable_caching, openvino_device, mode, is_xl_ckpt, refiner_ckpt, refiner_frac)
      File "D:\Mark10\stable-diffusion-webui\scripts\openvino_accelerate.py", line 979, in process_images_openvino
        output = shared.sd_diffusers_model(
      File "D:\Mark10\stable-diffusion-webui\venv\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
        return func(*args, **kwargs)
      File "D:\Mark10\stable-diffusion-webui\venv\lib\site-packages\diffusers\pipelines\stable_diffusion\pipeline_stable_diffusion.py", line 754, in __call__
        self.check_inputs(
      File "D:\Mark10\stable-diffusion-webui\venv\lib\site-packages\diffusers\pipelines\stable_diffusion\pipeline_stable_diffusion.py", line 530, in check_inputs
        raise ValueError(
    ValueError: `prompt_embeds` and `negative_prompt_embeds` must have the same shape when passed directly, but got: `prompt_embeds` torch.Size([1, 154, 768]) != `negative_prompt_embeds` torch.Size([1, 77, 768]).

Additional information

Please provide an easy way to solve this in the Stable Diffusion OpenVINO toolkit. I am a newbie at editing Python code for OpenVINO, so I may struggle. An easy drop-in replacement script in the Stable Diffusion files, or a minor edit that works with all models, LoRAs, etc., would be ideal. Thanks for any reply.
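
For anyone who wants to experiment while waiting for an official fix: the error is raised by diffusers' check_inputs, which requires prompt_embeds and negative_prompt_embeds to have the same shape. A minimal, unofficial workaround sketch (the helper name is hypothetical, and this is not the openvino_accelerate.py fix) is to pad the shorter tensor along the sequence dimension before the pipeline call:

```python
import torch

def pad_to_same_length(prompt_embeds: torch.Tensor,
                       negative_prompt_embeds: torch.Tensor):
    # Hypothetical workaround sketch: repeat the last embedding row of the
    # shorter tensor until both tensors share the same sequence length (dim=1),
    # which satisfies diffusers' check_inputs shape check.
    target = max(prompt_embeds.shape[1], negative_prompt_embeds.shape[1])

    def pad(t: torch.Tensor) -> torch.Tensor:
        if t.shape[1] == target:
            return t
        filler = t[:, -1:, :].repeat(1, target - t.shape[1], 1)
        return torch.cat([t, filler], dim=1)

    return pad(prompt_embeds), pad(negative_prompt_embeds)
```

Padding with the last embedding row is only an approximation; a proper fix should keep the chunking of both prompts consistent, but this avoids the ValueError.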

sebaxakerhtc commented 5 months ago

same here

cavusmustafa commented 5 months ago

Hello, thank you for reporting the issue. Could you also provide an exact reproducer with the prompts used to generate this error?

mysfitt commented 5 months ago

@cavusmustafa I am also getting the exact same error. I'll provide the prompts and console output inline here and would be happy to assist with debugging. The bug occurs when the positive and negative prompts occupy an unequal number of 75-token chunks. In my example below, the positive prompt fits in the first 75-token chunk while the negative prompt extends into a second chunk (150 tokens). Note that if the positive prompt also extends into the second chunk, so that both are logically 150 tokens, the script works. Any time they are unequal, it fails.

Positive Prompt:

man, cinematic, hyperrealistic, skin detailed, photo, hdr, cinematography, raining

Negative Prompt:

deformed, bad anatomy, disfigured, poorly drawn face, mutation, mutated, extra limb, ugly, disgusting, poorly drawn hands, missing limb, floating limbs, disconnected limbs, malformed hands, blurry, (mutated hands and fingers), watermark, watermarked, oversaturated, censored, distorted hands, amputation, missing hands, obese, doubled face, double hands

Steps to reproduce the problem

Use the prompts supplied above, or any other pair of prompts where one extends past the initial 75-token limit into a second chunk and the other does not. The issue is very easy to reproduce when one of your prompts is long and the other is short; the token-counting sketch below makes it easy to check whether a given pair triggers it.
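
To check whether two prompts fall into the same number of 75-token chunks, you can count tokens with the standard CLIP tokenizer. This is a rough sketch using the transformers tokenizer; the webui's own counter also strips attention/weight syntax, so its numbers may differ slightly:

```python
from transformers import CLIPTokenizer

tokenizer = CLIPTokenizer.from_pretrained("openai/clip-vit-large-patch14")

def chunk_count(prompt: str, chunk_size: int = 75) -> int:
    # Count tokens without BOS/EOS, then round up to the number of 75-token chunks.
    tokens = tokenizer(prompt).input_ids[1:-1]
    return max(1, -(-len(tokens) // chunk_size))

positive = "man, cinematic, hyperrealistic, skin detailed, photo, hdr, cinematography, raining"
negative = ("deformed, bad anatomy, disfigured, poorly drawn face, mutation, mutated, "
            "extra limb, ugly, disgusting, poorly drawn hands, missing limb, floating limbs, "
            "disconnected limbs, malformed hands, blurry, (mutated hands and fingers), "
            "watermark, watermarked, oversaturated, censored, distorted hands, amputation, "
            "missing hands, obese, doubled face, double hands")

# The error in this thread occurs when these two counts differ.
print(chunk_count(positive), chunk_count(negative))
```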

Full console output, including prompts

OpenVINO Script:  created model from config : /home/joss/Software/AI/automatic1111/stable-diffusion-webui/configs/v1-inference.yaml
*** Error completing request
*** Arguments: ('task(09rv9j7ci6ly1ls)', 'man, cinematic, hyperrealistic, skin detailed, photo, hdr, cinematography, raining', 'deformed, bad anatomy, disfigured, poorly drawn face, mutation, mutated, extra limb, ugly, disgusting, poorly drawn hands, missing limb, floating limbs, disconnected limbs, malformed hands, blurry, (mutated hands and fingers), watermark, watermarked, oversaturated, censored, distorted hands, amputation, missing hands, obese, doubled face, double hands', [], 35, 'DDIM', 1, 1, 7, 512, 904, False, 0.7, 2.1, 'R-ESRGAN 4x+ Anime6B', 0, 0, 0, 'Use same checkpoint', 'Use same sampler', '', '', [], <gradio.routes.Request object at 0x7f3c5185c850>, 1, False, '', 0.8, -1, False, -1, 0, 0, 0, False, False, {'ad_model': 'Eyeful_v1.pt', 'ad_prompt': '', 'ad_negative_prompt': '', 'ad_confidence': 0.3, 'ad_mask_k_largest': 0, 'ad_mask_min_ratio':0, 'ad_mask_max_ratio': 1, 'ad_x_offset': 0, 'ad_y_offset': 0, 'ad_dilate_erode': 4, 'ad_mask_merge_invert': 'None', 'ad_mask_blur': 4, 'ad_denoising_strength': 0.4, 'ad_inpaint_only_masked': True, 'ad_inpaint_only_masked_padding': 32, 'ad_use_inpaint_width_height': False, 'ad_inpaint_width': 512, 'ad_inpaint_height': 512, 'ad_use_steps': False,'ad_steps': 28, 'ad_use_cfg_scale': False, 'ad_cfg_scale': 7, 'ad_use_checkpoint': False, 'ad_checkpoint': 'Use same checkpoint', 'ad_use_vae': False, 'ad_vae': 'Use same VAE', 'ad_use_sampler': False, 'ad_sampler': 'DPM++ 2M Karras', 'ad_use_noise_multiplier': False, 'ad_noise_multiplier': 1, 'ad_use_clip_skip': False, 'ad_clip_skip': 1, 'ad_restore_face': False, 'ad_controlnet_model': 'None', 'ad_controlnet_module': 'None', 'ad_controlnet_weight': 1, 'ad_controlnet_guidance_start': 0, 'ad_controlnet_guidance_end':1, 'is_api': ()}, {'ad_model': 'None', 'ad_prompt': '', 'ad_negative_prompt': '', 'ad_confidence': 0.3, 'ad_mask_k_largest': 0, 'ad_mask_min_ratio': 0, 'ad_mask_max_ratio': 1, 'ad_x_offset': 0, 'ad_y_offset': 0, 'ad_dilate_erode': 4, 'ad_mask_merge_invert': 'None', 'ad_mask_blur': 4, 'ad_denoising_strength': 0.4, 'ad_inpaint_only_masked': True, 'ad_inpaint_only_masked_padding': 32, 'ad_use_inpaint_width_height': False, 'ad_inpaint_width': 512, 'ad_inpaint_height': 512, 'ad_use_steps': False, 'ad_steps': 28, 'ad_use_cfg_scale': False, 'ad_cfg_scale': 7, 'ad_use_checkpoint': False, 'ad_checkpoint': 'Use same checkpoint', 'ad_use_vae': False, 'ad_vae': 'Use same VAE', 'ad_use_sampler': False, 'ad_sampler': 'DPM++ 2M Karras', 'ad_use_noise_multiplier': False, 'ad_noise_multiplier': 1, 'ad_use_clip_skip': False, 'ad_clip_skip': 1, 'ad_restore_face': False, 'ad_controlnet_model': 'None', 'ad_controlnet_module': 'None', 'ad_controlnet_weight': 1, 'ad_controlnet_guidance_start': 0, 'ad_controlnet_guidance_end': 1, 'is_api': ()}, False,False, 'Matrix', 'Columns', 'Mask', 'Prompt', '1,1', '0.2', False, False, False, 'Attention', [False], '0', '0', '0.4', None, '0', '0', False, 'None', 'None', 'GPU', True, 'DPM++ 2M Karras', True, False, 'None', 0.8, False, False, 'positive', 'comma', 0, False, False, '', 1, '', [], 0, '', [], 0, '', [], True, False, False, False, 0, False, [], 30, '', 4, [], 1, '', '', '', '', 'Positive', 0, ', ', 'Generate and always save', 32) {}
    Traceback (most recent call last):
      File "/home/joss/Software/AI/automatic1111/stable-diffusion-webui/modules/call_queue.py", line 57, in f
        res = list(func(*args, **kwargs))
      File "/home/joss/Software/AI/automatic1111/stable-diffusion-webui/modules/call_queue.py", line 36, in f
        res = func(*args, **kwargs)
      File "/home/joss/Software/AI/automatic1111/stable-diffusion-webui/modules/txt2img.py", line 52, in txt2img
        processed = modules.scripts.scripts_txt2img.run(p, *args)
      File "/home/joss/Software/AI/automatic1111/stable-diffusion-webui/modules/scripts.py", line 601, in run
        processed = script.run(p, *script_args)
      File "/home/joss/Software/AI/automatic1111/stable-diffusion-webui/scripts/openvino_accelerate.py", line 1228, in run
        processed = process_images_openvino(p, model_config, vae_ckpt, p.sampler_name, enable_caching, openvino_device, mode, is_xl_ckpt, refiner_ckpt, refiner_frac)
      File "/home/joss/Software/AI/automatic1111/stable-diffusion-webui/scripts/openvino_accelerate.py", line 979, in process_images_openvino
        output = shared.sd_diffusers_model(
      File "/home/joss/Software/AI/automatic1111/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
        return func(*args, **kwargs)
      File "/home/joss/Software/AI/automatic1111/stable-diffusion-webui/venv/lib/python3.10/site-packages/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion.py",line 754, in __call__
        self.check_inputs(
      File "/home/joss/Software/AI/automatic1111/stable-diffusion-webui/venv/lib/python3.10/site-packages/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion.py",line 530, in check_inputs
        raise ValueError(
    ValueError: `prompt_embeds` and `negative_prompt_embeds` must have the same shape when passed directly, but got: `prompt_embeds` torch.Size([1, 77, 768]) != `negative_prompt_embeds` torch.Size([1, 154, 768]).
AndrewRainfall commented 4 months ago

It's not a fix, but it works for me if I remove all negative embeddings (like "badhandv4", "CyberRealistic_Negative-neg", etc.). It also helps to remove all BREAK statements and to shorten the negative prompt.

But it doesn't always work.

ymh1028 commented 4 months ago

Same problem; any negative prompt input reports this error.

sebaxakerhtc commented 4 months ago

> Same problem; any negative prompt input reports this error.

Not any negative prompt, only an unequal one. Say you have 30 tokens in the positive prompt; try using the same number of tokens in the negative prompt.

luocaodan commented 3 months ago

I met the same issue.

chinmoy-gavini commented 2 weeks ago

Working on a fix for this, will update on this thread