NVIDIA / Stable-Diffusion-WebUI-TensorRT

TensorRT Extension for Stable Diffusion Web UI

TensorRT gives a Runtime Error when Positive prompt value is 150 and Negative prompt value is 75. #309

Open peteer01 opened 7 months ago

peteer01 commented 7 months ago

Automatic1111, txt2img generation: I am trying to use a text-length of 150 for the positive prompt and 75 for the negative prompt. Generation works successfully when both the positive and negative text-lengths are 75, but it fails when the positive is 150, with the following error:

RuntimeError: The expanded size of the tensor (1) must match the existing size (2) at non-singleton dimension 0. Target sizes: [1, 4, 128, 128]. Tensor sizes: [2, 4, 128, 128]
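
The shape mismatch itself is easy to reproduce in isolation with plain PyTorch. This is only a sketch of the symptom, not the actual WebUI code path (which does the equivalent assignment into `x_out[a:b]` in `sd_samplers_cfg_denoiser.py`, see the traceback below):

```python
import torch

# Sketch of the reported symptom only, not the actual WebUI code path:
# a batch-2 result is written into a batch-1 slice of the output buffer,
# which is what the final RuntimeError in the traceback complains about.
x_out = torch.zeros(2, 4, 128, 128)         # output buffer for cond + uncond
model_output = torch.zeros(2, 4, 128, 128)  # model returned both samples at once

a, b = 0, 1  # slice covering only one sample
try:
    x_out[a:b] = model_output
except RuntimeError as e:
    # "The expanded size of the tensor (1) must match the existing size (2) ..."
    print(e)
```

In the full console output further down, TensorRT rejects the batch-1 bindings before this point, which is presumably how the shapes end up mismatched.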

I have set up multiple profiles trying to resolve this, with no success. I currently have the following entries under Available TensorRT Engine Profiles for the checkpoint I am using (see the engine-side sketch after the tables):

Profile 0

|   | Min | Opt | Max |
| -- | -- | -- | -- |
| Height | 1024 | 1024 | 1024 |
| Width | 1024 | 1024 | 1024 |
| Batch Size | 1 | 1 | 1 |
| Text-length | 75 | 150 | 225 |

Profile 1

|   | Min | Opt | Max |
| -- | -- | -- | -- |
| Height | 1024 | 1024 | 1024 |
| Width | 1024 | 1024 | 1024 |
| Batch Size | 1 | 1 | 1 |
| Text-length | 150 | 150 | 150 |

Profile 2

|   | Min | Opt | Max |
| -- | -- | -- | -- |
| Height | 1024 | 1024 | 1024 |
| Width | 1024 | 1024 | 1024 |
| Batch Size | 1 | 1 | 1 |
| Text-length | 75 | 75 | 75 |

Profile 3

|   | Min | Opt | Max |
| -- | -- | -- | -- |
| Height | 1024 | 1024 | 1024 |
| Width | 1024 | 1024 | 1024 |
| Batch Size | 1 | 1 | 1 |
| Text-length | 225 | 225 | 225 |
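
For reference, here is roughly what a profile like Profile 0 corresponds to on the TensorRT side. This is only a sketch: the input names (`sample`, `timesteps`, `encoder_hidden_states`, `y`) are assumptions inferred from the binding shapes in the log below, not taken from the extension's export code. The notable part is that the batch axis the engine validates appears to be 2 × the UI batch size (cond and uncond batched together), judging by the "minimum/maximum dimension in profile is 2" messages in the log:

```python
import tensorrt as trt

# Sketch of an optimization profile roughly matching "Profile 0" above.
# Input names are assumptions; dims follow the binding shapes in the log below.
logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)
config = builder.create_builder_config()
profile = builder.create_optimization_profile()

# Latent input: batch axis is 2 * (UI batch size 1) for cond + uncond;
# 128x128 latents correspond to 1024x1024 images.
profile.set_shape("sample", (2, 4, 128, 128), (2, 4, 128, 128), (2, 4, 128, 128))
# Timestep: one value per batch element.
profile.set_shape("timesteps", (2,), (2,), (2,))
# Text embeddings: text-length 75/150/225 maps to 77/154/231 tokens
# once the per-chunk special tokens are included.
profile.set_shape("encoder_hidden_states", (2, 77, 2048), (2, 154, 2048), (2, 231, 2048))
# SDXL pooled/added conditioning vector.
profile.set_shape("y", (2, 2816), (2, 2816), (2, 2816))

config.add_optimization_profile(profile)

# At runtime, supplying a batch of 1 (e.g. a (1, 4, 128, 128) latent) fails
# the min <= dim <= max check on the batch axis, which is the
# "supplied dimension is 1" error shown in the console output.
```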

If I reduce the positive prompt to 75 text-length but make the negative 150, it also fails.

If I add filler words so that both the positive and negative prompts have a text-length of 150, then generation works successfully.
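
A rough way to check which 75-token bucket a prompt falls into before padding it is sketched below. This assumes the stock Hugging Face CLIP tokenizer as an approximation; the WebUI's own tokenizer handles emphasis syntax and may count slightly differently:

```python
from transformers import CLIPTokenizer

# Sketch: count prompt tokens the way the WebUI's 75-token chunking roughly
# works, so positive and negative prompts can be padded to the same bucket.
# Uses the stock CLIP tokenizer as an approximation of the WebUI's tokenizer.
tokenizer = CLIPTokenizer.from_pretrained("openai/clip-vit-large-patch14")

def text_length_bucket(prompt: str, chunk: int = 75) -> int:
    # Drop the BOS/EOS tokens the tokenizer adds, then round up to whole chunks.
    n_tokens = len(tokenizer(prompt)["input_ids"]) - 2
    n_chunks = max(1, -(-n_tokens // chunk))
    return n_chunks * chunk  # 75, 150, 225, ...

positive = "test words, sample words, positive prompt, generate image"
negative = "test words, sample words"
print(text_length_bucket(positive), text_length_bucket(negative))
```

Padding the shorter prompt until both report the same bucket is the workaround described above.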

My understanding is that Profile 0 should support 150 positive and 75 negative, but this does not work in practice. Here is the full console output when trying to generate an image with this mix (text-length 150 for the positive prompt and 75 for the negative):

  0%|                                                                                           | 0/20 [00:00<?, ?it/s][E] 3: [executionContext.cpp::nvinfer1::rt::ExecutionContext::validateInputBindings::2046] Error Code 3: API Usage Error (Parameter check failed at: runtime/api/executionContext.cpp::nvinfer1::rt::ExecutionContext::validateInputBindings::2046, condition: profileMinDims.d[i] <= dimensions.d[i]. Supplied binding dimension [1,4,128,128] for bindings[0] exceed min ~ max range at index 0, maximum dimension in profile is 2, minimum dimension in profile is 2, but supplied dimension is 1.
    )
[E] 3: [executionContext.cpp::nvinfer1::rt::ExecutionContext::validateInputBindings::2046] Error Code 3: API Usage Error (Parameter check failed at: runtime/api/executionContext.cpp::nvinfer1::rt::ExecutionContext::validateInputBindings::2046, condition: profileMinDims.d[i] <= dimensions.d[i]. Supplied binding dimension [1] for bindings[1] exceed min ~ max range at index 0, maximum dimension in profile is 2, minimum dimension in profile is 2, but supplied dimension is 1.
    )
[E] 3: [executionContext.cpp::nvinfer1::rt::ExecutionContext::validateInputBindings::2046] Error Code 3: API Usage Error (Parameter check failed at: runtime/api/executionContext.cpp::nvinfer1::rt::ExecutionContext::validateInputBindings::2046, condition: profileMinDims.d[i] <= dimensions.d[i]. Supplied binding dimension [1,154,2048] for bindings[2] exceed min ~ max range at index 0, maximum dimension in profile is 2, minimum dimension in profile is 2, but supplied dimension is 1.
    )
[E] 3: [executionContext.cpp::nvinfer1::rt::ExecutionContext::validateInputBindings::2046] Error Code 3: API Usage Error (Parameter check failed at: runtime/api/executionContext.cpp::nvinfer1::rt::ExecutionContext::validateInputBindings::2046, condition: profileMinDims.d[i] <= dimensions.d[i]. Supplied binding dimension [1,2816] for bindings[3] exceed min ~ max range at index 0, maximum dimension in profile is 2, minimum dimension in profile is 2, but supplied dimension is 1.
    )
  0%|                                                                                           | 0/20 [00:00<?, ?it/s]
*** Error completing request
*** Arguments: ('task(xjdn1sa1n2skykw)', <gradio.routes.Request object at 0x000001BC400FC6D0>, 'PTA_1\ntest words, sample words, positive prompt, generate image, test words, sample words, positive prompt, generate image, test words, sample words, positive prompt, generate image, test words, sample words, positive prompt, generate image, test words, sample words, positive prompt, generate image, test words, sample words, positive prompt, generate image, test words, sample words, positive prompt, generate image, \n <lora:PTA_1-26:1>', 'test words, sample words, positive prompt, generate image, test words, sample words, positive prompt, generate image, test words, sample words, positive prompt, generate image, test words, sample words, positive prompt, generate image, test words, sample words, positive prompt, generate image, test words, sample words, positive prompt, generate image, test words', [], 20, 'DPM++ 2M Karras', 1, 1, 5.5, 1024, 1024, False, 0.7, 2, 'Latent', 0, 0, 0, 'Use same checkpoint', 'Use same sampler', '', '', [], 0, False, '', 0.8, 343205269, False, -1, 0, 0, 0, UiControlNetUnit(enabled=False, module='none', model='None', weight=1, image=None, resize_mode='Crop and Resize', low_vram=False, processor_res=-1, threshold_a=-1, threshold_b=-1, guidance_start=0, guidance_end=1, pixel_perfect=False, control_mode='Balanced', inpaint_crop_input_image=False, hr_option='Both', save_detected_map=True, advanced_weighting=None), UiControlNetUnit(enabled=False, module='none', model='None', weight=1, image=None, resize_mode='Crop and Resize', low_vram=False, processor_res=-1, threshold_a=-1, threshold_b=-1, guidance_start=0, guidance_end=1, pixel_perfect=False, control_mode='Balanced', inpaint_crop_input_image=False, hr_option='Both', save_detected_map=True, advanced_weighting=None), UiControlNetUnit(enabled=False, module='none', model='None', weight=1, image=None, resize_mode='Crop and Resize', low_vram=False, processor_res=-1, threshold_a=-1, threshold_b=-1, guidance_start=0, guidance_end=1, pixel_perfect=False, control_mode='Balanced', inpaint_crop_input_image=False, hr_option='Both', save_detected_map=True, advanced_weighting=None), False, False, 'positive', 'comma', 0, False, False, 'start', '', 1, '', [], 0, '', [], 0, '', [], True, False, False, False, False, False, False, 0, False, None, None, False, None, None, False, None, None, False, 50) {}
    Traceback (most recent call last):
      File "C:\Users\petee\stable-diffusion-webui\modules\call_queue.py", line 57, in f
        res = list(func(*args, **kwargs))
      File "C:\Users\petee\stable-diffusion-webui\modules\call_queue.py", line 36, in f
        res = func(*args, **kwargs)
      File "C:\Users\petee\stable-diffusion-webui\modules\txt2img.py", line 110, in txt2img
        processed = processing.process_images(p)
      File "C:\Users\petee\stable-diffusion-webui\extensions\sd-webui-prompt-history\lib_history\image_process_hijacker.py", line 21, in process_images
        res = original_function(p)
      File "C:\Users\petee\stable-diffusion-webui\modules\processing.py", line 785, in process_images
        res = process_images_inner(p)
      File "C:\Users\petee\stable-diffusion-webui\extensions\sd-webui-controlnet\scripts\batch_hijack.py", line 59, in processing_process_images_hijack
        return getattr(processing, '__controlnet_original_process_images_inner')(p, *args, **kwargs)
      File "C:\Users\petee\stable-diffusion-webui\modules\processing.py", line 921, in process_images_inner
        samples_ddim = p.sample(conditioning=p.c, unconditional_conditioning=p.uc, seeds=p.seeds, subseeds=p.subseeds, subseed_strength=p.subseed_strength, prompts=p.prompts)
      File "C:\Users\petee\stable-diffusion-webui\modules\processing.py", line 1257, in sample
        samples = self.sampler.sample(self, x, conditioning, unconditional_conditioning, image_conditioning=self.txt2img_image_conditioning(x))
      File "C:\Users\petee\stable-diffusion-webui\modules\sd_samplers_kdiffusion.py", line 234, in sample
        samples = self.launch_sampling(steps, lambda: self.func(self.model_wrap_cfg, x, extra_args=self.sampler_extra_args, disable=False, callback=self.callback_state, **extra_params_kwargs))
      File "C:\Users\petee\stable-diffusion-webui\modules\sd_samplers_common.py", line 261, in launch_sampling
        return func()
      File "C:\Users\petee\stable-diffusion-webui\modules\sd_samplers_kdiffusion.py", line 234, in <lambda>
        samples = self.launch_sampling(steps, lambda: self.func(self.model_wrap_cfg, x, extra_args=self.sampler_extra_args, disable=False, callback=self.callback_state, **extra_params_kwargs))
      File "C:\Users\petee\stable-diffusion-webui\venv\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
        return func(*args, **kwargs)
      File "C:\Users\petee\stable-diffusion-webui\repositories\k-diffusion\k_diffusion\sampling.py", line 594, in sample_dpmpp_2m
        denoised = model(x, sigmas[i] * s_in, **extra_args)
      File "C:\Users\petee\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
        return self._call_impl(*args, **kwargs)
      File "C:\Users\petee\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1527, in _call_impl
        return forward_call(*args, **kwargs)
      File "C:\Users\petee\stable-diffusion-webui\modules\sd_samplers_cfg_denoiser.py", line 256, in forward
        x_out[a:b] = self.inner_model(x_in[a:b], sigma_in[a:b], cond=make_condition_dict(c_crossattn, image_cond_in[a:b]))
    RuntimeError: The expanded size of the tensor (1) must match the existing size (2) at non-singleton dimension 0.  Target sizes: [1, 4, 128, 128].  Tensor sizes: [2, 4, 128, 128]

---

Is this a bug? How do I get TensorRT to process this mix of positive and negative prompt text-lengths?