Closed: SAC020 closed this issue 8 months ago
this is due to part of the model being in VRAM and part in system RAM. what are your settings for model move/offload: medvram or lowvram?
I use medvram. Nvidia 4080 with 16GB of VRAM
control module is not yet compatible with automatic offloading that is enabled with medvram.
is there something I can do?
until it's fully implemented, you can run without medvram and use the manual model move options instead of offloading. it will still save some VRAM, just not as much.
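For context, a minimal sketch of the difference at the plain diffusers level (illustrative only, not SD.Next's actual settings code; the model id is a placeholder):

```python
# Sketch: automatic offloading vs. manual model move, plain diffusers API.
import torch
from diffusers import StableDiffusionXLPipeline

pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
)

# Automatic offloading (what medvram/lowvram enable): accelerate hooks shuttle
# submodules to the GPU only while they run. Saves the most VRAM, but the hooks
# conflict with code that later calls pipe.to("cuda") -- the error in this thread.
# pipe.enable_model_cpu_offload()        # medvram-style
# pipe.enable_sequential_cpu_offload()   # lowvram-style

# Manual move: keep the pipeline on CPU between jobs and move it to the GPU only
# for the duration of a generation. Saves some VRAM, just not as much.
pipe.to("cuda")
image = pipe(prompt="a photo of a cat", num_inference_steps=6).images[0]
pipe.to("cpu")
torch.cuda.empty_cache()
```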
Does this apply to inpainting too? Because I seem to get the same error on inpainting as well
this should be addressed in dev branch now.
thank you
it's possible (likely?) that there are borderline scenarios where offloading still causes issues, but they need to be taken one at a time, so please report any you find.
Got a problem using Canny XS
Same with Canny Mid XL
07:37:10-407893 DEBUG Control ControlNet model loading: id="Canny Mid XL" path="diffusers/controlnet-canny-sdxl-1.0-mid"
07:37:12-004216 DEBUG Control ControlNet model loaded: id="Canny Mid XL" path="diffusers/controlnet-canny-sdxl-1.0-mid"
                      time=1.60
07:37:13-036848 DEBUG Control input: type=PIL.Image input=[<PIL.Image.Image image mode=RGB size=1024x1024 at
                      0x22CE2759A50>]
07:37:13-660455 DEBUG Control ControlNet unit: i=1 process=Canny model=Canny Mid XL strength=1.0 guess=False start=0
                      end=1
07:37:13-667895 ERROR Control exception: It seems like you have activated sequential model offloading by calling
                      `enable_sequential_cpu_offload`, but are now attempting to move the pipeline to GPU. This is not
                      compatible with offloading. Please, move your pipeline `.to('cpu')` or consider removing the move
                      altogether if you use sequential offloading.
07:37:13-670375 ERROR Control: ValueError
╭───────────────────────────────────────── Traceback (most recent call last) ──────────────────────────────────────────╮
│ C:\ai\automatic\modules\ui_control.py:54 in generate_click                                                            │
│                                                                                                                       │
│    53 │   │   try:                                                                                                    │
│ ❱  54 │   │   │   for results in control_run(units, helpers.input_source, helpers.input_init, helpers.input_mask, ac │
│    55 │   │   │   │   progress.record_results(job_id, results)                                                        │
│                                                                                                                       │
│ C:\ai\automatic\modules\control\run.py:198 in control_run                                                             │
│                                                                                                                       │
│   197 │   │   p.task_args['guess_mode'] = p.guess_mode                                                                │
│ ❱ 198 │   │   instance = controlnet.ControlNetPipeline(selected_models, shared.sd_model)                              │
│   199 │   │   pipe = instance.pipeline                                                                                │
│                                                                                                                       │
│ C:\ai\automatic\modules\control\units\controlnet.py:196 in __init__                                                   │
│                                                                                                                       │
│   195 │   │   │   │   controlnet=controlnet, # can be a list                                                          │
│ ❱ 196 │   │   │   ).to(pipeline.device)                                                                               │
│   197 │   │   elif detect.is_sd15(pipeline):                                                                          │
│                                                                                                                       │
│ C:\ai\automatic\venv\lib\site-packages\diffusers\pipelines\pipeline_utils.py:855 in to                                │
│                                                                                                                       │
│   854 │   │   if pipeline_is_sequentially_offloaded and device and torch.device(device).type == "cuda":               │
│ ❱ 855 │   │   │   raise ValueError(                                                                                   │
│   856 │   │   │   │   "It seems like you have activated sequential model offloading by calling `enable_sequential_c  │
╰───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
ValueError: It seems like you have activated sequential model offloading by calling `enable_sequential_cpu_offload`, but are now attempting to move the pipeline to GPU. This is not compatible with offloading. Please, move your pipeline `.to('cpu')` or consider removing the move altogether if you use sequential offloading.
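For reference, the incompatibility the error describes is reproducible with plain diffusers, independent of SD.Next. A minimal sketch (the model id is a placeholder, and `remove_all_hooks` availability depends on the diffusers version):

```python
import torch
from diffusers import StableDiffusionXLPipeline

pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
)
pipe.enable_sequential_cpu_offload()  # installs accelerate offload hooks

try:
    pipe.to("cuda")  # offloaded pipelines refuse an explicit move to the GPU
except ValueError as e:
    print(e)  # "It seems like you have activated sequential model offloading..."

# One way to recover before rebuilding the pipeline with a controlnet attached:
# drop the offload hooks first, then move (recent diffusers versions).
pipe.remove_all_hooks()
pipe.to("cuda")
```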
pushed another update
Not sure if related to the same root cause
PS C:\ai\automatic> .\webui.bat --medvram --debug
Using VENV: C:\ai\automatic\venv
16:42:52-845340 INFO Starting SD.Next
16:42:52-848813 INFO Logger: file="C:\ai\automatic\sdnext.log" level=DEBUG size=65 mode=create
16:42:52-849805 INFO Python 3.10.11 on Windows
16:42:53-076344 INFO Version: app=sd.next updated=2024-03-03 hash=f7ea7620
url=https://github.com/vladmandic/automatic/tree/dev
16:42:53-807877 INFO Latest published version: 912237ecf7d5b3616a272f983f3f59cc405f64c3 2024-03-01T14:24:24Z
16:42:53-820774 INFO Platform: arch=AMD64 cpu=Intel64 Family 6 Model 165 Stepping 5, GenuineIntel system=Windows
release=Windows-10-10.0.22631-SP0 python=3.10.11
16:39:21-659246 DEBUG Control Processor loading: id="Canny" class=CannyDetector
16:39:21-671150 DEBUG Control Processor loaded: id="Canny" class=CannyDetector time=0.02
16:39:27-950343 DEBUG Control T2I-Adapter model loading: id="Canny XL" path="TencentARC/t2i-adapter-canny-sdxl-1.0"
16:39:29-125376 DEBUG Control T2I-Adapter loaded: id="Canny XL" path="TencentARC/t2i-adapter-canny-sdxl-1.0"
time=1.18
16:39:30-159044 DEBUG Control input: type=PIL.Image input=[<PIL.Image.Image image mode=RGB size=1131x1131 at
0x1EF9B64B9D0>]
16:39:31-550636 DEBUG Control T2I-Adapter unit: i=1 process=Canny model=Canny XL strength=1.0 factor=1.0
16:39:31-816987 DEBUG Control T2I-Adapter pipeline: class=StableDiffusionXLAdapterPipeline time=0.26
16:39:33-646000 DEBUG Setting model: enable model CPU offload
16:39:35-535778 DEBUG Setting model: enable VAE slicing
16:39:35-549172 DEBUG Image resize: input=<PIL.Image.Image image mode=RGB size=1131x1131 at 0x1EF9B64B9D0> mode=1
target=1024x1024 upscaler=ESRGAN 4xNMKDSuperscale_4xNMKDSuperscale function=control_run
16:39:35-594308 DEBUG Control Processor: id="Canny" mode=L args={'low_threshold': 100, 'high_threshold': 200}
time=0.02
16:39:35-654376 WARNING Pipeline class change failed: type=DiffusersTaskType.IMAGE_2_IMAGE
pipeline=StableDiffusionXLAdapterPipeline AutoPipeline can't find a pipeline linked to
StableDiffusionXLAdapterPipeline for None
Loading model: C:\ai\automatic\models\Lora\cydohd-mixed-sdxl-v13-civitai.safetensors ━━━━━━━━━━━━━━ 0.0/228.5 MB -:--:--
Loading model: C:\ai\automatic\models\Lora\cydohd-mixed-sdxl-v10-civitai.safetensors ━━━━━━━━━━━━━━ 0.0/228.5 MB -:--:--
Loading model: C:\ai\automatic\models\Lora\sacbf-mixed-sdxl-civitai-v3.safetensors ━━━━━━━━━━━━━━━━ 0.0/228.5 MB -:--:--
Loading model: C:\ai\automatic\models\Lora\sacbf-sacdalle-sdxl-civitai-v2.safetensors ━━━━━━━━━━━━━ 0.0/228.5 MB -:--:--
16:39:40-912154 INFO LoRA apply: ['cydohd-mixed-sdxl-v13-civitai', 'cydohd-mixed-sdxl-v10-civitai',
'sacbf-mixed-sdxl-civitai-v3', 'sacbf-sacdalle-sdxl-civitai-v2'] patch=0.00 load=5.25
16:39:40-915131 INFO Base: class=StableDiffusionXLAdapterPipeline
16:39:42-257427 DEBUG Diffuser pipeline: StableDiffusionXLAdapterPipeline task=DiffusersTaskType.TEXT_2_IMAGE
set={'prompt_embeds': torch.Size([1, 77, 2048]), 'pooled_prompt_embeds': torch.Size([1, 1280]),
'negative_prompt_embeds': torch.Size([1, 77, 2048]), 'negative_pooled_prompt_embeds':
torch.Size([1, 1280]), 'guidance_scale': 1, 'generator': device(type='cuda'),
'num_inference_steps': 6, 'eta': 1.0, 'guidance_rescale': 0.7, 'denoising_end': None,
'output_type': 'latent', 'width': 1024, 'height': 1024, 'adapter_conditioning_scale': 1.0,
'image': [<PIL.Image.Image image mode=L size=1024x1024 at 0x1EF9B654F40>], 'parser': 'Full
parser'}
16:39:42-334307 DEBUG Sampler: sampler="DPM SDE" config={'num_train_timesteps': 1000, 'beta_start': 0.00085,
'beta_end': 0.012, 'beta_schedule': 'scaled_linear', 'prediction_type': 'epsilon',
'use_karras_sigmas': True}
16:39:42-952182 ERROR Processing: args={'prompt_embeds': tensor([[[0., 0., 0., ..., 0., 0., 0.],
[0., 0., 0., ..., 0., 0., 0.],
[0., 0., 0., ..., 0., 0., 0.],
...,
[0., 0., 0., ..., 0., 0., 0.],
[0., 0., 0., ..., 0., 0., 0.],
[0., 0., 0., ..., 0., 0., 0.]]], device='cuda:0',
dtype=torch.float16), 'pooled_prompt_embeds': tensor([[-1.1396, 0.0978, 0.5444, ...,
-1.0469, -2.2773, -0.4849]],
device='cuda:0', dtype=torch.float16), 'negative_prompt_embeds': tensor([[[-1.8027,
-0.7866, 1.5137, ..., 0.0922, 0.3796, -0.2493],
[-0.0142, 0.1144, -0.0961, ..., 0.2178, 0.1851, -0.0518],
[-0.0049, -0.1482, -0.2539, ..., 0.1390, 0.0563, 0.2590],
...,
[-0.3384, 0.2327, -0.7563, ..., 0.4963, 0.3696, 1.2988],
[-0.3381, 0.2325, -0.7515, ..., 0.4409, 0.2812, 1.3398],
[-0.3186, 0.2390, -0.7334, ..., 0.5625, 0.3701, 1.4365]]],
device='cuda:0', dtype=torch.float16), 'negative_pooled_prompt_embeds':
tensor([[-0.3142, 0.1411, 0.8540, ..., -1.2842, -1.6533, 0.5464]],
device='cuda:0', dtype=torch.float16), 'guidance_scale': 1, 'generator':
[<torch._C.Generator object at 0x000001EF98E858B0>], 'callback_steps': 1, 'callback': <function
process_diffusers.<locals>.diffusers_callback_legacy at 0x000001EDA5F01120>,
'num_inference_steps': 6, 'eta': 1.0, 'guidance_rescale': 0.7, 'denoising_end': None,
'output_type': 'latent', 'width': 1024, 'height': 1024, 'adapter_conditioning_scale': 1.0,
'image': [<PIL.Image.Image image mode=L size=1024x1024 at 0x1EF9B654F40>]} Given groups=1,
weight of size [320, 768, 3, 3], expected input[1, 256, 64, 64] to have 768 channels, but got
256 channels instead
16:39:42-960066 ERROR Processing: RuntimeError
╭───────────────────────────────────────── Traceback (most recent call last) ──────────────────────────────────────────╮
│ C:\ai\automatic\modules\processing_diffusers.py:417 in process_diffusers │
│ │
│ 416 │ │ sd_models.move_model(shared.sd_model, devices.device) │
│ ❱ 417 │ │ output = shared.sd_model(**base_args) # pylint: disable=not-callable │
│ 418 │ │ if isinstance(output, dict): │
│ │
│ C:\ai\automatic\venv\lib\site-packages\torch\utils\_contextlib.py:115 in decorate_context │
│ │
│ 114 │ │ with ctx_factory(): │
│ ❱ 115 │ │ │ return func(*args, **kwargs) │
│ 116 │
│ │
│ C:\ai\automatic\venv\lib\site-packages\diffusers\pipelines\t2i_adapter\pipeline_stable_diffusion_xl_adapter.py:1137 │
│ │
│ 1136 │ │ else: │
│ ❱ 1137 │ │ │ adapter_state = self.adapter(adapter_input) │
│ 1138 │ │ │ for k, v in enumerate(adapter_state): │
│ │
│ C:\ai\automatic\venv\lib\site-packages\torch\nn\modules\module.py:1511 in _wrapped_call_impl │
│ │
│ 1510 │ │ else: │
│ ❱ 1511 │ │ │ return self._call_impl(*args, **kwargs) │
│ 1512 │
│ │
│ C:\ai\automatic\venv\lib\site-packages\torch\nn\modules\module.py:1520 in _call_impl │
│ │
│ 1519 │ │ │ │ or _global_forward_hooks or _global_forward_pre_hooks): │
│ ❱ 1520 │ │ │ return forward_call(*args, **kwargs) │
│ 1521 │
│ │
│ ... 5 frames hidden ... │
│ │
│ C:\ai\automatic\venv\lib\site-packages\torch\nn\modules\module.py:1511 in _wrapped_call_impl │
│ │
│ 1510 │ │ else: │
│ ❱ 1511 │ │ │ return self._call_impl(*args, **kwargs) │
│ 1512 │
│ │
│ C:\ai\automatic\venv\lib\site-packages\torch\nn\modules\module.py:1520 in _call_impl │
│ │
│ 1519 │ │ │ │ or _global_forward_hooks or _global_forward_pre_hooks): │
│ ❱ 1520 │ │ │ return forward_call(*args, **kwargs) │
│ 1521 │
│ │
│ C:\ai\automatic\extensions-builtin\Lora\networks.py:396 in network_Conv2d_forward │
│ │
│ 395 │ network_apply_weights(self) │
│ ❱ 396 │ return originals.Conv2d_forward(self, input) │
│ 397 │
│ │
│ C:\ai\automatic\venv\lib\site-packages\torch\nn\modules\conv.py:460 in forward │
│ │
│ 459 │ def forward(self, input: Tensor) -> Tensor: │
│ ❱ 460 │ │ return self._conv_forward(input, self.weight, self.bias) │
│ 461 │
│ │
│ C:\ai\automatic\venv\lib\site-packages\torch\nn\modules\conv.py:456 in _conv_forward │
│ │
│ 455 │ │ │ │ │ │ │ _pair(0), self.dilation, self.groups) │
│ ❱ 456 │ │ return F.conv2d(input, weight, bias, self.stride, │
│ 457 │ │ │ │ │ │ self.padding, self.dilation, self.groups) │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
RuntimeError: Given groups=1, weight of size [320, 768, 3, 3], expected input[1, 256, 64, 64] to have 768 channels, but got 256 channels instead
16:39:43-446902 DEBUG Control restored pipeline: class=StableDiffusionXLPipeline
16:39:43-449878 INFO Processed: images=0 time=7.80 its=0.00 memory={'ram': {'used': 20.38, 'total': 63.92}, 'gpu':
{'used': 8.57, 'total': 15.99}, 'retries': 0, 'oom': 0}
16:39:43-508902 INFO Control: pipeline units=1 process=1 time=11.96 init=0.27 proc=3.80 ctrl=7.89 outputs=0
16:39:43-510887 DEBUG Control restored pipeline: class=StableDiffusionXLPipeline
no, that looks different.
ok, should I open a separate issue?
yes pls
This seems relevant to this thread ("Cannot generate a cpu tensor from a generator of type cuda.")
Control Canny => XS Canny
06:16:41-945643 INFO Version: app=sd.next updated=2024-03-07 hash=9e93a63e
url=https://github.com/vladmandic/automatic/tree/dev
06:16:42-457057 INFO Latest published version: bc4b633e8de3b9392595982e41673177dde1333d 2024-03-07T02:13:20Z
06:16:42-470450 INFO Platform: arch=AMD64 cpu=Intel64 Family 6 Model 165 Stepping 5, GenuineIntel system=Windows
release=Windows-10-10.0.22631-SP0 python=3.10.11
06:10:56-057073 DEBUG Control input: type=PIL.Image input=[<PIL.Image.Image image mode=RGB size=862x862 at
0x15827F0D0F0>]
06:11:00-306309 DEBUG Control input: type=PIL.Image input=[<PIL.Image.Image image mode=RGB size=862x862 at
0x158294E4AC0>]
06:11:00-868288 DEBUG Control ControlNet-XS unit: i=1 process=Canny model=Canny strength=1.0 guess=False start=0
end=1
06:11:01-073632 DEBUG Control ControlNet-XS pipeline: class=StableDiffusionXLControlNetXSPipeline time=0.20
06:11:02-982710 DEBUG Setting model: enable model CPU offload
06:11:06-126280 DEBUG Setting model: enable VAE slicing
06:11:06-140168 DEBUG Image resize: input=<PIL.Image.Image image mode=RGB size=862x862 at 0x158294E4AC0> mode=1
target=1024x1024 upscaler=ESRGAN 4xNMKDSuperscale_4xNMKDSuperscale function=control_run
06:11:06-142649 DEBUG Upscaler cached: type=ESRGAN model=models\ESRGAN\4xNMKDSuperscale_4xNMKDSuperscale.pth
Upscaling ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 0:00:00 0:00:02
06:11:09-038407 DEBUG Control Processor: id="Canny" mode=RGB args={'low_threshold': 100, 'high_threshold': 200}
time=0.03
Loading model: C:\ai\automatic\models\Lora\cydohd-mixed-sdxl-v10-civitai.safetensors ━━━━━━━━━━━━━━ 0.0/228.5 MB -:--:--
Loading model: C:\ai\automatic\models\Lora\sacbf-mixed-sdxl-civitai-v3.safetensors ━━━━━━━━━━━━━━━━ 0.0/228.5 MB -:--:--
Loading model: C:\ai\automatic\models\Lora\sacbf-sacdalle-sdxl-civitai-v2.safetensors ━━━━━━━━━━━━━ 0.0/228.5 MB -:--:--
06:11:13-411533 INFO LoRA apply: ['cydohd-mixed-sdxl-v10-civitai', 'sacbf-mixed-sdxl-civitai-v3',
'sacbf-sacdalle-sdxl-civitai-v2'] patch=0.00 load=4.27
06:11:13-415501 INFO Base: class=StableDiffusionXLControlNetXSPipeline
06:11:14-907684 DEBUG Diffuser pipeline: StableDiffusionXLControlNetXSPipeline task=DiffusersTaskType.TEXT_2_IMAGE
set={'prompt_embeds': torch.Size([1, 77, 2048]), 'pooled_prompt_embeds': torch.Size([1, 1280]),
'negative_prompt_embeds': torch.Size([1, 77, 2048]), 'negative_pooled_prompt_embeds':
torch.Size([1, 1280]), 'guidance_scale': 1, 'generator': device(type='cuda'),
'num_inference_steps': 6, 'eta': 1.0, 'output_type': 'latent', 'width': 1024, 'height': 1024,
'image': [<PIL.Image.Image image mode=RGB size=1024x1024 at 0x1583A26E110>], 'parser': 'Full
parser'}
06:11:14-989747 DEBUG Sampler: sampler="DPM SDE" config={'num_train_timesteps': 1000, 'beta_start': 0.00085,
'beta_end': 0.012, 'beta_schedule': 'scaled_linear', 'prediction_type': 'epsilon',
'use_karras_sigmas': True}
06:11:15-072083 ERROR Processing: args={'prompt_embeds': tensor([[[0., 0., 0., ..., 0., 0., 0.],
[0., 0., 0., ..., 0., 0., 0.],
[0., 0., 0., ..., 0., 0., 0.],
...,
[0., 0., 0., ..., 0., 0., 0.],
[0., 0., 0., ..., 0., 0., 0.],
[0., 0., 0., ..., 0., 0., 0.]]], device='cuda:0',
dtype=torch.float16), 'pooled_prompt_embeds': tensor([[-0.8276, -0.1184, 0.1528, ...,
-0.6650, -1.8008, -0.8369]],
device='cuda:0', dtype=torch.float16), 'negative_prompt_embeds': tensor([[[-1.1641,
-0.4395, 1.1133, ..., 0.1495, 0.3523, -0.2423],
[-0.0290, -0.0774, -0.1122, ..., 0.2145, 0.1422, -0.0188],
[ 0.0709, -0.2002, -0.1653, ..., 0.1697, 0.0793, 0.1204],
...,
[-0.1682, -0.2725, -0.8081, ..., 0.2095, 0.1077, 1.2705],
[-0.1663, -0.2717, -0.8130, ..., 0.1311, 0.0339, 1.3057],
[-0.1633, -0.2549, -0.8037, ..., 0.2678, 0.1084, 1.4229]]],
device='cuda:0', dtype=torch.float16), 'negative_pooled_prompt_embeds':
tensor([[-0.1111, -0.0195, 0.8696, ..., -1.1689, -1.8076, 0.8428]],
device='cuda:0', dtype=torch.float16), 'guidance_scale': 1, 'generator':
[<torch._C.Generator object at 0x00000158299B5A50>], 'callback_steps': 1, 'callback': <function
process_diffusers.<locals>.diffusers_callback_legacy at 0x0000015611A820E0>,
'num_inference_steps': 6, 'eta': 1.0, 'output_type': 'latent', 'width': 1024, 'height': 1024,
'image': [<PIL.Image.Image image mode=RGB size=1024x1024 at 0x1583A26E110>]} Cannot generate a
cpu tensor from a generator of type cuda.
06:11:15-079523 ERROR Processing: ValueError
╭───────────────────────────────────────── Traceback (most recent call last) ──────────────────────────────────────────╮
│ C:\ai\automatic\modules\processing_diffusers.py:415 in process_diffusers │
│ │
│ 414 │ │ sd_models.move_model(shared.sd_model, devices.device) │
│ ❱ 415 │ │ output = shared.sd_model(**base_args) # pylint: disable=not-callable │
│ 416 │ │ if isinstance(output, dict): │
│ │
│ C:\ai\automatic\venv\lib\site-packages\torch\utils\_contextlib.py:115 in decorate_context │
│ │
│ 114 │ │ with ctx_factory(): │
│ ❱ 115 │ │ │ return func(*args, **kwargs) │
│ 116 │
│ │
│ C:\ai\automatic\modules\control\units\xs_pipe.py:933 in __call__ │
│ │
│ 932 │ │ num_channels_latents = self.unet.config.in_channels │
│ ❱ 933 │ │ latents = self.prepare_latents( │
│ 934 │ │ │ batch_size * num_images_per_prompt, │
│ │
│ C:\ai\automatic\modules\control\units\xs_pipe.py:619 in prepare_latents │
│ │
│ 618 │ │ if latents is None: │
│ ❱ 619 │ │ │ latents = randn_tensor(shape, generator=generator, device=device, dtype=dtype) │
│ 620 │ │ else: │
│ │
│ C:\ai\automatic\venv\lib\site-packages\diffusers\utils\torch_utils.py:66 in randn_tensor │
│ │
│ 65 │ │ elif gen_device_type != device.type and gen_device_type == "cuda": │
│ ❱ 66 │ │ │ raise ValueError(f"Cannot generate a {device} tensor from a generator of type {gen_device_type}.") │
│ 67 │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
ValueError: Cannot generate a cpu tensor from a generator of type cuda.
06:11:15-322729 DEBUG Control restored pipeline: class=StableDiffusionXLPipeline
06:11:15-325706 INFO Processed: images=0 time=6.19 its=0.00 memory={'ram': {'used': 8.48, 'total': 63.92}, 'gpu':
{'used': 2.89, 'total': 15.99}, 'retries': 0, 'oom': 0}
06:11:15-378779 INFO Control: pipeline units=1 process=1 time=14.51 init=0.21 proc=8.06 ctrl=6.24 outputs=0
06:11:15-380268 DEBUG Control restored pipeline: class=StableDiffusionXLPipeline
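For reference, this ValueError comes straight from diffusers' `randn_tensor`, which refuses a CUDA generator when asked for a CPU tensor; with CPU offload active, the pipeline's execution device apparently resolves to cpu while a cuda generator is still passed in (note `'generator': device(type='cuda')` in the log). A minimal repro:

```python
import torch
from diffusers.utils.torch_utils import randn_tensor

gen = torch.Generator(device="cuda")  # generator lives on the GPU
try:
    # asking for latents on the CPU with a CUDA generator raises
    randn_tensor((1, 4, 128, 128), generator=gen,
                 device=torch.device("cpu"), dtype=torch.float16)
except ValueError as e:
    print(e)  # "Cannot generate a cpu tensor from a generator of type cuda."
```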
And I get the same error after running ControlNet Canny XL twice: the first time it finishes the job, the second time it fails with the same error without me changing anything other than removing the LoRAs from the prompt.
Upon a third attempt, I also get this
06:28:04-324070 ERROR Model move: device=cuda It seems like you have activated sequential model offloading by calling
`enable_sequential_cpu_offload`, but are now attempting to move the pipeline to GPU. This is
not compatible with offloading. Please, move your pipeline `.to('cpu')` or consider removing
the move altogether if you use sequential offloading.
06:19:35-823273 DEBUG Control Processor loading: id="Canny" class=CannyDetector
06:19:35-824761 DEBUG Control Processor loaded: id="Canny" class=CannyDetector time=0.00
06:19:40-139866 DEBUG Control ControlNet model loading: id="Canny XL" path="diffusers/controlnet-canny-sdxl-1.0"
06:19:42-230859 DEBUG Control input: type=PIL.Image input=[<PIL.Image.Image image mode=RGB size=862x862 at
0x146A519D1E0>]
06:19:42-828542 DEBUG Control ControlNet unit: i=1 process=Canny model=None strength=1.0 guess=False start=0 end=1
06:19:54-874491 DEBUG Control ControlNet model loaded: id="Canny XL" path="diffusers/controlnet-canny-sdxl-1.0"
time=14.73
06:19:54-896315 DEBUG Control ControlNet pipeline: class=StableDiffusionXLControlNetPipeline time=12.07
06:19:57-290557 DEBUG Setting model: enable model CPU offload
06:20:00-091457 DEBUG Server: alive=True jobs=1 requests=21 uptime=178 memory=14.57/63.92 backend=Backend.DIFFUSERS
state=idle
06:20:01-151617 DEBUG Setting model: enable VAE slicing
06:20:01-164016 DEBUG Image resize: input=<PIL.Image.Image image mode=RGB size=862x862 at 0x146A519D1E0> mode=1
target=1024x1024 upscaler=ESRGAN 4xNMKDSuperscale_4xNMKDSuperscale function=control_run
06:20:01-244370 INFO Upscaler loaded: type=ESRGAN model=models\ESRGAN\4xNMKDSuperscale_4xNMKDSuperscale.pth
Upscaling ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 0:00:00 0:00:02
06:20:04-381914 DEBUG Control Processor: id="Canny" mode=RGB args={'low_threshold': 100, 'high_threshold': 200}
time=0.03
Loading model: C:\ai\automatic\models\Lora\cydohd-mixed-sdxl-v10-civitai.safetensors ━━━━━━━━━━━━━━ 0.0/228.5 MB -:--:--
Loading model: C:\ai\automatic\models\Lora\sacbf-mixed-sdxl-civitai-v3.safetensors ━━━━━━━━━━━━━━━━ 0.0/228.5 MB -:--:--
Loading model: C:\ai\automatic\models\Lora\sacbf-sacdalle-sdxl-civitai-v2.safetensors ━━━━━━━━━━━━━ 0.0/228.5 MB -:--:--
06:20:07-809805 INFO LoRA apply: ['cydohd-mixed-sdxl-v10-civitai', 'sacbf-mixed-sdxl-civitai-v3',
'sacbf-sacdalle-sdxl-civitai-v2'] patch=0.00 load=3.32
06:20:07-819726 INFO Base: class=StableDiffusionXLControlNetPipeline
06:20:10-267062 DEBUG Diffuser pipeline: StableDiffusionXLControlNetPipeline task=DiffusersTaskType.TEXT_2_IMAGE
set={'prompt_embeds': torch.Size([1, 77, 2048]), 'pooled_prompt_embeds': torch.Size([1, 1280]),
'negative_prompt_embeds': torch.Size([1, 77, 2048]), 'negative_pooled_prompt_embeds':
torch.Size([1, 1280]), 'guidance_scale': 1, 'generator': device(type='cuda'),
'num_inference_steps': 6, 'eta': 1.0, 'output_type': 'latent', 'width': 1024, 'height': 1024,
'controlnet_conditioning_scale': 1.0, 'control_guidance_start': 0.0, 'control_guidance_end':
1.0, 'guess_mode': False, 'image': [<PIL.Image.Image image mode=RGB size=1024x1024 at
0x146851EBEB0>], 'parser': 'Full parser'}
06:20:10-369736 DEBUG Sampler: sampler="DPM SDE" config={'num_train_timesteps': 1000, 'beta_start': 0.00085,
'beta_end': 0.012, 'beta_schedule': 'scaled_linear', 'prediction_type': 'epsilon',
'use_karras_sigmas': True}
Progress 6.59s/it █████▊ 17% 1/6 00:06 00:32 Base
06:20:17-285355 DEBUG VAE load: type=taesd model=models\TAESD\taesdxl_decoder.pth
Progress 1.44s/it ███████████████████████████████████ 100% 6/6 00:08 00:00 Base
06:20:26-149173 DEBUG Control restored pipeline: class=StableDiffusionXLPipeline
06:20:26-170006 INFO Processed: images=1 time=21.68 its=0.28 memory={'ram': {'used': 18.74, 'total': 63.92}, 'gpu':
{'used': 2.4, 'total': 15.99}, 'retries': 0, 'oom': 0}
06:20:26-999554 DEBUG Saving temp: image="C:\Users\sebas\AppData\Local\Temp\gradio\tmpm6rvbxie.png"
resolution=1024x1024 size=1636463
06:20:27-141908 INFO Control: pipeline units=1 process=1 time=43.39 init=12.07 proc=9.58 ctrl=21.74 outputs=1
06:20:27-144387 DEBUG Control restored pipeline: class=StableDiffusionXLPipeline
06:20:41-587360 DEBUG Control input: type=PIL.Image input=[<PIL.Image.Image image mode=RGB size=862x862 at
0x1491C3EAEF0>]
06:20:42-160734 DEBUG Control ControlNet unit: i=1 process=Canny model=Canny XL strength=1.0 guess=False start=0
end=1
06:20:44-573006 DEBUG Control ControlNet pipeline: class=StableDiffusionXLControlNetPipeline time=2.41
06:20:44-642446 DEBUG Setting model: enable model CPU offload
06:20:47-733728 DEBUG Setting model: enable VAE slicing
06:20:47-746623 DEBUG Image resize: input=<PIL.Image.Image image mode=RGB size=862x862 at 0x1491C3EAEF0> mode=1
target=1024x1024 upscaler=ESRGAN 4xNMKDSuperscale_4xNMKDSuperscale function=control_run
06:20:47-749104 DEBUG Upscaler cached: type=ESRGAN model=models\ESRGAN\4xNMKDSuperscale_4xNMKDSuperscale.pth
Upscaling ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 0:00:00 0:00:02
06:20:49-983096 DEBUG Control Processor: id="Canny" mode=RGB args={'low_threshold': 100, 'high_threshold': 200}
time=0.03
06:20:50-092712 INFO Base: class=StableDiffusionXLControlNetPipeline
06:20:51-192517 DEBUG Diffuser pipeline: StableDiffusionXLControlNetPipeline task=DiffusersTaskType.TEXT_2_IMAGE
set={'prompt_embeds': torch.Size([1, 77, 2048]), 'pooled_prompt_embeds': torch.Size([1, 1280]),
'negative_prompt_embeds': torch.Size([1, 77, 2048]), 'negative_pooled_prompt_embeds':
torch.Size([1, 1280]), 'guidance_scale': 1, 'generator': device(type='cuda'),
'num_inference_steps': 6, 'eta': 1.0, 'output_type': 'latent', 'width': 1024, 'height': 1024,
'controlnet_conditioning_scale': 1.0, 'control_guidance_start': 0.0, 'control_guidance_end':
1.0, 'guess_mode': False, 'image': [<PIL.Image.Image image mode=RGB size=1024x1024 at
0x1491C3F17B0>], 'parser': 'Full parser'}
06:20:51-275348 DEBUG Sampler: sampler="DPM SDE" config={'num_train_timesteps': 1000, 'beta_start': 0.00085,
'beta_end': 0.012, 'beta_schedule': 'scaled_linear', 'prediction_type': 'epsilon',
'use_karras_sigmas': True}
06:20:51-320486 ERROR Processing: args={'prompt_embeds': tensor([[[-3.8086, -2.2227, 4.2500, ..., 0.1782, 0.4062,
-0.2651],
[-0.3682, 0.9868, -0.2249, ..., 0.6235, 0.0654, -0.1235],
[-0.9121, 0.3735, 0.5811, ..., 0.8818, -0.1223, -0.1892],
...,
[-0.1880, 0.3547, -0.4631, ..., 0.3096, 0.7734, 1.5244],
[-0.1674, 0.3381, -0.4482, ..., 0.2854, 0.6855, 1.5449],
[-0.1707, 0.3345, -0.3923, ..., 0.2959, 0.8662, 1.6211]]],
device='cuda:0', dtype=torch.float16), 'pooled_prompt_embeds': tensor([[-1.0850,
0.5771, 0.3389, ..., -1.0879, -1.3359, -0.7358]],
device='cuda:0', dtype=torch.float16), 'negative_prompt_embeds': tensor([[[-3.8086,
-2.2227, 4.2500, ..., 0.1782, 0.4062, -0.2651],
[-0.2634, -0.4390, -0.5625, ..., 0.4780, -0.5508, 0.8066],
[-0.3071, -0.4297, -0.5918, ..., -0.2783, 0.1602, -0.1042],
...,
[-0.1926, -0.2605, -0.8789, ..., 0.3950, 0.3237, 0.9126],
[-0.2051, -0.2615, -0.8726, ..., 0.3269, 0.2180, 0.9473],
[-0.2271, -0.1841, -0.8457, ..., 0.3677, 0.3582, 0.9487]]],
device='cuda:0', dtype=torch.float16), 'negative_pooled_prompt_embeds':
tensor([[-0.4524, 0.8706, -0.2712, ..., -0.8403, -0.7446, 0.7886]],
device='cuda:0', dtype=torch.float16), 'guidance_scale': 1, 'generator':
[<torch._C.Generator object at 0x000001491C408530>], 'callback_on_step_end': <function
process_diffusers.<locals>.diffusers_callback at 0x000001491C3F8B80>,
'callback_on_step_end_tensor_inputs': ['latents', 'prompt_embeds', 'negative_prompt_embeds'],
'num_inference_steps': 6, 'eta': 1.0, 'output_type': 'latent', 'width': 1024, 'height': 1024,
'controlnet_conditioning_scale': 1.0, 'control_guidance_start': 0.0, 'control_guidance_end':
1.0, 'guess_mode': False, 'image': [<PIL.Image.Image image mode=RGB size=1024x1024 at
0x1491C3F17B0>]} Cannot generate a cpu tensor from a generator of type cuda.
06:20:51-330405 ERROR Processing: ValueError
╭───────────────────────────────────────── Traceback (most recent call last) ──────────────────────────────────────────╮
│ C:\ai\automatic\modules\processing_diffusers.py:415 in process_diffusers │
│ │
│ 414 │ │ sd_models.move_model(shared.sd_model, devices.device) │
│ ❱ 415 │ │ output = shared.sd_model(**base_args) # pylint: disable=not-callable │
│ 416 │ │ if isinstance(output, dict): │
│ │
│ C:\ai\automatic\venv\lib\site-packages\torch\utils\_contextlib.py:115 in decorate_context │
│ │
│ 114 │ │ with ctx_factory(): │
│ ❱ 115 │ │ │ return func(*args, **kwargs) │
│ 116 │
│ │
│ C:\ai\automatic\venv\lib\site-packages\diffusers\pipelines\controlnet\pipeline_controlnet_sd_xl.py:1262 in __call__ │
│ │
│ 1261 │ │ num_channels_latents = self.unet.config.in_channels │
│ ❱ 1262 │ │ latents = self.prepare_latents( │
│ 1263 │ │ │ batch_size * num_images_per_prompt, │
│ │
│ C:\ai\automatic\venv\lib\site-packages\diffusers\pipelines\controlnet\pipeline_controlnet_sd_xl.py:816 in prepare_la │
│ │
│ 815 │ │ if latents is None: │
│ ❱ 816 │ │ │ latents = randn_tensor(shape, generator=generator, device=device, dtype=dtype) │
│ 817 │ │ else: │
│ │
│ C:\ai\automatic\venv\lib\site-packages\diffusers\utils\torch_utils.py:66 in randn_tensor │
│ │
│ 65 │ │ elif gen_device_type != device.type and gen_device_type == "cuda": │
│ ❱ 66 │ │ │ raise ValueError(f"Cannot generate a {device} tensor from a generator of type {gen_device_type}.") │
│ 67 │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
ValueError: Cannot generate a cpu tensor from a generator of type cuda.
06:20:51-860136 DEBUG Control restored pipeline: class=StableDiffusionXLPipeline
06:20:51-861624 INFO High memory utilization: GPU=81% RAM=30% {'ram': {'used': 19.42, 'total': 63.92}, 'gpu':
{'used': 12.91, 'total': 15.99}, 'retries': 0, 'oom': 0}
06:20:52-181570 DEBUG GC: collected=218 device=cuda {'ram': {'used': 19.4, 'total': 63.92}, 'gpu': {'used': 3.07,
'total': 15.99}, 'retries': 0, 'oom': 0} time=0.32
06:20:52-184051 INFO Processed: images=0 time=2.09 its=0.00 memory={'ram': {'used': 19.4, 'total': 63.92}, 'gpu':
{'used': 3.07, 'total': 15.99}, 'retries': 0, 'oom': 0}
06:20:52-221271 INFO Control: pipeline units=1 process=1 time=10.06 init=2.41 proc=5.51 ctrl=2.14 outputs=0
06:20:52-223752 DEBUG Control restored pipeline: class=StableDiffusionXLPipeline
06:28:00-585456 DEBUG Control input: type=PIL.Image input=[<PIL.Image.Image image mode=RGB size=540x540 at
0x1488A1D3B20>]
06:28:02-994759 DEBUG Control input: type=PIL.Image input=[<PIL.Image.Image image mode=RGB size=540x540 at
0x1488A1D1FF0>]
06:28:03-527493 DEBUG Control ControlNet unit: i=1 process=Canny model=Canny XL strength=1.0 guess=False start=0
end=1
06:28:04-319608 DEBUG Control ControlNet pipeline: class=StableDiffusionXLControlNetPipeline time=0.79
06:28:04-324070 ERROR Model move: device=cuda It seems like you have activated sequential model offloading by calling
`enable_sequential_cpu_offload`, but are now attempting to move the pipeline to GPU. This is
not compatible with offloading. Please, move your pipeline `.to('cpu')` or consider removing
the move altogether if you use sequential offloading.
06:28:04-355389 DEBUG Setting model: enable model CPU offload
06:28:04-420861 DEBUG Setting model: enable VAE slicing
06:28:04-433260 DEBUG Image resize: input=<PIL.Image.Image image mode=RGB size=540x540 at 0x1488A1D1FF0> mode=1
target=1024x1024 upscaler=ESRGAN 4xNMKDSuperscale_4xNMKDSuperscale function=control_run
06:28:04-435561 DEBUG Upscaler cached: type=ESRGAN model=models\ESRGAN\4xNMKDSuperscale_4xNMKDSuperscale.pth
Upscaling ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 0:00:00 0:00:01
06:28:05-795506 DEBUG Control Processor: id="Canny" mode=RGB args={'low_threshold': 100, 'high_threshold': 200}
time=0.03
06:28:05-871394 INFO Base: class=StableDiffusionXLControlNetPipeline
06:28:06-866215 DEBUG Diffuser pipeline: StableDiffusionXLControlNetPipeline task=DiffusersTaskType.TEXT_2_IMAGE
set={'prompt_embeds': torch.Size([1, 77, 2048]), 'pooled_prompt_embeds': torch.Size([1, 1280]),
'negative_prompt_embeds': torch.Size([1, 77, 2048]), 'negative_pooled_prompt_embeds':
torch.Size([1, 1280]), 'guidance_scale': 1, 'generator': device(type='cuda'),
'num_inference_steps': 6, 'eta': 1.0, 'output_type': 'latent', 'width': 1024, 'height': 1024,
'controlnet_conditioning_scale': 1.0, 'control_guidance_start': 0.0, 'control_guidance_end':
1.0, 'guess_mode': False, 'image': [<PIL.Image.Image image mode=RGB size=1024x1024 at
0x1488A1D2F50>], 'parser': 'Full parser'}
06:28:06-941279 DEBUG Sampler: sampler="DPM SDE" config={'num_train_timesteps': 1000, 'beta_start': 0.00085,
'beta_end': 0.012, 'beta_schedule': 'scaled_linear', 'prediction_type': 'epsilon',
'use_karras_sigmas': True}
06:28:06-986416 ERROR Processing: args={'prompt_embeds': tensor([[[-3.8086, -2.2227, 4.2500, ..., 0.1782, 0.4062,
-0.2651],
[-0.3682, 0.9868, -0.2249, ..., 0.6235, 0.0654, -0.1235],
[-0.9121, 0.3735, 0.5811, ..., 0.8818, -0.1223, -0.1892],
...,
[-0.1880, 0.3547, -0.4631, ..., 0.3096, 0.7734, 1.5244],
[-0.1674, 0.3381, -0.4482, ..., 0.2854, 0.6855, 1.5449],
[-0.1707, 0.3345, -0.3923, ..., 0.2959, 0.8662, 1.6211]]],
device='cuda:0', dtype=torch.float16), 'pooled_prompt_embeds': tensor([[-1.0850,
0.5771, 0.3389, ..., -1.0879, -1.3359, -0.7358]],
device='cuda:0', dtype=torch.float16), 'negative_prompt_embeds': tensor([[[-3.8086,
-2.2227, 4.2500, ..., 0.1782, 0.4062, -0.2651],
[-0.2634, -0.4390, -0.5625, ..., 0.4780, -0.5508, 0.8066],
[-0.3071, -0.4297, -0.5918, ..., -0.2783, 0.1602, -0.1042],
...,
[-0.1926, -0.2605, -0.8789, ..., 0.3950, 0.3237, 0.9126],
[-0.2051, -0.2615, -0.8726, ..., 0.3269, 0.2180, 0.9473],
[-0.2271, -0.1841, -0.8457, ..., 0.3677, 0.3582, 0.9487]]],
device='cuda:0', dtype=torch.float16), 'negative_pooled_prompt_embeds':
tensor([[-0.4524, 0.8706, -0.2712, ..., -0.8403, -0.7446, 0.7886]],
device='cuda:0', dtype=torch.float16), 'guidance_scale': 1, 'generator':
[<torch._C.Generator object at 0x000001491C475E70>], 'callback_on_step_end': <function
process_diffusers.<locals>.diffusers_callback at 0x000001491C3DECB0>,
'callback_on_step_end_tensor_inputs': ['latents', 'prompt_embeds', 'negative_prompt_embeds'],
'num_inference_steps': 6, 'eta': 1.0, 'output_type': 'latent', 'width': 1024, 'height': 1024,
'controlnet_conditioning_scale': 1.0, 'control_guidance_start': 0.0, 'control_guidance_end':
1.0, 'guess_mode': False, 'image': [<PIL.Image.Image image mode=RGB size=1024x1024 at
0x1488A1D2F50>]} Cannot generate a cpu tensor from a generator of type cuda.
06:28:06-992864 ERROR Processing: ValueError
╭───────────────────────────────────────── Traceback (most recent call last) ──────────────────────────────────────────╮
│ C:\ai\automatic\modules\processing_diffusers.py:415 in process_diffusers │
│ │
│ 414 │ │ sd_models.move_model(shared.sd_model, devices.device) │
│ ❱ 415 │ │ output = shared.sd_model(**base_args) # pylint: disable=not-callable │
│ 416 │ │ if isinstance(output, dict): │
│ │
│ C:\ai\automatic\venv\lib\site-packages\torch\utils\_contextlib.py:115 in decorate_context │
│ │
│ 114 │ │ with ctx_factory(): │
│ ❱ 115 │ │ │ return func(*args, **kwargs) │
│ 116 │
│ │
│ C:\ai\automatic\venv\lib\site-packages\diffusers\pipelines\controlnet\pipeline_controlnet_sd_xl.py:1262 in __call__ │
│ │
│ 1261 │ │ num_channels_latents = self.unet.config.in_channels │
│ ❱ 1262 │ │ latents = self.prepare_latents( │
│ 1263 │ │ │ batch_size * num_images_per_prompt, │
│ │
│ C:\ai\automatic\venv\lib\site-packages\diffusers\pipelines\controlnet\pipeline_controlnet_sd_xl.py:816 in prepare_la │
│ │
│ 815 │ │ if latents is None: │
│ ❱ 816 │ │ │ latents = randn_tensor(shape, generator=generator, device=device, dtype=dtype) │
│ 817 │ │ else: │
│ │
│ C:\ai\automatic\venv\lib\site-packages\diffusers\utils\torch_utils.py:66 in randn_tensor │
│ │
│ 65 │ │ elif gen_device_type != device.type and gen_device_type == "cuda": │
│ ❱ 66 │ │ │ raise ValueError(f"Cannot generate a {device} tensor from a generator of type {gen_device_type}.") │
│ 67 │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
ValueError: Cannot generate a cpu tensor from a generator of type cuda.
06:28:07-221520 DEBUG Control restored pipeline: class=StableDiffusionXLPipeline
06:28:07-223504 INFO Processed: images=0 time=1.36 its=0.00 memory={'ram': {'used': 18.73, 'total': 63.92}, 'gpu':
{'used': 2.87, 'total': 15.99}, 'retries': 0, 'oom': 0}
06:28:07-257729 INFO Control: pipeline units=1 process=1 time=3.73 init=0.79 proc=1.54 ctrl=1.40 outputs=0
06:28:07-259712 DEBUG Control restored pipeline: class=StableDiffusionXLPipeline
Are you on latest dev or master?
I am on dev; it was the latest version as of ~12h ago.
06:55:17-321287 INFO Version: app=sd.next updated=2024-03-07 hash=9e93a63e
url=https://github.com/vladmandic/automatic/tree/dev
made a few more improvements just now, still a ways to go. if/when the problem occurs now, it can typically be worked around by unloading the controlnet model and reloading it.
Issue Description
Tried to use canny control; image generation fails with `ValueError: It seems like you have activated sequential model offloading by calling enable_sequential_cpu_offload, but are now attempting to move the pipeline to GPU.`
Version Platform Description
PS C:\ai\automatic> .\webui.bat --debug --medvram --backend diffusers
Using VENV: C:\ai\automatic\venv
08:23:25-128721 INFO Starting SD.Next
08:23:25-131713 INFO Logger: file="C:\ai\automatic\sdnext.log" level=DEBUG size=65 mode=create
08:23:25-133708 INFO Python 3.10.11 on Windows
08:23:25-321193 INFO Version: app=sd.next updated=2024-02-13 hash=635c0715
                     url=https://github.com/vladmandic/automatic/tree/dev
08:23:26-747991 INFO Latest published version: 3c952675fefd2c94b817940ffbd4cd94fd5876c9 2024-02-10T10:42:56Z
08:23:26-760930 INFO Platform: arch=AMD64 cpu=Intel64 Family 6 Model 165 Stepping 5, GenuineIntel system=Windows
                     release=Windows-10-10.0.22631-SP0 python=3.10.11
Relevant log output
Backend
Diffusers
Branch
Dev
Model
SD-XL
Acknowledgements