vladmandic / automatic

SD.Next: Advanced Implementation of Stable Diffusion and other Diffusion-based generative image models
https://github.com/vladmandic/automatic
GNU Affero General Public License v3.0

[Feature]: Make control module compatible with offloading #2859

Closed · SAC020 closed 8 months ago

SAC020 commented 8 months ago

Issue Description

Tried to use Canny control; image generation fails with:

Input type (torch.FloatTensor) and weight type (torch.cuda.FloatTensor) should be the same or input should be a MKLDNN tensor and weight is a dense tensor


Version Platform Description

PS C:\ai\automatic> .\webui.bat --debug --medvram --backend diffusers
Using VENV: C:\ai\automatic\venv
08:23:25-128721 INFO     Starting SD.Next
08:23:25-131713 INFO     Logger: file="C:\ai\automatic\sdnext.log" level=DEBUG size=65 mode=create
08:23:25-133708 INFO     Python 3.10.11 on Windows
08:23:25-321193 INFO     Version: app=sd.next updated=2024-02-13 hash=635c0715
                         url=https://github.com/vladmandic/automatic/tree/dev
08:23:26-747991 INFO     Latest published version: 3c952675fefd2c94b817940ffbd4cd94fd5876c9 2024-02-10T10:42:56Z
08:23:26-760930 INFO     Platform: arch=AMD64 cpu=Intel64 Family 6 Model 165 Stepping 5, GenuineIntel system=Windows
                         release=Windows-10-10.0.22631-SP0 python=3.10.11

Relevant log output

Terminate batch job (Y/N)? y
PS C:\ai\automatic> .\webui.bat --debug --medvram --backend diffusers
Using VENV: C:\ai\automatic\venv
08:23:25-128721 INFO     Starting SD.Next
08:23:25-131713 INFO     Logger: file="C:\ai\automatic\sdnext.log" level=DEBUG size=65 mode=create
08:23:25-133708 INFO     Python 3.10.11 on Windows
08:23:25-321193 INFO     Version: app=sd.next updated=2024-02-13 hash=635c0715
                         url=https://github.com/vladmandic/automatic/tree/dev
08:23:26-747991 INFO     Latest published version: 3c952675fefd2c94b817940ffbd4cd94fd5876c9 2024-02-10T10:42:56Z
08:23:26-760930 INFO     Platform: arch=AMD64 cpu=Intel64 Family 6 Model 165 Stepping 5, GenuineIntel system=Windows
                         release=Windows-10-10.0.22631-SP0 python=3.10.11
08:23:26-763345 DEBUG    Setting environment tuning
08:23:26-764346 DEBUG    HF cache folder: C:\Users\sebas\.cache\huggingface\hub
08:23:26-765342 DEBUG    Torch overrides: cuda=False rocm=False ipex=False diml=False openvino=False
08:23:26-766340 DEBUG    Torch allowed: cuda=True rocm=True ipex=True diml=True openvino=True
08:23:26-768335 INFO     nVidia CUDA toolkit detected: nvidia-smi present
08:23:26-871471 DEBUG    Repository update time: Wed Feb 14 04:24:19 2024
08:23:26-872502 INFO     Startup: standard
08:23:26-873469 INFO     Verifying requirements
08:23:26-887431 INFO     Verifying packages
08:23:26-889426 INFO     Verifying submodules
08:23:30-190024 DEBUG    Submodule: extensions-builtin/sd-extension-chainner / main
08:23:30-271766 DEBUG    Submodule: extensions-builtin/sd-extension-system-info / main
08:23:30-347047 DEBUG    Submodule: extensions-builtin/sd-webui-agent-scheduler / main
08:23:30-426726 DEBUG    Submodule: extensions-builtin/sd-webui-controlnet / main
08:23:30-545972 DEBUG    Submodule: extensions-builtin/stable-diffusion-webui-images-browser / main
08:23:30-626021 DEBUG    Submodule: extensions-builtin/stable-diffusion-webui-rembg / master
08:23:30-704351 DEBUG    Submodule: modules/k-diffusion / master
08:23:30-783723 DEBUG    Submodule: wiki / master
08:23:30-833099 DEBUG    Register paths
08:23:31-085425 DEBUG    Installed packages: 246
08:23:31-086422 DEBUG    Extensions all: ['Lora', 'sd-extension-chainner', 'sd-extension-system-info',
                         'sd-webui-agent-scheduler', 'sd-webui-controlnet', 'stable-diffusion-webui-images-browser',
                         'stable-diffusion-webui-rembg']
08:23:31-590274 DEBUG    Running extension installer:
                         C:\ai\automatic\extensions-builtin\sd-extension-system-info\install.py
08:23:32-251645 DEBUG    Running extension installer:
                         C:\ai\automatic\extensions-builtin\sd-webui-agent-scheduler\install.py
08:23:32-910725 DEBUG    Running extension installer: C:\ai\automatic\extensions-builtin\sd-webui-controlnet\install.py
08:23:33-572258 DEBUG    Running extension installer:
                         C:\ai\automatic\extensions-builtin\stable-diffusion-webui-images-browser\install.py
08:23:34-230587 DEBUG    Running extension installer:
                         C:\ai\automatic\extensions-builtin\stable-diffusion-webui-rembg\install.py
08:23:34-910179 DEBUG    Extensions all: []
08:23:34-911176 INFO     Extensions enabled: ['Lora', 'sd-extension-chainner', 'sd-extension-system-info',
                         'sd-webui-agent-scheduler', 'sd-webui-controlnet', 'stable-diffusion-webui-images-browser',
                         'stable-diffusion-webui-rembg']
08:23:34-913171 INFO     Verifying requirements
08:23:34-927134 DEBUG    Setup complete without errors: 1707891815
08:23:34-934115 INFO     Extension preload: {'extensions-builtin': 0.0, 'extensions': 0.0}
08:23:34-936109 DEBUG    Starting module: <module 'webui' from 'C:\\ai\\automatic\\webui.py'>
08:23:34-938103 INFO     Command line args: ['--debug', '--medvram', '--backend', 'diffusers'] medvram=True
                         backend=diffusers debug=True
08:23:34-939102 DEBUG    Env flags: []
08:23:39-968332 DEBUG    Package not found: olive-ai
08:23:41-194322 INFO     Load packages: {'torch': '2.2.0+cu121', 'diffusers': '0.26.3', 'gradio': '3.43.2'}
08:23:42-115866 DEBUG    Read: file="config.json" json=28 bytes=1232 time=0.000
08:23:42-119855 INFO     Engine: backend=Backend.DIFFUSERS compute=cuda device=cuda attention="Scaled-Dot-Product"
                         mode=no_grad
08:23:42-186677 INFO     Device: device=NVIDIA GeForce RTX 4080 n=1 arch=sm_90 cap=(8, 9) cuda=12.1 cudnn=8801
                         driver=551.23
08:23:43-415715 DEBUG    ONNX: version=1.17.0 provider=CUDAExecutionProvider, available=['TensorrtExecutionProvider',
                         'CUDAExecutionProvider', 'CPUExecutionProvider']
08:23:43-543211 DEBUG    Importing LDM
08:23:43-566194 DEBUG    Entering start sequence
08:23:43-569186 DEBUG    Initializing
08:23:43-594345 INFO     Available VAEs: path="models\VAE" items=0
08:23:43-596340 INFO     Disabled extensions: ['sd-webui-controlnet']
08:23:43-597337 DEBUG    Scanning diffusers cache: ['models\\Diffusers'] items=0 time=0.00
08:23:43-599332 DEBUG    Read: file="cache.json" json=1 bytes=542 time=0.000
08:23:43-603321 DEBUG    Read: file="metadata.json" json=132 bytes=391502 time=0.002
08:23:43-605316 INFO     Available models: path="models\Stable-diffusion" items=3 time=0.01
08:23:43-686236 DEBUG    Load extensions
08:23:43-842559 INFO     Extension: script='extensions-builtin\Lora\scripts\lora_script.py'
                         08:23:43-839568 INFO     LoRA networks: available=66 folders=7
08:23:44-173018 INFO     Extension: script='extensions-builtin\sd-webui-agent-scheduler\scripts\task_scheduler.py' Using
                         sqlite file: extensions-builtin\sd-webui-agent-scheduler\task_scheduler.sqlite3
08:23:44-437823 INFO     Extensions init time: 0.75 img2imgalt.py=0.11 sd-webui-agent-scheduler=0.29
                         stable-diffusion-webui-images-browser=0.25
08:23:44-453875 DEBUG    Read: file="html/upscalers.json" json=4 bytes=2672 time=0.000
08:23:44-456750 DEBUG    Read: file="extensions-builtin\sd-extension-chainner\models.json" json=24 bytes=2719 time=0.001
08:23:44-458928 DEBUG    chaiNNer models: path="models\chaiNNer" defined=24 discovered=0 downloaded=1
08:23:44-460923 DEBUG    Upscaler type=ESRGAN folder="models\ESRGAN" model="4xNMKDSuperscale_4xNMKDSuperscale"
                         path="models\ESRGAN\4xNMKDSuperscale_4xNMKDSuperscale.pth"
08:23:44-463915 DEBUG    Load upscalers: total=53 downloaded=2 user=1 time=0.02 ['None', 'Lanczos', 'Nearest',
                         'ChaiNNer', 'ESRGAN', 'LDSR', 'RealESRGAN', 'SCUNet', 'SD', 'SwinIR']
08:23:44-486051 DEBUG    Load styles: folder="models\styles" items=288 time=0.02
08:23:44-490778 DEBUG    Creating UI
08:23:44-492775 INFO     UI theme: name="black-teal" style=Auto base=sdnext.css
08:23:44-502060 DEBUG    UI initialize: txt2img
08:23:44-646246 DEBUG    List items: function=create_items
08:23:44-652033 DEBUG    Read: file="html\reference.json" json=36 bytes=19033 time=0.001
08:23:44-696914 DEBUG    Extra networks: page='model' items=39 subfolders=2 tab=txt2img
                         folders=['models\\Stable-diffusion', 'models\\Diffusers', 'models\\Reference'] list=0.04
                         thumb=0.01 desc=0.00 info=0.00 workers=4
08:23:44-727859 DEBUG    Extra networks: page='style' items=288 subfolders=1 tab=txt2img folders=['models\\styles',
                         'html'] list=0.04 thumb=0.00 desc=0.00 info=0.00 workers=4
08:23:44-729810 DEBUG    Extra networks: page='embedding' items=0 subfolders=0 tab=txt2img
                         folders=['models\\embeddings'] list=0.00 thumb=0.00 desc=0.00 info=0.00 workers=4
08:23:44-732255 DEBUG    Extra networks: page='hypernetwork' items=0 subfolders=0 tab=txt2img
                         folders=['models\\hypernetworks'] list=0.00 thumb=0.00 desc=0.00 info=0.00 workers=4
08:23:44-734279 DEBUG    Extra networks: page='vae' items=0 subfolders=0 tab=txt2img folders=['models\\VAE'] list=0.00
                         thumb=0.00 desc=0.00 info=0.00 workers=4
08:23:44-743480 DEBUG    Extra networks: page='lora' items=66 subfolders=0 tab=txt2img folders=['models\\Lora',
                         'models\\LyCORIS'] list=0.04 thumb=0.01 desc=0.03 info=0.05 workers=4
08:23:44-815258 DEBUG    UI initialize: img2img
08:23:45-006658 DEBUG    UI initialize: control models=models\control
08:23:45-248622 DEBUG    Read: file="ui-config.json" json=0 bytes=2 time=0.001
08:23:45-335444 DEBUG    Themes: builtin=11 gradio=5 huggingface=55
08:23:46-017092 INFO     Extension list is empty: refresh required
08:23:46-935370 DEBUG    Extension list: processed=7 installed=7 enabled=6 disabled=1 visible=7 hidden=0
08:23:47-105512 DEBUG    Root paths: ['C:\\ai\\automatic']
08:23:47-201018 INFO     Local URL: http://127.0.0.1:7860/
08:23:47-202697 DEBUG    Gradio functions: registered=2062
08:23:47-202697 INFO     Initializing middleware
08:23:47-206714 DEBUG    Creating API
08:23:47-381248 INFO     [AgentScheduler] Task queue is empty
08:23:47-383033 INFO     [AgentScheduler] Registering APIs
08:23:47-698611 DEBUG    Scripts setup: ['IP Adapters:0.013', 'AnimateDiff:0.008', 'X/Y/Z Grid:0.01', 'Face:0.012']
08:23:47-700313 DEBUG    Model metadata: file="metadata.json" no changes
08:23:47-702450 DEBUG    Model requested: fn=<lambda>
08:23:47-703450 INFO     Select: model="copaxTimelessxlSDXL1_v9 [c967070428]"
08:23:47-705444 DEBUG    Load model: existing=False
                         target=C:\ai\automatic\models\Stable-diffusion\copaxTimelessxlSDXL1_v9.safetensors info=None
08:23:47-736640 DEBUG    Desired Torch parameters: dtype=FP16 no-half=False no-half-vae=False upscast=False
08:23:47-737638 INFO     Setting Torch parameters: device=cuda dtype=torch.float16 vae=torch.float16 unet=torch.float16
                         context=no_grad fp16=True bf16=None
08:23:47-739632 DEBUG    Diffusers loading:
                         path="C:\ai\automatic\models\Stable-diffusion\copaxTimelessxlSDXL1_v9.safetensors"
08:23:47-740629 INFO     Autodetect: model="Stable Diffusion XL" class=StableDiffusionXLPipeline
                         file="C:\ai\automatic\models\Stable-diffusion\copaxTimelessxlSDXL1_v9.safetensors" size=6617MB
08:24:01-276029 DEBUG    Setting model: pipeline=StableDiffusionXLPipeline config={'low_cpu_mem_usage': True,
                         'torch_dtype': torch.float16, 'load_connected_pipeline': True, 'extract_ema': True,
                         'use_safetensors': True}
08:24:01-284007 DEBUG    Setting model: enable model CPU offload
08:24:01-300961 DEBUG    Setting model: enable VAE slicing
08:24:01-317916 INFO     Load embeddings: loaded=0 skipped=0 time=0.00
08:24:01-577223 DEBUG    GC: collected=2592 device=cuda {'ram': {'used': 14.13, 'total': 31.92}, 'gpu': {'used': 1.33,
                         'total': 15.99}, 'retries': 0, 'oom': 0} time=0.26
08:24:01-586199 INFO     Load model: time=13.61 load=13.61 native=1024 {'ram': {'used': 14.13, 'total': 31.92}, 'gpu':
                         {'used': 1.33, 'total': 15.99}, 'retries': 0, 'oom': 0}
08:24:01-589192 DEBUG    Save: file="config.json" json=28 bytes=1193 time=0.001
08:24:01-590188 DEBUG    Script callback init time: image_browser.py:ui_tabs=0.55 system-info.py:app_started=0.07
                         task_scheduler.py:app_started=0.33
08:24:01-592183 INFO     Startup time: 26.65 torch=4.95 olive=0.08 gradio=1.22 libraries=2.35 extensions=0.75
                         face-restore=0.08 ui-en=0.44 ui-txt2img=0.05 ui-img2img=0.06 ui-control=0.09 ui-settings=0.21
                         ui-extensions=1.51 ui-defaults=0.08 launch=0.17 api=0.09 app-started=0.40 checkpoint=13.89
08:26:00-432494 DEBUG    Server: alive=True jobs=1 requests=3 uptime=138 memory=14.13/31.92 backend=Backend.DIFFUSERS
                         state=idle
08:26:05-960318 DEBUG    Control input: type=PIL.Image input=[<PIL.Image.Image image mode=RGB size=1024x1024 at
                         0x1BE62FBBBB0>]
08:26:12-080489 DEBUG    Control Processor loading: id="Canny" class=CannyDetector
08:26:12-081487 DEBUG    Control Processor loaded: id="Canny" class=CannyDetector time=0.00
08:26:25-050061 DEBUG    Control input: type=PIL.Image input=[<PIL.Image.Image image mode=RGB size=1024x1024 at
                         0x1BE6333DF30>]
08:26:25-451015 DEBUG    Control process unit: i=1 process=Canny
08:26:25-471959 DEBUG    Setting model: enable model CPU offload
08:26:25-504873 DEBUG    Setting model: enable VAE slicing
08:26:25-567704 DEBUG    Control Processor: id="Canny" mode=RGB args={'low_threshold': 100, 'high_threshold': 200}
                         time=0.05
08:26:25-626546 DEBUG    Pipeline class change: original=StableDiffusionXLPipeline
                         target=StableDiffusionXLImg2ImgPipeline
08:26:27-642536 DEBUG    Diffuser pipeline: StableDiffusionXLImg2ImgPipeline task=DiffusersTaskType.IMAGE_2_IMAGE
                         set={'prompt_embeds': torch.Size([1, 77, 2048]), 'pooled_prompt_embeds': torch.Size([1, 1280]),
                         'negative_prompt_embeds': torch.Size([1, 77, 2048]), 'negative_pooled_prompt_embeds':
                         torch.Size([1, 1280]), 'guidance_scale': 6, 'generator': device(type='cuda'), 'output_type':
                         'latent', 'num_inference_steps': 41, 'eta': 1.0, 'guidance_rescale': 0.7, 'denoising_start':
                         None, 'denoising_end': None, 'image': <class 'list'>, 'strength': 0.5, 'parser': 'Full parser'}
08:26:28-361051 ERROR    Processing: args={'prompt_embeds': tensor([[[-3.7871, -2.3984,  4.4648,  ...,  0.1813,  0.4131,
                         -0.3018],
                                  [ 0.4226, -0.3618, -0.6392,  ..., -0.0181,  0.4641, -0.2773],
                                  [ 0.3103, -0.2625, -1.0811,  ..., -0.2450, -0.0486, -0.5288],
                                  ...,
                                  [-0.6016,  0.1482, -0.5249,  ...,  0.2576,  0.3901,  0.5088],
                                  [-0.5986,  0.1453, -0.5112,  ...,  0.2145,  0.2971,  0.4626],
                                  [-0.5737,  0.2087, -0.4653,  ...,  0.2725,  0.4282,  0.5391]]],
                                device='cuda:0', dtype=torch.float16), 'pooled_prompt_embeds': tensor([[ 0.7725,
                         -1.0781,  0.3010,  ...,  0.1122, -1.1787, -0.8364]],
                                device='cuda:0', dtype=torch.float16), 'negative_prompt_embeds': tensor([[[-3.7871,
                         -2.3984,  4.4648,  ...,  0.1813,  0.4131, -0.3018],
                                  [-0.3245, -0.4500, -0.5449,  ...,  0.2412, -0.4714,  0.7236],
                                  [-0.3909, -0.4441, -0.5767,  ..., -0.3765,  0.2375,  0.0093],
                                  ...,
                                  [-0.2925, -0.1221, -0.3633,  ...,  0.1962,  0.2172,  0.5312],
                                  [-0.3018, -0.1279, -0.3521,  ...,  0.1561,  0.1052,  0.4858],
                                  [-0.2854, -0.0909, -0.3086,  ...,  0.2114,  0.1792,  0.5811]]],
                                device='cuda:0', dtype=torch.float16), 'negative_pooled_prompt_embeds':
                         tensor([[-0.2759,  0.5405, -0.6997,  ..., -0.9009, -0.8379,  0.8662]],
                                device='cuda:0', dtype=torch.float16), 'guidance_scale': 6, 'generator':
                         [<torch._C.Generator object at 0x000001BE6358E230>], 'output_type': 'latent',
                         'callback_on_step_end': <function process_diffusers.<locals>.diffusers_callback at
                         0x000001BE635B8CA0>, 'callback_on_step_end_tensor_inputs': ['latents', 'prompt_embeds',
                         'negative_prompt_embeds', 'add_text_embeds', 'add_time_ids', 'negative_pooled_prompt_embeds',
                         'add_neg_time_ids'], 'num_inference_steps': 41, 'eta': 1.0, 'guidance_rescale': 0.7,
                         'denoising_start': None, 'denoising_end': None, 'image': [<PIL.Image.Image image mode=RGB
                         size=1024x1024 at 0x1BE5A7CDD80>], 'strength': 0.5} Input type (torch.FloatTensor) and weight
                         type (torch.cuda.FloatTensor) should be the same or input should be a MKLDNN tensor and weight
                         is a dense tensor
08:26:28-369031 ERROR    Processing: RuntimeError
╭───────────────────────────────────────── Traceback (most recent call last) ──────────────────────────────────────────╮
│ C:\ai\automatic\modules\processing_diffusers.py:440 in process_diffusers                                             │
│                                                                                                                      │
│   439 │   │   t0 = time.time()                                                                                       │
│ ❱ 440 │   │   output = shared.sd_model(**base_args) # pylint: disable=not-callable                                   │
│   441 │   │   if isinstance(output, dict):                                                                           │
│                                                                                                                      │
│ C:\ai\automatic\venv\lib\site-packages\torch\utils\_contextlib.py:115 in decorate_context                            │
│                                                                                                                      │
│   114 │   │   with ctx_factory():                                                                                    │
│ ❱ 115 │   │   │   return func(*args, **kwargs)                                                                       │
│   116                                                                                                                │
│                                                                                                                      │
│ C:\ai\automatic\venv\lib\site-packages\diffusers\pipelines\stable_diffusion_xl\pipeline_stable_diffusion_xl_img2img. │
│                                                                                                                      │
│   1311 │   │   # 6. Prepare latent variables                                                                         │
│ ❱ 1312 │   │   latents = self.prepare_latents(                                                                       │
│   1313 │   │   │   image,                                                                                            │
│                                                                                                                      │
│ C:\ai\automatic\venv\lib\site-packages\diffusers\pipelines\stable_diffusion_xl\pipeline_stable_diffusion_xl_img2img. │
│                                                                                                                      │
│    708 │   │   │   elif isinstance(generator, list):                                                                 │
│ ❱  709 │   │   │   │   init_latents = [                                                                              │
│    710 │   │   │   │   │   retrieve_latents(self.vae.encode(image[i : i + 1]), generator=generator[i])               │
│                                                                                                                      │
│ C:\ai\automatic\venv\lib\site-packages\diffusers\pipelines\stable_diffusion_xl\pipeline_stable_diffusion_xl_img2img. │
│                                                                                                                      │
│    709 │   │   │   │   init_latents = [                                                                              │
│ ❱  710 │   │   │   │   │   retrieve_latents(self.vae.encode(image[i : i + 1]), generator=generator[i])               │
│    711 │   │   │   │   │   for i in range(batch_size)                                                                │
│                                                                                                                      │
│                                               ... 4 frames hidden ...                                                │
│                                                                                                                      │
│ C:\ai\automatic\venv\lib\site-packages\diffusers\models\autoencoders\vae.py:143 in forward                           │
│                                                                                                                      │
│   142 │   │                                                                                                          │
│ ❱ 143 │   │   sample = self.conv_in(sample)                                                                          │
│   144                                                                                                                │
│                                                                                                                      │
│ C:\ai\automatic\venv\lib\site-packages\torch\nn\modules\module.py:1511 in _wrapped_call_impl                         │
│                                                                                                                      │
│   1510 │   │   else:                                                                                                 │
│ ❱ 1511 │   │   │   return self._call_impl(*args, **kwargs)                                                           │
│   1512                                                                                                               │
│                                                                                                                      │
│ C:\ai\automatic\venv\lib\site-packages\torch\nn\modules\module.py:1520 in _call_impl                                 │
│                                                                                                                      │
│   1519 │   │   │   │   or _global_forward_hooks or _global_forward_pre_hooks):                                       │
│ ❱ 1520 │   │   │   return forward_call(*args, **kwargs)                                                              │
│   1521                                                                                                               │
│                                                                                                                      │
│ C:\ai\automatic\venv\lib\site-packages\torch\nn\modules\conv.py:460 in forward                                       │
│                                                                                                                      │
│    459 │   def forward(self, input: Tensor) -> Tensor:                                                               │
│ ❱  460 │   │   return self._conv_forward(input, self.weight, self.bias)                                              │
│    461                                                                                                               │
│                                                                                                                      │
│ C:\ai\automatic\venv\lib\site-packages\torch\nn\modules\conv.py:456 in _conv_forward                                 │
│                                                                                                                      │
│    455 │   │   │   │   │   │   │   _pair(0), self.dilation, self.groups)                                             │
│ ❱  456 │   │   return F.conv2d(input, weight, bias, self.stride,                                                     │
│    457 │   │   │   │   │   │   self.padding, self.dilation, self.groups)                                             │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
RuntimeError: Input type (torch.FloatTensor) and weight type (torch.cuda.FloatTensor) should be the same or input should be a MKLDNN tensor and weight is a dense tensor
08:26:28-887993 INFO     Processed: images=0 time=3.25 its=0.00 memory={'ram': {'used': 8.84, 'total': 31.92}, 'gpu':
                         {'used': 3.11, 'total': 15.99}, 'retries': 0, 'oom': 0}
08:26:28-923896 INFO     Control: pipeline units=0 process=1 time=3.47 init=0.00 proc=0.17 ctrl=3.30 outputs=0
08:28:00-350764 DEBUG    Server: alive=True jobs=1 requests=14 uptime=258 memory=8.84/31.92 backend=Backend.DIFFUSERS
                         state=idle
08:30:00-183157 DEBUG    Server: alive=True jobs=1 requests=14 uptime=378 memory=8.84/31.92 backend=Backend.DIFFUSERS
                         state=idle

Backend

Diffusers

Branch

Dev

Model

SD-XL


vladmandic commented 8 months ago

this is due to part of the model being in vram and part in system ram. what are your settings for model move/offload? medvram or lowvram?
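
For context, this is PyTorch's generic device-mismatch error. A minimal sketch (not SD.Next code, just an illustration; assumes a CUDA-capable machine) of how a half-offloaded model triggers exactly this message:

```python
import torch

conv = torch.nn.Conv2d(3, 8, kernel_size=3).cuda()  # weights on GPU (torch.cuda.FloatTensor)
x = torch.randn(1, 3, 64, 64)                       # input left on CPU (torch.FloatTensor)

try:
    conv(x)
except RuntimeError as e:
    print(e)
    # Input type (torch.FloatTensor) and weight type (torch.cuda.FloatTensor)
    # should be the same or input should be a MKLDNN tensor and weight is a dense tensor
```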

SAC020 commented 8 months ago

> this is due to part of the model being in vram and part in system ram. what are your settings for model move/offload? medvram or lowvram?

I use medvram. Nvidia 4080 with 16GB of VRAM

vladmandic commented 8 months ago

control module is not yet compatible with automatic offloading that is enabled with medvram.
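
For context, a minimal sketch of the two diffusers offload modes involved; mapping medvram onto enable_model_cpu_offload is an inference from the "enable model CPU offload" debug lines in the log above, and the model path is a placeholder:

```python
import torch
from diffusers import StableDiffusionXLPipeline

pipe = StableDiffusionXLPipeline.from_single_file(
    "models/Stable-diffusion/copaxTimelessxlSDXL1_v9.safetensors",  # placeholder path
    torch_dtype=torch.float16,
)

pipe.enable_model_cpu_offload()         # swaps whole sub-models (unet, vae, ...) to GPU on demand (~medvram)
# pipe.enable_sequential_cpu_offload() # per-layer offload: lowest VRAM, slowest (~lowvram)

# Once offload hooks are installed, explicitly moving components (as the control
# module does when it rebuilds the pipeline) can leave inputs and weights on
# different devices, or raise outright in the sequential case:
# pipe.to("cuda")  # ValueError: ... not compatible with offloading ...
```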

SAC020 commented 8 months ago

> control module is not yet compatible with automatic offloading that is enabled with medvram.

is there something I can do?

vladmandic commented 8 months ago

until it's fully implemented, you can run without medvram and use the manual model move options instead of offloading. it will still save some vram, just not as much.
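
Roughly, the manual-move alternative keeps the pipeline whole and shuttles it between devices around each job. A sketch continuing the `pipe` from the example above (in SD.Next this is driven by settings, not user code):

```python
pipe.to("cuda")                              # everything on GPU for the generation
image = pipe(prompt="a photo of a cat").images[0]
pipe.to("cpu")                               # frees VRAM between jobs...
torch.cuda.empty_cache()                     # ...but peak VRAM during generation stays higher than with offloading
```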

SAC020 commented 8 months ago

> control module is not yet compatible with automatic offloading that is enabled with medvram.

Does this apply to inpainting too? I seem to get the same error on inpainting as well.

vladmandic commented 8 months ago

this should be addressed in dev branch now.

SAC020 commented 8 months ago

> this should be addressed in dev branch now.

thank you

vladmandic commented 8 months ago

it's possible (likely?) there are borderline scenarios where offloading still causes issues, but they need to be taken one at a time, so please report any...

SAC020 commented 8 months ago

Got a problem using Canny XS

log canny xs.txt


SAC020 commented 8 months ago

Same with Canny Mid XL

07:37:10-407893 DEBUG    Control ControlNet model loading: id="Canny Mid XL" path="diffusers/controlnet-canny-sdxl-1.0-mid"
07:37:12-004216 DEBUG    Control ControlNet model loaded: id="Canny Mid XL" path="diffusers/controlnet-canny-sdxl-1.0-mid"
                         time=1.60
07:37:13-036848 DEBUG    Control input: type=PIL.Image input=[<PIL.Image.Image image mode=RGB size=1024x1024 at
                         0x22CE2759A50>]
07:37:13-660455 DEBUG    Control ControlNet unit: i=1 process=Canny model=Canny Mid XL strength=1.0 guess=False start=0
                         end=1
07:37:13-667895 ERROR    Control exception: It seems like you have activated sequential model offloading by calling
                         `enable_sequential_cpu_offload`, but are now attempting to move the pipeline to GPU. This is
                         not compatible with offloading. Please, move your pipeline `.to('cpu')` or consider removing
                         the move altogether if you use sequential offloading.
07:37:13-670375 ERROR    Control: ValueError
╭───────────────────────────────────────── Traceback (most recent call last) ──────────────────────────────────────────╮
│ C:\ai\automatic\modules\ui_control.py:54 in generate_click                                                            │
│                                                                                                                       │
│   53 │   │   try:                                                                                                     │
│ ❱ 54 │   │   │   for results in control_run(units, helpers.input_source, helpers.input_init, helpers.input_mask, ac  │
│   55 │   │   │   │   progress.record_results(job_id, results)                                                         │
│                                                                                                                       │
│ C:\ai\automatic\modules\control\run.py:198 in control_run                                                             │
│                                                                                                                       │
│   197 │   │   p.task_args['guess_mode'] = p.guess_mode                                                                │
│ ❱ 198 │   │   instance = controlnet.ControlNetPipeline(selected_models, shared.sd_model)                              │
│   199 │   │   pipe = instance.pipeline                                                                                │
│                                                                                                                       │
│ C:\ai\automatic\modules\control\units\controlnet.py:196 in __init__                                                   │
│                                                                                                                       │
│   195 │   │   │   │   controlnet=controlnet, # can be a list                                                          │
│ ❱ 196 │   │   │   ).to(pipeline.device)                                                                               │
│   197 │   │   elif detect.is_sd15(pipeline):                                                                          │
│                                                                                                                       │
│ C:\ai\automatic\venv\lib\site-packages\diffusers\pipelines\pipeline_utils.py:855 in to                                │
│                                                                                                                       │
│   854 │   │   if pipeline_is_sequentially_offloaded and device and torch.device(device).type == "cuda":               │
│ ❱ 855 │   │   │   raise ValueError(                                                                                   │
│   856 │   │   │   │   "It seems like you have activated sequential model offloading by calling enable_sequential_c   │
╰───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
ValueError: It seems like you have activated sequential model offloading by calling `enable_sequential_cpu_offload`, but are now attempting to move the pipeline to GPU. This is not compatible with offloading. Please, move your pipeline `.to('cpu')` or consider removing the move altogether if you use sequential offloading.

vladmandic commented 8 months ago

pushed another update

SAC020 commented 8 months ago

Not sure if related to the same root cause


PS C:\ai\automatic> .\webui.bat --medvram --debug
Using VENV: C:\ai\automatic\venv
16:42:52-845340 INFO     Starting SD.Next
16:42:52-848813 INFO     Logger: file="C:\ai\automatic\sdnext.log" level=DEBUG size=65 mode=create
16:42:52-849805 INFO     Python 3.10.11 on Windows
16:42:53-076344 INFO     Version: app=sd.next updated=2024-03-03 hash=f7ea7620
                         url=https://github.com/vladmandic/automatic/tree/dev
16:42:53-807877 INFO     Latest published version: 912237ecf7d5b3616a272f983f3f59cc405f64c3 2024-03-01T14:24:24Z
16:42:53-820774 INFO     Platform: arch=AMD64 cpu=Intel64 Family 6 Model 165 Stepping 5, GenuineIntel system=Windows
                         release=Windows-10-10.0.22631-SP0 python=3.10.11
16:39:21-659246 DEBUG    Control Processor loading: id="Canny" class=CannyDetector
16:39:21-671150 DEBUG    Control Processor loaded: id="Canny" class=CannyDetector time=0.02
16:39:27-950343 DEBUG    Control T2I-Adapter model loading: id="Canny XL" path="TencentARC/t2i-adapter-canny-sdxl-1.0"
16:39:29-125376 DEBUG    Control T2I-Adapter loaded: id="Canny XL" path="TencentARC/t2i-adapter-canny-sdxl-1.0"
                         time=1.18
16:39:30-159044 DEBUG    Control input: type=PIL.Image input=[<PIL.Image.Image image mode=RGB size=1131x1131 at
                         0x1EF9B64B9D0>]
16:39:31-550636 DEBUG    Control T2I-Adapter unit: i=1 process=Canny model=Canny XL strength=1.0 factor=1.0
16:39:31-816987 DEBUG    Control T2I-Adapter pipeline: class=StableDiffusionXLAdapterPipeline time=0.26
16:39:33-646000 DEBUG    Setting model: enable model CPU offload
16:39:35-535778 DEBUG    Setting model: enable VAE slicing
16:39:35-549172 DEBUG    Image resize: input=<PIL.Image.Image image mode=RGB size=1131x1131 at 0x1EF9B64B9D0> mode=1
                         target=1024x1024 upscaler=ESRGAN 4xNMKDSuperscale_4xNMKDSuperscale function=control_run
16:39:35-594308 DEBUG    Control Processor: id="Canny" mode=L args={'low_threshold': 100, 'high_threshold': 200}
                         time=0.02
16:39:35-654376 WARNING  Pipeline class change failed: type=DiffusersTaskType.IMAGE_2_IMAGE
                         pipeline=StableDiffusionXLAdapterPipeline AutoPipeline can't find a pipeline linked to
                         StableDiffusionXLAdapterPipeline for None
Loading model: C:\ai\automatic\models\Lora\cydohd-mixed-sdxl-v13-civitai.safetensors ━━━━━━━━━━━━━━ 0.0/228.5 MB -:--:--
Loading model: C:\ai\automatic\models\Lora\cydohd-mixed-sdxl-v10-civitai.safetensors ━━━━━━━━━━━━━━ 0.0/228.5 MB -:--:--
Loading model: C:\ai\automatic\models\Lora\sacbf-mixed-sdxl-civitai-v3.safetensors ━━━━━━━━━━━━━━━━ 0.0/228.5 MB -:--:--
Loading model: C:\ai\automatic\models\Lora\sacbf-sacdalle-sdxl-civitai-v2.safetensors ━━━━━━━━━━━━━ 0.0/228.5 MB -:--:--
16:39:40-912154 INFO     LoRA apply: ['cydohd-mixed-sdxl-v13-civitai', 'cydohd-mixed-sdxl-v10-civitai',
                         'sacbf-mixed-sdxl-civitai-v3', 'sacbf-sacdalle-sdxl-civitai-v2'] patch=0.00 load=5.25
16:39:40-915131 INFO     Base: class=StableDiffusionXLAdapterPipeline
16:39:42-257427 DEBUG    Diffuser pipeline: StableDiffusionXLAdapterPipeline task=DiffusersTaskType.TEXT_2_IMAGE
                         set={'prompt_embeds': torch.Size([1, 77, 2048]), 'pooled_prompt_embeds': torch.Size([1, 1280]),
                         'negative_prompt_embeds': torch.Size([1, 77, 2048]), 'negative_pooled_prompt_embeds':
                         torch.Size([1, 1280]), 'guidance_scale': 1, 'generator': device(type='cuda'),
                         'num_inference_steps': 6, 'eta': 1.0, 'guidance_rescale': 0.7, 'denoising_end': None,
                         'output_type': 'latent', 'width': 1024, 'height': 1024, 'adapter_conditioning_scale': 1.0,
                         'image': [<PIL.Image.Image image mode=L size=1024x1024 at 0x1EF9B654F40>], 'parser': 'Full
                         parser'}
16:39:42-334307 DEBUG    Sampler: sampler="DPM SDE" config={'num_train_timesteps': 1000, 'beta_start': 0.00085,
                         'beta_end': 0.012, 'beta_schedule': 'scaled_linear', 'prediction_type': 'epsilon',
                         'use_karras_sigmas': True}
16:39:42-952182 ERROR    Processing: args={'prompt_embeds': tensor([[[0., 0., 0.,  ..., 0., 0., 0.],
                                  [0., 0., 0.,  ..., 0., 0., 0.],
                                  [0., 0., 0.,  ..., 0., 0., 0.],
                                  ...,
                                  [0., 0., 0.,  ..., 0., 0., 0.],
                                  [0., 0., 0.,  ..., 0., 0., 0.],
                                  [0., 0., 0.,  ..., 0., 0., 0.]]], device='cuda:0',
                                dtype=torch.float16), 'pooled_prompt_embeds': tensor([[-1.1396,  0.0978,  0.5444,  ...,
                         -1.0469, -2.2773, -0.4849]],
                                device='cuda:0', dtype=torch.float16), 'negative_prompt_embeds': tensor([[[-1.8027,
                         -0.7866,  1.5137,  ...,  0.0922,  0.3796, -0.2493],
                                  [-0.0142,  0.1144, -0.0961,  ...,  0.2178,  0.1851, -0.0518],
                                  [-0.0049, -0.1482, -0.2539,  ...,  0.1390,  0.0563,  0.2590],
                                  ...,
                                  [-0.3384,  0.2327, -0.7563,  ...,  0.4963,  0.3696,  1.2988],
                                  [-0.3381,  0.2325, -0.7515,  ...,  0.4409,  0.2812,  1.3398],
                                  [-0.3186,  0.2390, -0.7334,  ...,  0.5625,  0.3701,  1.4365]]],
                                device='cuda:0', dtype=torch.float16), 'negative_pooled_prompt_embeds':
                         tensor([[-0.3142,  0.1411,  0.8540,  ..., -1.2842, -1.6533,  0.5464]],
                                device='cuda:0', dtype=torch.float16), 'guidance_scale': 1, 'generator':
                         [<torch._C.Generator object at 0x000001EF98E858B0>], 'callback_steps': 1, 'callback': <function
                         process_diffusers.<locals>.diffusers_callback_legacy at 0x000001EDA5F01120>,
                         'num_inference_steps': 6, 'eta': 1.0, 'guidance_rescale': 0.7, 'denoising_end': None,
                         'output_type': 'latent', 'width': 1024, 'height': 1024, 'adapter_conditioning_scale': 1.0,
                         'image': [<PIL.Image.Image image mode=L size=1024x1024 at 0x1EF9B654F40>]} Given groups=1,
                         weight of size [320, 768, 3, 3], expected input[1, 256, 64, 64] to have 768 channels, but got
                         256 channels instead
16:39:42-960066 ERROR    Processing: RuntimeError
╭───────────────────────────────────────── Traceback (most recent call last) ──────────────────────────────────────────╮
│ C:\ai\automatic\modules\processing_diffusers.py:417 in process_diffusers                                             │
│                                                                                                                      │
│   416 │   │   sd_models.move_model(shared.sd_model, devices.device)                                                  │
│ ❱ 417 │   │   output = shared.sd_model(**base_args) # pylint: disable=not-callable                                   │
│   418 │   │   if isinstance(output, dict):                                                                           │
│                                                                                                                      │
│ C:\ai\automatic\venv\lib\site-packages\torch\utils\_contextlib.py:115 in decorate_context                            │
│                                                                                                                      │
│   114 │   │   with ctx_factory():                                                                                    │
│ ❱ 115 │   │   │   return func(*args, **kwargs)                                                                       │
│   116                                                                                                                │
│                                                                                                                      │
│ C:\ai\automatic\venv\lib\site-packages\diffusers\pipelines\t2i_adapter\pipeline_stable_diffusion_xl_adapter.py:1137  │
│                                                                                                                      │
│   1136 │   │   else:                                                                                                 │
│ ❱ 1137 │   │   │   adapter_state = self.adapter(adapter_input)                                                       │
│   1138 │   │   │   for k, v in enumerate(adapter_state):                                                             │
│                                                                                                                      │
│ C:\ai\automatic\venv\lib\site-packages\torch\nn\modules\module.py:1511 in _wrapped_call_impl                         │
│                                                                                                                      │
│   1510 │   │   else:                                                                                                 │
│ ❱ 1511 │   │   │   return self._call_impl(*args, **kwargs)                                                           │
│   1512                                                                                                               │
│                                                                                                                      │
│ C:\ai\automatic\venv\lib\site-packages\torch\nn\modules\module.py:1520 in _call_impl                                 │
│                                                                                                                      │
│   1519 │   │   │   │   or _global_forward_hooks or _global_forward_pre_hooks):                                       │
│ ❱ 1520 │   │   │   return forward_call(*args, **kwargs)                                                              │
│   1521                                                                                                               │
│                                                                                                                      │
│                                               ... 5 frames hidden ...                                                │
│                                                                                                                      │
│ C:\ai\automatic\venv\lib\site-packages\torch\nn\modules\module.py:1511 in _wrapped_call_impl                         │
│                                                                                                                      │
│   1510 │   │   else:                                                                                                 │
│ ❱ 1511 │   │   │   return self._call_impl(*args, **kwargs)                                                           │
│   1512                                                                                                               │
│                                                                                                                      │
│ C:\ai\automatic\venv\lib\site-packages\torch\nn\modules\module.py:1520 in _call_impl                                 │
│                                                                                                                      │
│   1519 │   │   │   │   or _global_forward_hooks or _global_forward_pre_hooks):                                       │
│ ❱ 1520 │   │   │   return forward_call(*args, **kwargs)                                                              │
│   1521                                                                                                               │
│                                                                                                                      │
│ C:\ai\automatic\extensions-builtin\Lora\networks.py:396 in network_Conv2d_forward                                    │
│                                                                                                                      │
│   395 │   network_apply_weights(self)                                                                                │
│ ❱ 396 │   return originals.Conv2d_forward(self, input)                                                               │
│   397                                                                                                                │
│                                                                                                                      │
│ C:\ai\automatic\venv\lib\site-packages\torch\nn\modules\conv.py:460 in forward                                       │
│                                                                                                                      │
│    459 │   def forward(self, input: Tensor) -> Tensor:                                                               │
│ ❱  460 │   │   return self._conv_forward(input, self.weight, self.bias)                                              │
│    461                                                                                                               │
│                                                                                                                      │
│ C:\ai\automatic\venv\lib\site-packages\torch\nn\modules\conv.py:456 in _conv_forward                                 │
│                                                                                                                      │
│    455 │   │   │   │   │   │   │   _pair(0), self.dilation, self.groups)                                             │
│ ❱  456 │   │   return F.conv2d(input, weight, bias, self.stride,                                                     │
│    457 │   │   │   │   │   │   self.padding, self.dilation, self.groups)                                             │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
RuntimeError: Given groups=1, weight of size [320, 768, 3, 3], expected input[1, 256, 64, 64] to have 768 channels, but got 256 channels instead
16:39:43-446902 DEBUG    Control restored pipeline: class=StableDiffusionXLPipeline
16:39:43-449878 INFO     Processed: images=0 time=7.80 its=0.00 memory={'ram': {'used': 20.38, 'total': 63.92}, 'gpu':
                         {'used': 8.57, 'total': 15.99}, 'retries': 0, 'oom': 0}
16:39:43-508902 INFO     Control: pipeline units=1 process=1 time=11.96 init=0.27 proc=3.80 ctrl=7.89 outputs=0
16:39:43-510887 DEBUG    Control restored pipeline: class=StableDiffusionXLPipeline
vladmandic commented 8 months ago

no, that looks different.

SAC020 commented 8 months ago

> no, that looks different.

ok, should I open a separate issue?

vladmandic commented 8 months ago

yes pls

SAC020 commented 7 months ago

This seems relevant to this thread ("Cannot generate a cpu tensor from a generator of type cuda.")

Control Canny => XS Canny

06:16:41-945643 INFO     Version: app=sd.next updated=2024-03-07 hash=9e93a63e
                         url=https://github.com/vladmandic/automatic/tree/dev
06:16:42-457057 INFO     Latest published version: bc4b633e8de3b9392595982e41673177dde1333d 2024-03-07T02:13:20Z
06:16:42-470450 INFO     Platform: arch=AMD64 cpu=Intel64 Family 6 Model 165 Stepping 5, GenuineIntel system=Windows
                         release=Windows-10-10.0.22631-SP0 python=3.10.11
06:10:56-057073 DEBUG    Control input: type=PIL.Image input=[<PIL.Image.Image image mode=RGB size=862x862 at
                         0x15827F0D0F0>]
06:11:00-306309 DEBUG    Control input: type=PIL.Image input=[<PIL.Image.Image image mode=RGB size=862x862 at
                         0x158294E4AC0>]
06:11:00-868288 DEBUG    Control ControlNet-XS unit: i=1 process=Canny model=Canny strength=1.0 guess=False start=0
                         end=1
06:11:01-073632 DEBUG    Control ControlNet-XS pipeline: class=StableDiffusionXLControlNetXSPipeline time=0.20
06:11:02-982710 DEBUG    Setting model: enable model CPU offload
06:11:06-126280 DEBUG    Setting model: enable VAE slicing
06:11:06-140168 DEBUG    Image resize: input=<PIL.Image.Image image mode=RGB size=862x862 at 0x158294E4AC0> mode=1
                         target=1024x1024 upscaler=ESRGAN 4xNMKDSuperscale_4xNMKDSuperscale function=control_run
06:11:06-142649 DEBUG    Upscaler cached: type=ESRGAN model=models\ESRGAN\4xNMKDSuperscale_4xNMKDSuperscale.pth
Upscaling ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 0:00:00 0:00:02
06:11:09-038407 DEBUG    Control Processor: id="Canny" mode=RGB args={'low_threshold': 100, 'high_threshold': 200}
                         time=0.03
Loading model: C:\ai\automatic\models\Lora\cydohd-mixed-sdxl-v10-civitai.safetensors ━━━━━━━━━━━━━━ 0.0/228.5 MB -:--:--
Loading model: C:\ai\automatic\models\Lora\sacbf-mixed-sdxl-civitai-v3.safetensors ━━━━━━━━━━━━━━━━ 0.0/228.5 MB -:--:--
Loading model: C:\ai\automatic\models\Lora\sacbf-sacdalle-sdxl-civitai-v2.safetensors ━━━━━━━━━━━━━ 0.0/228.5 MB -:--:--
06:11:13-411533 INFO     LoRA apply: ['cydohd-mixed-sdxl-v10-civitai', 'sacbf-mixed-sdxl-civitai-v3',
                         'sacbf-sacdalle-sdxl-civitai-v2'] patch=0.00 load=4.27
06:11:13-415501 INFO     Base: class=StableDiffusionXLControlNetXSPipeline
06:11:14-907684 DEBUG    Diffuser pipeline: StableDiffusionXLControlNetXSPipeline task=DiffusersTaskType.TEXT_2_IMAGE
                         set={'prompt_embeds': torch.Size([1, 77, 2048]), 'pooled_prompt_embeds': torch.Size([1, 1280]),
                         'negative_prompt_embeds': torch.Size([1, 77, 2048]), 'negative_pooled_prompt_embeds':
                         torch.Size([1, 1280]), 'guidance_scale': 1, 'generator': device(type='cuda'),
                         'num_inference_steps': 6, 'eta': 1.0, 'output_type': 'latent', 'width': 1024, 'height': 1024,
                         'image': [<PIL.Image.Image image mode=RGB size=1024x1024 at 0x1583A26E110>], 'parser': 'Full
                         parser'}
06:11:14-989747 DEBUG    Sampler: sampler="DPM SDE" config={'num_train_timesteps': 1000, 'beta_start': 0.00085,
                         'beta_end': 0.012, 'beta_schedule': 'scaled_linear', 'prediction_type': 'epsilon',
                         'use_karras_sigmas': True}
06:11:15-072083 ERROR    Processing: args={'prompt_embeds': tensor([[[0., 0., 0.,  ..., 0., 0., 0.],
                                  [0., 0., 0.,  ..., 0., 0., 0.],
                                  [0., 0., 0.,  ..., 0., 0., 0.],
                                  ...,
                                  [0., 0., 0.,  ..., 0., 0., 0.],
                                  [0., 0., 0.,  ..., 0., 0., 0.],
                                  [0., 0., 0.,  ..., 0., 0., 0.]]], device='cuda:0',
                                dtype=torch.float16), 'pooled_prompt_embeds': tensor([[-0.8276, -0.1184,  0.1528,  ...,
                         -0.6650, -1.8008, -0.8369]],
                                device='cuda:0', dtype=torch.float16), 'negative_prompt_embeds': tensor([[[-1.1641,
                         -0.4395,  1.1133,  ...,  0.1495,  0.3523, -0.2423],
                                  [-0.0290, -0.0774, -0.1122,  ...,  0.2145,  0.1422, -0.0188],
                                  [ 0.0709, -0.2002, -0.1653,  ...,  0.1697,  0.0793,  0.1204],
                                  ...,
                                  [-0.1682, -0.2725, -0.8081,  ...,  0.2095,  0.1077,  1.2705],
                                  [-0.1663, -0.2717, -0.8130,  ...,  0.1311,  0.0339,  1.3057],
                                  [-0.1633, -0.2549, -0.8037,  ...,  0.2678,  0.1084,  1.4229]]],
                                device='cuda:0', dtype=torch.float16), 'negative_pooled_prompt_embeds':
                         tensor([[-0.1111, -0.0195,  0.8696,  ..., -1.1689, -1.8076,  0.8428]],
                                device='cuda:0', dtype=torch.float16), 'guidance_scale': 1, 'generator':
                         [<torch._C.Generator object at 0x00000158299B5A50>], 'callback_steps': 1, 'callback': <function
                         process_diffusers.<locals>.diffusers_callback_legacy at 0x0000015611A820E0>,
                         'num_inference_steps': 6, 'eta': 1.0, 'output_type': 'latent', 'width': 1024, 'height': 1024,
                         'image': [<PIL.Image.Image image mode=RGB size=1024x1024 at 0x1583A26E110>]} Cannot generate a
                         cpu tensor from a generator of type cuda.
06:11:15-079523 ERROR    Processing: ValueError
╭───────────────────────────────────────── Traceback (most recent call last) ──────────────────────────────────────────╮
│ C:\ai\automatic\modules\processing_diffusers.py:415 in process_diffusers                                             │
│                                                                                                                      │
│   414 │   │   sd_models.move_model(shared.sd_model, devices.device)                                                  │
│ ❱ 415 │   │   output = shared.sd_model(**base_args) # pylint: disable=not-callable                                   │
│   416 │   │   if isinstance(output, dict):                                                                           │
│                                                                                                                      │
│ C:\ai\automatic\venv\lib\site-packages\torch\utils\_contextlib.py:115 in decorate_context                            │
│                                                                                                                      │
│   114 │   │   with ctx_factory():                                                                                    │
│ ❱ 115 │   │   │   return func(*args, **kwargs)                                                                       │
│   116                                                                                                                │
│                                                                                                                      │
│ C:\ai\automatic\modules\control\units\xs_pipe.py:933 in __call__                                                     │
│                                                                                                                      │
│    932 │   │   num_channels_latents = self.unet.config.in_channels                                                   │
│ ❱  933 │   │   latents = self.prepare_latents(                                                                       │
│    934 │   │   │   batch_size * num_images_per_prompt,                                                               │
│                                                                                                                      │
│ C:\ai\automatic\modules\control\units\xs_pipe.py:619 in prepare_latents                                              │
│                                                                                                                      │
│    618 │   │   if latents is None:                                                                                   │
│ ❱  619 │   │   │   latents = randn_tensor(shape, generator=generator, device=device, dtype=dtype)                    │
│    620 │   │   else:                                                                                                 │
│                                                                                                                      │
│ C:\ai\automatic\venv\lib\site-packages\diffusers\utils\torch_utils.py:66 in randn_tensor                             │
│                                                                                                                      │
│    65 │   │   elif gen_device_type != device.type and gen_device_type == "cuda":                                     │
│ ❱  66 │   │   │   raise ValueError(f"Cannot generate a {device} tensor from a generator of type {gen_device_type}.") │
│    67                                                                                                                │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
ValueError: Cannot generate a cpu tensor from a generator of type cuda.
06:11:15-322729 DEBUG    Control restored pipeline: class=StableDiffusionXLPipeline
06:11:15-325706 INFO     Processed: images=0 time=6.19 its=0.00 memory={'ram': {'used': 8.48, 'total': 63.92}, 'gpu':
                         {'used': 2.89, 'total': 15.99}, 'retries': 0, 'oom': 0}
06:11:15-378779 INFO     Control: pipeline units=1 process=1 time=14.51 init=0.21 proc=8.06 ctrl=6.24 outputs=0
06:11:15-380268 DEBUG    Control restored pipeline: class=StableDiffusionXLPipeline
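
This ValueError is diffusers' guard in randn_tensor: with offloading active the pipeline can report CPU as its execution device while the generator was created on CUDA. A minimal sketch of the same guard firing (the latent shape is arbitrary):

```python
import torch
from diffusers.utils.torch_utils import randn_tensor

gen = torch.Generator(device="cuda")   # generator created on the GPU
shape = (1, 4, 128, 128)               # arbitrary SDXL-style latent shape

# Requesting a CPU tensor from a CUDA generator raises:
# ValueError: Cannot generate a cpu tensor from a generator of type cuda.
latents = randn_tensor(shape, generator=gen, device=torch.device("cpu"))
```
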
SAC020 commented 7 months ago

And I get the same error after running ControlNet Canny XL twice. The first time it finishes the job; the second time it fails with the same error, without me changing anything other than removing the loras from the prompt.

Upon a third attempt, I also get this:

06:28:04-324070 ERROR    Model move: device=cuda It seems like you have activated sequential model offloading by calling
                         `enable_sequential_cpu_offload`, but are now attempting to move the pipeline to GPU. This is
                         not compatible with offloading. Please, move your pipeline `.to('cpu')` or consider removing
                         the move altogether if you use sequential offloading.
06:19:35-823273 DEBUG    Control Processor loading: id="Canny" class=CannyDetector
06:19:35-824761 DEBUG    Control Processor loaded: id="Canny" class=CannyDetector time=0.00
06:19:40-139866 DEBUG    Control ControlNet model loading: id="Canny XL" path="diffusers/controlnet-canny-sdxl-1.0"
06:19:42-230859 DEBUG    Control input: type=PIL.Image input=[<PIL.Image.Image image mode=RGB size=862x862 at
                         0x146A519D1E0>]
06:19:42-828542 DEBUG    Control ControlNet unit: i=1 process=Canny model=None strength=1.0 guess=False start=0 end=1
06:19:54-874491 DEBUG    Control ControlNet model loaded: id="Canny XL" path="diffusers/controlnet-canny-sdxl-1.0"
                         time=14.73
06:19:54-896315 DEBUG    Control ControlNet pipeline: class=StableDiffusionXLControlNetPipeline time=12.07
06:19:57-290557 DEBUG    Setting model: enable model CPU offload
06:20:00-091457 DEBUG    Server: alive=True jobs=1 requests=21 uptime=178 memory=14.57/63.92 backend=Backend.DIFFUSERS
                         state=idle
06:20:01-151617 DEBUG    Setting model: enable VAE slicing
06:20:01-164016 DEBUG    Image resize: input=<PIL.Image.Image image mode=RGB size=862x862 at 0x146A519D1E0> mode=1
                         target=1024x1024 upscaler=ESRGAN 4xNMKDSuperscale_4xNMKDSuperscale function=control_run
06:20:01-244370 INFO     Upscaler loaded: type=ESRGAN model=models\ESRGAN\4xNMKDSuperscale_4xNMKDSuperscale.pth
Upscaling ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 0:00:00 0:00:02
06:20:04-381914 DEBUG    Control Processor: id="Canny" mode=RGB args={'low_threshold': 100, 'high_threshold': 200}
                         time=0.03
Loading model: C:\ai\automatic\models\Lora\cydohd-mixed-sdxl-v10-civitai.safetensors ━━━━━━━━━━━━━━ 0.0/228.5 MB -:--:--
Loading model: C:\ai\automatic\models\Lora\sacbf-mixed-sdxl-civitai-v3.safetensors ━━━━━━━━━━━━━━━━ 0.0/228.5 MB -:--:--
Loading model: C:\ai\automatic\models\Lora\sacbf-sacdalle-sdxl-civitai-v2.safetensors ━━━━━━━━━━━━━ 0.0/228.5 MB -:--:--
06:20:07-809805 INFO     LoRA apply: ['cydohd-mixed-sdxl-v10-civitai', 'sacbf-mixed-sdxl-civitai-v3',
                         'sacbf-sacdalle-sdxl-civitai-v2'] patch=0.00 load=3.32
06:20:07-819726 INFO     Base: class=StableDiffusionXLControlNetPipeline
06:20:10-267062 DEBUG    Diffuser pipeline: StableDiffusionXLControlNetPipeline task=DiffusersTaskType.TEXT_2_IMAGE
                         set={'prompt_embeds': torch.Size([1, 77, 2048]), 'pooled_prompt_embeds': torch.Size([1, 1280]),
                         'negative_prompt_embeds': torch.Size([1, 77, 2048]), 'negative_pooled_prompt_embeds':
                         torch.Size([1, 1280]), 'guidance_scale': 1, 'generator': device(type='cuda'),
                         'num_inference_steps': 6, 'eta': 1.0, 'output_type': 'latent', 'width': 1024, 'height': 1024,
                         'controlnet_conditioning_scale': 1.0, 'control_guidance_start': 0.0, 'control_guidance_end':
                         1.0, 'guess_mode': False, 'image': [<PIL.Image.Image image mode=RGB size=1024x1024 at
                         0x146851EBEB0>], 'parser': 'Full parser'}
06:20:10-369736 DEBUG    Sampler: sampler="DPM SDE" config={'num_train_timesteps': 1000, 'beta_start': 0.00085,
                         'beta_end': 0.012, 'beta_schedule': 'scaled_linear', 'prediction_type': 'epsilon',
                         'use_karras_sigmas': True}
Progress  6.59s/it █████▊                               17% 1/6 00:06 00:32 Base
06:20:17-285355 DEBUG    VAE load: type=taesd model=models\TAESD\taesdxl_decoder.pth
Progress  1.44s/it ███████████████████████████████████ 100% 6/6 00:08 00:00 Base
06:20:26-149173 DEBUG    Control restored pipeline: class=StableDiffusionXLPipeline
06:20:26-170006 INFO     Processed: images=1 time=21.68 its=0.28 memory={'ram': {'used': 18.74, 'total': 63.92}, 'gpu':
                         {'used': 2.4, 'total': 15.99}, 'retries': 0, 'oom': 0}
06:20:26-999554 DEBUG    Saving temp: image="C:\Users\sebas\AppData\Local\Temp\gradio\tmpm6rvbxie.png"
                         resolution=1024x1024 size=1636463
06:20:27-141908 INFO     Control: pipeline units=1 process=1 time=43.39 init=12.07 proc=9.58 ctrl=21.74 outputs=1
06:20:27-144387 DEBUG    Control restored pipeline: class=StableDiffusionXLPipeline
06:20:41-587360 DEBUG    Control input: type=PIL.Image input=[<PIL.Image.Image image mode=RGB size=862x862 at
                         0x1491C3EAEF0>]
06:20:42-160734 DEBUG    Control ControlNet unit: i=1 process=Canny model=Canny XL strength=1.0 guess=False start=0
                         end=1
06:20:44-573006 DEBUG    Control ControlNet pipeline: class=StableDiffusionXLControlNetPipeline time=2.41
06:20:44-642446 DEBUG    Setting model: enable model CPU offload
06:20:47-733728 DEBUG    Setting model: enable VAE slicing
06:20:47-746623 DEBUG    Image resize: input=<PIL.Image.Image image mode=RGB size=862x862 at 0x1491C3EAEF0> mode=1
                         target=1024x1024 upscaler=ESRGAN 4xNMKDSuperscale_4xNMKDSuperscale function=control_run
06:20:47-749104 DEBUG    Upscaler cached: type=ESRGAN model=models\ESRGAN\4xNMKDSuperscale_4xNMKDSuperscale.pth
Upscaling ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 0:00:00 0:00:02
06:20:49-983096 DEBUG    Control Processor: id="Canny" mode=RGB args={'low_threshold': 100, 'high_threshold': 200}
                         time=0.03
06:20:50-092712 INFO     Base: class=StableDiffusionXLControlNetPipeline
06:20:51-192517 DEBUG    Diffuser pipeline: StableDiffusionXLControlNetPipeline task=DiffusersTaskType.TEXT_2_IMAGE
                         set={'prompt_embeds': torch.Size([1, 77, 2048]), 'pooled_prompt_embeds': torch.Size([1, 1280]),
                         'negative_prompt_embeds': torch.Size([1, 77, 2048]), 'negative_pooled_prompt_embeds':
                         torch.Size([1, 1280]), 'guidance_scale': 1, 'generator': device(type='cuda'),
                         'num_inference_steps': 6, 'eta': 1.0, 'output_type': 'latent', 'width': 1024, 'height': 1024,
                         'controlnet_conditioning_scale': 1.0, 'control_guidance_start': 0.0, 'control_guidance_end':
                         1.0, 'guess_mode': False, 'image': [<PIL.Image.Image image mode=RGB size=1024x1024 at
                         0x1491C3F17B0>], 'parser': 'Full parser'}
06:20:51-275348 DEBUG    Sampler: sampler="DPM SDE" config={'num_train_timesteps': 1000, 'beta_start': 0.00085,
                         'beta_end': 0.012, 'beta_schedule': 'scaled_linear', 'prediction_type': 'epsilon',
                         'use_karras_sigmas': True}
06:20:51-320486 ERROR    Processing: args={'prompt_embeds': tensor([[[-3.8086, -2.2227,  4.2500,  ...,  0.1782,  0.4062,
                         -0.2651],
                                  [-0.3682,  0.9868, -0.2249,  ...,  0.6235,  0.0654, -0.1235],
                                  [-0.9121,  0.3735,  0.5811,  ...,  0.8818, -0.1223, -0.1892],
                                  ...,
                                  [-0.1880,  0.3547, -0.4631,  ...,  0.3096,  0.7734,  1.5244],
                                  [-0.1674,  0.3381, -0.4482,  ...,  0.2854,  0.6855,  1.5449],
                                  [-0.1707,  0.3345, -0.3923,  ...,  0.2959,  0.8662,  1.6211]]],
                                device='cuda:0', dtype=torch.float16), 'pooled_prompt_embeds': tensor([[-1.0850,
                         0.5771,  0.3389,  ..., -1.0879, -1.3359, -0.7358]],
                                device='cuda:0', dtype=torch.float16), 'negative_prompt_embeds': tensor([[[-3.8086,
                         -2.2227,  4.2500,  ...,  0.1782,  0.4062, -0.2651],
                                  [-0.2634, -0.4390, -0.5625,  ...,  0.4780, -0.5508,  0.8066],
                                  [-0.3071, -0.4297, -0.5918,  ..., -0.2783,  0.1602, -0.1042],
                                  ...,
                                  [-0.1926, -0.2605, -0.8789,  ...,  0.3950,  0.3237,  0.9126],
                                  [-0.2051, -0.2615, -0.8726,  ...,  0.3269,  0.2180,  0.9473],
                                  [-0.2271, -0.1841, -0.8457,  ...,  0.3677,  0.3582,  0.9487]]],
                                device='cuda:0', dtype=torch.float16), 'negative_pooled_prompt_embeds':
                         tensor([[-0.4524,  0.8706, -0.2712,  ..., -0.8403, -0.7446,  0.7886]],
                                device='cuda:0', dtype=torch.float16), 'guidance_scale': 1, 'generator':
                         [<torch._C.Generator object at 0x000001491C408530>], 'callback_on_step_end': <function
                         process_diffusers.<locals>.diffusers_callback at 0x000001491C3F8B80>,
                         'callback_on_step_end_tensor_inputs': ['latents', 'prompt_embeds', 'negative_prompt_embeds'],
                         'num_inference_steps': 6, 'eta': 1.0, 'output_type': 'latent', 'width': 1024, 'height': 1024,
                         'controlnet_conditioning_scale': 1.0, 'control_guidance_start': 0.0, 'control_guidance_end':
                         1.0, 'guess_mode': False, 'image': [<PIL.Image.Image image mode=RGB size=1024x1024 at
                         0x1491C3F17B0>]} Cannot generate a cpu tensor from a generator of type cuda.
06:20:51-330405 ERROR    Processing: ValueError
╭───────────────────────────────────────── Traceback (most recent call last) ──────────────────────────────────────────╮
│ C:\ai\automatic\modules\processing_diffusers.py:415 in process_diffusers                                             │
│                                                                                                                      │
│   414 │   │   sd_models.move_model(shared.sd_model, devices.device)                                                  │
│ ❱ 415 │   │   output = shared.sd_model(**base_args) # pylint: disable=not-callable                                   │
│   416 │   │   if isinstance(output, dict):                                                                           │
│                                                                                                                      │
│ C:\ai\automatic\venv\lib\site-packages\torch\utils\_contextlib.py:115 in decorate_context                            │
│                                                                                                                      │
│   114 │   │   with ctx_factory():                                                                                    │
│ ❱ 115 │   │   │   return func(*args, **kwargs)                                                                       │
│   116                                                                                                                │
│                                                                                                                      │
│ C:\ai\automatic\venv\lib\site-packages\diffusers\pipelines\controlnet\pipeline_controlnet_sd_xl.py:1262 in __call__  │
│                                                                                                                      │
│   1261 │   │   num_channels_latents = self.unet.config.in_channels                                                   │
│ ❱ 1262 │   │   latents = self.prepare_latents(                                                                       │
│   1263 │   │   │   batch_size * num_images_per_prompt,                                                               │
│                                                                                                                      │
│ C:\ai\automatic\venv\lib\site-packages\diffusers\pipelines\controlnet\pipeline_controlnet_sd_xl.py:816 in prepare_la │
│                                                                                                                      │
│    815 │   │   if latents is None:                                                                                   │
│ ❱  816 │   │   │   latents = randn_tensor(shape, generator=generator, device=device, dtype=dtype)                    │
│    817 │   │   else:                                                                                                 │
│                                                                                                                      │
│ C:\ai\automatic\venv\lib\site-packages\diffusers\utils\torch_utils.py:66 in randn_tensor                             │
│                                                                                                                      │
│    65 │   │   elif gen_device_type != device.type and gen_device_type == "cuda":                                     │
│ ❱  66 │   │   │   raise ValueError(f"Cannot generate a {device} tensor from a generator of type {gen_device_type}.") │
│    67                                                                                                                │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
ValueError: Cannot generate a cpu tensor from a generator of type cuda.
06:20:51-860136 DEBUG    Control restored pipeline: class=StableDiffusionXLPipeline
06:20:51-861624 INFO     High memory utilization: GPU=81% RAM=30% {'ram': {'used': 19.42, 'total': 63.92}, 'gpu':
                         {'used': 12.91, 'total': 15.99}, 'retries': 0, 'oom': 0}
06:20:52-181570 DEBUG    GC: collected=218 device=cuda {'ram': {'used': 19.4, 'total': 63.92}, 'gpu': {'used': 3.07,
                         'total': 15.99}, 'retries': 0, 'oom': 0} time=0.32
06:20:52-184051 INFO     Processed: images=0 time=2.09 its=0.00 memory={'ram': {'used': 19.4, 'total': 63.92}, 'gpu':
                         {'used': 3.07, 'total': 15.99}, 'retries': 0, 'oom': 0}
06:20:52-221271 INFO     Control: pipeline units=1 process=1 time=10.06 init=2.41 proc=5.51 ctrl=2.14 outputs=0
06:20:52-223752 DEBUG    Control restored pipeline: class=StableDiffusionXLPipeline
06:28:00-585456 DEBUG    Control input: type=PIL.Image input=[<PIL.Image.Image image mode=RGB size=540x540 at
                         0x1488A1D3B20>]
06:28:02-994759 DEBUG    Control input: type=PIL.Image input=[<PIL.Image.Image image mode=RGB size=540x540 at
                         0x1488A1D1FF0>]
06:28:03-527493 DEBUG    Control ControlNet unit: i=1 process=Canny model=Canny XL strength=1.0 guess=False start=0
                         end=1
06:28:04-319608 DEBUG    Control ControlNet pipeline: class=StableDiffusionXLControlNetPipeline time=0.79
06:28:04-324070 ERROR    Model move: device=cuda It seems like you have activated sequential model offloading by calling
                         `enable_sequential_cpu_offload`, but are now attempting to move the pipeline to GPU. This is
                         not compatible with offloading. Please, move your pipeline `.to('cpu')` or consider removing
                         the move altogether if you use sequential offloading.
06:28:04-355389 DEBUG    Setting model: enable model CPU offload
06:28:04-420861 DEBUG    Setting model: enable VAE slicing
06:28:04-433260 DEBUG    Image resize: input=<PIL.Image.Image image mode=RGB size=540x540 at 0x1488A1D1FF0> mode=1
                         target=1024x1024 upscaler=ESRGAN 4xNMKDSuperscale_4xNMKDSuperscale function=control_run
06:28:04-435561 DEBUG    Upscaler cached: type=ESRGAN model=models\ESRGAN\4xNMKDSuperscale_4xNMKDSuperscale.pth
Upscaling ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 0:00:00 0:00:01
06:28:05-795506 DEBUG    Control Processor: id="Canny" mode=RGB args={'low_threshold': 100, 'high_threshold': 200}
                         time=0.03
06:28:05-871394 INFO     Base: class=StableDiffusionXLControlNetPipeline
06:28:06-866215 DEBUG    Diffuser pipeline: StableDiffusionXLControlNetPipeline task=DiffusersTaskType.TEXT_2_IMAGE
                         set={'prompt_embeds': torch.Size([1, 77, 2048]), 'pooled_prompt_embeds': torch.Size([1, 1280]),
                         'negative_prompt_embeds': torch.Size([1, 77, 2048]), 'negative_pooled_prompt_embeds':
                         torch.Size([1, 1280]), 'guidance_scale': 1, 'generator': device(type='cuda'),
                         'num_inference_steps': 6, 'eta': 1.0, 'output_type': 'latent', 'width': 1024, 'height': 1024,
                         'controlnet_conditioning_scale': 1.0, 'control_guidance_start': 0.0, 'control_guidance_end':
                         1.0, 'guess_mode': False, 'image': [<PIL.Image.Image image mode=RGB size=1024x1024 at
                         0x1488A1D2F50>], 'parser': 'Full parser'}
06:28:06-941279 DEBUG    Sampler: sampler="DPM SDE" config={'num_train_timesteps': 1000, 'beta_start': 0.00085,
                         'beta_end': 0.012, 'beta_schedule': 'scaled_linear', 'prediction_type': 'epsilon',
                         'use_karras_sigmas': True}
06:28:06-986416 ERROR    Processing: args={'prompt_embeds': tensor([[[-3.8086, -2.2227,  4.2500,  ...,  0.1782,  0.4062,
                         -0.2651],
                                  [-0.3682,  0.9868, -0.2249,  ...,  0.6235,  0.0654, -0.1235],
                                  [-0.9121,  0.3735,  0.5811,  ...,  0.8818, -0.1223, -0.1892],
                                  ...,
                                  [-0.1880,  0.3547, -0.4631,  ...,  0.3096,  0.7734,  1.5244],
                                  [-0.1674,  0.3381, -0.4482,  ...,  0.2854,  0.6855,  1.5449],
                                  [-0.1707,  0.3345, -0.3923,  ...,  0.2959,  0.8662,  1.6211]]],
                                device='cuda:0', dtype=torch.float16), 'pooled_prompt_embeds': tensor([[-1.0850,
                         0.5771,  0.3389,  ..., -1.0879, -1.3359, -0.7358]],
                                device='cuda:0', dtype=torch.float16), 'negative_prompt_embeds': tensor([[[-3.8086,
                         -2.2227,  4.2500,  ...,  0.1782,  0.4062, -0.2651],
                                  [-0.2634, -0.4390, -0.5625,  ...,  0.4780, -0.5508,  0.8066],
                                  [-0.3071, -0.4297, -0.5918,  ..., -0.2783,  0.1602, -0.1042],
                                  ...,
                                  [-0.1926, -0.2605, -0.8789,  ...,  0.3950,  0.3237,  0.9126],
                                  [-0.2051, -0.2615, -0.8726,  ...,  0.3269,  0.2180,  0.9473],
                                  [-0.2271, -0.1841, -0.8457,  ...,  0.3677,  0.3582,  0.9487]]],
                                device='cuda:0', dtype=torch.float16), 'negative_pooled_prompt_embeds':
                         tensor([[-0.4524,  0.8706, -0.2712,  ..., -0.8403, -0.7446,  0.7886]],
                                device='cuda:0', dtype=torch.float16), 'guidance_scale': 1, 'generator':
                         [<torch._C.Generator object at 0x000001491C475E70>], 'callback_on_step_end': <function
                         process_diffusers.<locals>.diffusers_callback at 0x000001491C3DECB0>,
                         'callback_on_step_end_tensor_inputs': ['latents', 'prompt_embeds', 'negative_prompt_embeds'],
                         'num_inference_steps': 6, 'eta': 1.0, 'output_type': 'latent', 'width': 1024, 'height': 1024,
                         'controlnet_conditioning_scale': 1.0, 'control_guidance_start': 0.0, 'control_guidance_end':
                         1.0, 'guess_mode': False, 'image': [<PIL.Image.Image image mode=RGB size=1024x1024 at
                         0x1488A1D2F50>]} Cannot generate a cpu tensor from a generator of type cuda.
06:28:06-992864 ERROR    Processing: ValueError
╭───────────────────────────────────────── Traceback (most recent call last) ──────────────────────────────────────────╮
│ C:\ai\automatic\modules\processing_diffusers.py:415 in process_diffusers                                             │
│                                                                                                                      │
│   414 │   │   sd_models.move_model(shared.sd_model, devices.device)                                                  │
│ ❱ 415 │   │   output = shared.sd_model(**base_args) # pylint: disable=not-callable                                   │
│   416 │   │   if isinstance(output, dict):                                                                           │
│                                                                                                                      │
│ C:\ai\automatic\venv\lib\site-packages\torch\utils\_contextlib.py:115 in decorate_context                            │
│                                                                                                                      │
│   114 │   │   with ctx_factory():                                                                                    │
│ ❱ 115 │   │   │   return func(*args, **kwargs)                                                                       │
│   116                                                                                                                │
│                                                                                                                      │
│ C:\ai\automatic\venv\lib\site-packages\diffusers\pipelines\controlnet\pipeline_controlnet_sd_xl.py:1262 in __call__  │
│                                                                                                                      │
│   1261 │   │   num_channels_latents = self.unet.config.in_channels                                                   │
│ ❱ 1262 │   │   latents = self.prepare_latents(                                                                       │
│   1263 │   │   │   batch_size * num_images_per_prompt,                                                               │
│                                                                                                                      │
│ C:\ai\automatic\venv\lib\site-packages\diffusers\pipelines\controlnet\pipeline_controlnet_sd_xl.py:816 in prepare_la │
│                                                                                                                      │
│    815 │   │   if latents is None:                                                                                   │
│ ❱  816 │   │   │   latents = randn_tensor(shape, generator=generator, device=device, dtype=dtype)                    │
│    817 │   │   else:                                                                                                 │
│                                                                                                                      │
│ C:\ai\automatic\venv\lib\site-packages\diffusers\utils\torch_utils.py:66 in randn_tensor                             │
│                                                                                                                      │
│    65 │   │   elif gen_device_type != device.type and gen_device_type == "cuda":                                     │
│ ❱  66 │   │   │   raise ValueError(f"Cannot generate a {device} tensor from a generator of type {gen_device_type}.") │
│    67                                                                                                                │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
ValueError: Cannot generate a cpu tensor from a generator of type cuda.
06:28:07-221520 DEBUG    Control restored pipeline: class=StableDiffusionXLPipeline
06:28:07-223504 INFO     Processed: images=0 time=1.36 its=0.00 memory={'ram': {'used': 18.73, 'total': 63.92}, 'gpu':
                         {'used': 2.87, 'total': 15.99}, 'retries': 0, 'oom': 0}
06:28:07-257729 INFO     Control: pipeline units=1 process=1 time=3.73 init=0.79 proc=1.54 ctrl=1.40 outputs=0
06:28:07-259712 DEBUG    Control restored pipeline: class=StableDiffusionXLPipeline
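The `Model move: device=cuda` error on the third attempt is a separate symptom: once `enable_sequential_cpu_offload` has installed its hooks, diffusers rejects a later attempt to move the whole pipeline to the GPU. A hedged sketch of the conflicting call sequence (the checkpoint id is illustrative, and this is not the exact code path SD.Next takes):

```python
# Hedged sketch of the conflicting call sequence; the checkpoint id is
# illustrative and this is not the exact code path SD.Next takes.
import torch
from diffusers import StableDiffusionXLPipeline

pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
)
pipe.enable_sequential_cpu_offload()  # accelerate hooks now stream weights to the GPU per layer

# After sequential offloading is enabled, moving the whole pipeline to the GPU
# conflicts with those hooks; diffusers refuses with the message seen in the log:
# pipe.to("cuda")  # -> "...activated sequential model offloading... not compatible..."
```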
vladmandic commented 7 months ago

Are you on latest dev or master?

SAC020 commented 7 months ago

Are you on latest dev or master?

I am on dev; it was the latest version as of ~12h ago.

06:55:17-321287 INFO     Version: app=sd.next updated=2024-03-07 hash=9e93a63e
                         url=https://github.com/vladmandic/automatic/tree/dev
vladmandic commented 7 months ago

Made a few more improvements just now; there's still a ways to go. If/when the problem occurs now, it can typically be worked around by unloading the ControlNet model and reloading it.
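One hypothetical direction for the compatibility fix tracked here (not necessarily what the actual commits do): derive the noise generator's device from the pipeline's current execution device instead of assuming CUDA, so offloaded and plain GPU pipelines both receive a matching generator. The `make_generator` helper below is illustrative, not an SD.Next function:

```python
# Illustrative helper, not part of SD.Next: build the noise generator on whatever
# device the (possibly offloaded) pipeline will actually sample latents on.
import torch

def make_generator(pipe, seed: int) -> torch.Generator:
    # _execution_device is an internal diffusers property; falling back to CPU
    # keeps this safe on pipelines that don't expose it.
    device = getattr(pipe, "_execution_device", torch.device("cpu"))
    return torch.Generator(device=str(device)).manual_seed(seed)
```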