lanice opened 8 months ago
i can see from your log exactly what you describe - control unit contains both valid processor and model, but after preprocess finishes, generate never executes
but i've tried to reproduce and both xs and lite work fine for me. need more info on how to reproduce.
Is there anything I can do to provide more info? E.g., more/different logs, or testing a different workflow?
Btw I have the same issue for ControlNet, T2I Adapter, and Reference, so basically all Control types.
Logs only confirm what you're saying - it doesn't get triggered.
Try documenting workflow step-by-step until reproduction.
Workflow that leads to this issue for me:
Nothing out of the ordinary in terms of workflow, unfortunately.
Yeah, that's as simple as it gets. And I can't reproduce.
no matter what i do, i still can't reproduce
I've come a bit closer to an idea of why this is happening; I think it has something to do with workload and resources. As I mentioned in the first post, sometimes it would generate one image, but nothing after that until I restart the server. I have now tried different resolutions and batch counts, and found the following:
So I assume something weird is going on with workloads that get close to my VRAM limit? 3060 with 12 GB VRAM.
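If it helps, I can try to confirm the VRAM theory by logging CUDA memory from inside the server process around one Control run. A rough diagnostic sketch (not SD.Next code):

```python
import torch

# Diagnostic sketch only (not SD.Next code): compare peak allocation
# against the 12 GB card limit around one Control generation.
def report_vram(tag: str) -> None:
    free, total = torch.cuda.mem_get_info()
    peak = torch.cuda.max_memory_allocated()
    print(f"{tag}: peak allocated {peak / 1e9:.2f} GB, free {free / 1e9:.2f} of {total / 1e9:.2f} GB")

torch.cuda.reset_peak_memory_stats()
report_vram("before")
# ... trigger one generation from the Control tab here ...
report_vram("after")
```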
And I found one more oddity: After being in the state mentioned above ("control 100% Finishing"), if I then go over to the Text tab and try to generate any image (with default settings), the log gives the following error:
00:37:51-609760 INFO Applying hypertile: unet=256
00:37:51-779687 DEBUG Diffuser pipeline: StableDiffusionXLControlNetXSPipeline task=DiffusersTaskType.TEXT_2_IMAGE
set={'prompt_embeds': torch.Size([1, 77, 2048]), 'pooled_prompt_embeds': torch.Size([1, 1280]),
'negative_prompt_embeds': torch.Size([1, 77, 2048]), 'negative_pooled_prompt_embeds': torch.Size([1, 1280]),
'guidance_scale': 6, 'generator': device(type='cuda'), 'output_type': 'latent', 'num_inference_steps': 20,
'eta': 1.0, 'width': 512, 'height': 512, 'parser': 'Full parser'}
00:37:51-797122 ERROR Exception: image must be passed and be one of PIL image, numpy array, torch tensor, list of PIL images, list of
numpy arrays or list of torch tensors, but is <class 'NoneType'>
00:37:51-798167 ERROR Arguments: args=('task(xjp7x9w55mb7nbq)', 'a puppy dog', '', [], 20, 0, 0, True, False, False, 1, 1, 6, 6, 0.7,
0, 1, 1, -1.0, -1.0, 0, 0, 0, 512, 512, False, 0.5, 2, 'None', False, 20, 0, 0, 5, 0.8, '', '', 0, 0, 0, 0,
False, 4, 0.95, False, 0.6, 1, '#000000', 0, [], 0, 1, 'None', 'None', 'None', 'None', 0.5, 0.5, 0.5, 0.5,
None, None, None, None, 0, 0, 0, 0, 1, 1, 1, 1, 'None', 16, 'None', 1, True, 'None', 2, True, 1, 0, True,
'none', 3, 4, 0.25, 0.25, 3, 1, 1, 0.8, 8, 64, True, 1, 1, 0.5, 0.5, False, False, 'positive', 'comma', 0,
False, False, '', 'None', '', 1, '', 'None', True, 0, 'None', 2, True, 1, 0, 0, '', [], 0, '', [], 0, '', [],
False, True, False, False, False, False, 0, 5, 'all', 'all', 'all', '', '', '', '1', 'none', False, '', '',
'comma', '', True, '', '20', 'all', 'all', 'all', 'all', 0, '', True, '0', False, 'None', [], 'FaceID Base',
True, True, 1, 1, 1, 0.5, True, 'person', 1, 0.5, True) kwargs={}
00:37:51-801225 ERROR gradio call: TypeError
╭───────────────────────────────────────────────── Traceback (most recent call last) ──────────────────────────────────────────────────╮
│ /home/lanice/automatic/modules/call_queue.py:31 in f │
│ │
│ 30 │ │ │ try: │
│ ❱ 31 │ │ │ │ res = func(*args, **kwargs) │
│ 32 │ │ │ │ progress.record_results(id_task, res) │
│ │
│ /home/lanice/automatic/modules/txt2img.py:88 in txt2img │
│ │
│ 87 │ if processed is None: │
│ ❱ 88 │ │ processed = processing.process_images(p) │
│ 89 │ p.close() │
│ │
│ /home/lanice/automatic/modules/processing.py:187 in process_images │
│ │
│ 186 │ │ │ with context_hypertile_vae(p), context_hypertile_unet(p): │
│ ❱ 187 │ │ │ │ processed = process_images_inner(p) │
│ 188 │
│ │
│ /home/lanice/automatic/modules/processing.py:297 in process_images_inner │
│ │
│ 296 │ │ │ │ │ from modules.processing_diffusers import process_diffusers │
│ ❱ 297 │ │ │ │ │ x_samples_ddim = process_diffusers(p) │
│ 298 │ │ │ │ else: │
│ │
│ /home/lanice/automatic/modules/processing_diffusers.py:441 in process_diffusers │
│ │
│ 440 │ │ t0 = time.time() │
│ ❱ 441 │ │ output = shared.sd_model(**base_args) # pylint: disable=not-callable │
│ 442 │ │ if isinstance(output, dict): │
│ │
│ /home/lanice/automatic/venv/lib/python3.10/site-packages/torch/utils/_contextlib.py:115 in decorate_context │
│ │
│ 114 │ │ with ctx_factory(): │
│ ❱ 115 │ │ │ return func(*args, **kwargs) │
│ 116 │
│ │
│ /home/lanice/automatic/modules/control/units/xs_pipe.py:856 in __call__ │
│ │
│ 855 │ │ # 1. Check inputs. Raise error if not correct │
│ ❱ 856 │ │ self.check_inputs( │
│ 857 │ │ │ prompt, │
│ │
│ /home/lanice/automatic/modules/control/units/xs_pipe.py:517 in check_inputs │
│ │
│ 516 │ │ ): │
│ ❱ 517 │ │ │ self.check_image(image, prompt, prompt_embeds) │
│ 518 │ │ else: │
│ │
│ /home/lanice/automatic/modules/control/units/xs_pipe.py:559 in check_image │
│ │
│ 558 │ │ ): │
│ ❱ 559 │ │ │ raise TypeError( │
│ 560 │ │ │ │ f"image must be passed and be one of PIL image, numpy array, torch tensor, list of PIL images, list of numpy │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
TypeError: image must be passed and be one of PIL image, numpy array, torch tensor, list of PIL images, list of numpy arrays or list of torch tensors, but is <class 'NoneType'>
"100% finishing" is typical vram spike during vae decode. and yes, control modules are memory hungry.
attempting to run txt2img afterwards fails because the pipeline is still set to the controlnet pipeline and it expects an image as input. and it's still set because pipeline restore is done AFTER vae decode.
00:37:51-779687 DEBUG Diffuser pipeline: StableDiffusionXLControlNetXSPipeline task=DiffusersTaskType.TEXT_2_IMAGE
i could do a pipeline reset to base in txt2img, but it would do nothing to solve the original problem with vram - not much i can do there.
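for reference, such a reset would look roughly like the sketch below. it is written against the plain diffusers sdxl controlnet pipeline (the xs variant folds control into the unet, so it would need its own handling), and the helper name is illustrative rather than an sd.next internal:

```python
from diffusers import StableDiffusionXLControlNetPipeline, StableDiffusionXLPipeline

def reset_to_base(pipe):
    # illustrative only: rebuild a plain text-to-image pipeline from a leftover
    # controlnet pipeline so txt2img no longer demands an `image` argument
    if isinstance(pipe, StableDiffusionXLControlNetPipeline):
        # .components is the shared module dict (vae, unet, text encoders, scheduler, ...);
        # drop the controlnet entry and hand the rest to the base constructor
        components = {k: v for k, v in pipe.components.items() if k != "controlnet"}
        return StableDiffusionXLPipeline(**components)
    return pipe
```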
"100% finishing" is typical vram spike during vae decode. and yes, control modules are memory hungry.
Makes sense. What I don't understand is: the vae decode actually succeeds, but it's still stuck on "100% Finishing". In the screenshot I posted above, in the "Output" section, while on the Gallery tab it looks like it's still processing, but if I switch to the Image tab (in the Output section), it shows me the finished image. And the finished image is saved on disk.
that's interesting. i'll take a look at that, but since it's near-impossible for me to reproduce, not sure how much i can do.
i cannot reproduce this last issue.
is this still an issue? it's been sitting as "cannot reproduce" for a while now and in the meantime there have been many app updates.
Yes, same issue. Just updated to latest master, and tried it again. It generates an image (which is actually saved), but the UI is stuck on "Generate 100% Finishing", and I need to restart the server.
Actually, it's not just when using ControlNet; it happens when using the Control tab specifically. Same issue when using just a text prompt with nothing else:
Using the Text tab instead works with no issues.
can you try using chrome?
Just tried it, same issue.
Issue Description
When using the Control tab, the preprocessor runs successfully, but it never continues to actually generate an image. Using Control XS Canny here, but also tested with Lite and the normal ControlNet.
In some cases, I had it that the very first image was generated, but whenever I hit "Generate" after the first image, it would not do anything anymore. But in recent tries it did not even generate the first image.
Using a fresh install of the dev branch btw. Also tried master, same issue. Will include a complete log with --debug and SD_CONTROL_DEBUG=true SD_PROCESS_DEBUG=true below.
Version Platform Description
Starting SD.Next
Logger: file="/home/lanice/automatic/sdnext.log" level=DEBUG size=64 mode=create
Python 3.10.13 on Linux
Version: app=sd.next updated=2024-02-11 hash=3f8da51e url=https://github.com/vladmandic/automatic.git/tree/dev
Updating main repository
Upgraded to version: 3f8da51e Sun Feb 11 13:44:39 2024 +0300
Platform: arch=x86_64 cpu= system=Linux release=6.7.2-zen1 python=3.10.13
nVidia CUDA toolkit detected: nvidia-smi present
Startup: standard
Extensions enabled: ['stable-diffusion-webui-images-browser', 'sd-extension-system-info',
Extension preload: {'extensions-builtin': 0.0, 'extensions': 0.0}
Command line args: ['--listen', '--port', '9000', '--insecure', '--debug', '--upgrade'] insecure=True
Load packages: {'torch': '2.2.0+cu121', 'diffusers': '0.26.2', 'gradio': '3.43.2'}
Engine: backend=Backend.DIFFUSERS compute=cuda device=cuda attention="Scaled-Dot-Product" mode=no_grad
Device: device=NVIDIA GeForce RTX 3060 n=1 arch=sm_90 cap=(8, 6) cuda=12.1 cudnn=8902 driver=545.29.06
Relevant log output
Backend
Diffusers
Branch
Dev
Model
SD-XL