vladmandic / automatic

SD.Next: Advanced Implementation of Stable Diffusion and other Diffusion-based generative image models
https://github.com/vladmandic/automatic
GNU Affero General Public License v3.0

[Issue]: An issue about ESRGAN upscalers - they won't engage because of some weird numpy error. #2165

Closed - mart-hill closed this 1 year ago

mart-hill commented 1 year ago

Issue Description

This happens as soon as I generate an image, the moment the first pass finishes (the second pass has an ESRGAN upscaler selected; the latent ones seem to be fine). Enqueue is also affected by this bug. RuntimeError: Can't call numpy() on Tensor that requires grad. Use tensor.detach().numpy() instead. This particular generation error occurred after using the ⇦ button, though; I had just enabled all the additional views in the UI ('Batch', 'Seed details', 'Advanced', 'Second pass') to actually fully reuse the previous generation params.
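
For reference, the error itself is easy to reproduce in isolation. A minimal sketch (not code from the repo) showing that `.numpy()` refuses to run on any tensor still attached to the autograd graph:

```python
import torch

t = torch.rand(4, requires_grad=True)
# t.numpy()               # RuntimeError: Can't call numpy() on Tensor that requires grad.
arr = t.detach().numpy()  # works: detach() removes the autograd reference
```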

Version Platform Description

23:47:00-065994 INFO     Starting SD.Next
23:47:00-078494 INFO     Python 3.10.9 on Windows
23:47:00-605493 INFO     Version: app=sd.next updated=2023-09-10 hash=5a649f95 url=https://github.com/vladmandic/automatic/tree/master
23:47:01-639823 INFO     Platform: arch=AMD64 cpu=Intel64 Family 6 Model 85 Stepping 4, GenuineIntel system=Windows
                         release=Windows-10-10.0.19045-SP0 python=3.10.9
23:47:01-658325 INFO     nVidia CUDA toolkit detected
23:47:02-753827 INFO     Extensions: disabled=['sd-webui-bayesian-merger', 'sd-webui-additional-networks',
                         'stable-diffusion-webui-aesthetic-gradients', 'stable-diffusion-webui-visualize-cross-attention-extension']
23:47:02-755823 INFO     Extensions: enabled=['a1111-sd-webui-lycoris', 'clip-interrogator-ext', 'LDSR', 'Lora', 'model-keyword',
                         'multidiffusion-upscaler-for-automatic1111', 'ScuNET', 'sd-dynamic-thresholding', 'sd-extension-aesthetic-scorer',
                         'sd-extension-steps-animation', 'sd-extension-system-info', 'sd-webui-agent-scheduler', 'sd-webui-controlnet',
                         'sd-webui-model-converter', 'seed_travel', 'stable-diffusion-webui-images-browser', 'stable-diffusion-webui-rembg',
                         'SwinIR'] extensions-builtin
23:47:02-769824 INFO     Extensions: enabled=['a1111-sd-webui-tagcomplete', 'ABG_extension', 'adetailer',
                         'CFG-Schedule-for-Automatic1111-SD', 'embedding-inspector', 'novelai-2-local-prompt',
                         'openOutpaint-webUI-extension', 'openpose-editor', 'prompt-fusion-extension', 'PXL8', 'sd-dynamic-prompts',
                         'sd-face-editor', 'sd-infinity-grid-generator-script', 'SD-latent-mirroring', 'sd-model-preview-xd', 'sd-pixel',
                         'sd-webui-ar', 'sd-webui-aspect-ratio-helper', 'sd-webui-check-tensors', 'sd-webui-color-enhance',
                         'sd-webui-infinite-image-browsing', 'sd-webui-lora-block-weight', 'sd-webui-neutral-prompt',
                         'sd-webui-openpose-editor', 'sd-webui-pixelart', 'sd-webui-regional-prompter', 'sd-webui-supermerger',
                         'sdweb-merge-block-weighted-gui', 'sdweb-merge-board', 'sd_default_negative', 'stable-diffusion-webui-anti-burn',
                         'stable-diffusion-webui-cafe-aesthetic', 'stable-diffusion-webui-dumpunet',
                         'stable-diffusion-webui-embedding-merge', 'stable-diffusion-webui-model-toolkit',
                         'stable-diffusion-webui-pixelization', 'stable-diffusion-webui-promptgen',
                         'stable-diffusion-webui-Prompt_Generator', 'stable-diffusion-webui-sonar', 'stable-diffusion-webui-text2prompt',
                         'stable-diffusion-webui-two-shot', 'tagger', 'TokenMixer', 'ultimate-upscale-for-automatic1111', 'Umi-AI',
                         'weight_gradient'] extensions

Relevant log output

00:01:31-619433 ERROR    gradio call: RuntimeError
╭─────────────────────────────────────────────────── Traceback (most recent call last) ────────────────────────────────────────────────────╮
│ X:\AI\automatic\modules\call_queue.py:34 in f                                                                                            │
│                                                                                                                                          │
│   33 │   │   │   try:                                                                                                                    │
│ ❱ 34 │   │   │   │   res = func(*args, **kwargs)                                                                                         │
│   35 │   │   │   │   progress.record_results(id_task, res)                                                                               │
│                                                                                                                                          │
│ X:\AI\automatic\modules\txt2img.py:65 in txt2img                                                                                         │
│                                                                                                                                          │
│   64 │   if processed is None:                                                                                                           │
│ ❱ 65 │   │   processed = processing.process_images(p)                                                                                    │
│   66 │   p.close()                                                                                                                       │
│                                                                                                                                          │
│ X:\AI\automatic\modules\processing.py:622 in process_images                                                                              │
│                                                                                                                                          │
│    621 │   │   else:                                                                                                                     │
│ ❱  622 │   │   │   res = process_images_inner(p)                                                                                         │
│    623 │   finally:                                                                                                                      │
│                                                                                                                                          │
│ X:\AI\automatic\extensions-builtin\sd-webui-controlnet\scripts\batch_hijack.py:42 in processing_process_images_hijack                    │
│                                                                                                                                          │
│    41 │   │   │   # we are not in batch mode, fallback to original function                                                              │
│ ❱  42 │   │   │   return getattr(processing, '__controlnet_original_process_images_inner')(p,                                            │
│    43                                                                                                                                    │
│                                                                                                                                          │
│ X:\AI\automatic\modules\processing.py:759 in process_images_inner                                                                        │
│                                                                                                                                          │
│    758 │   │   │   │   with devices.without_autocast() if devices.unet_needs_upcast else device                                          │
│ ❱  759 │   │   │   │   │   samples_ddim = p.sample(conditioning=c, unconditional_conditioning=u                                          │
│    760 │   │   │   │   x_samples_ddim = [decode_first_stage(p.sd_model, samples_ddim[i:i+1].to(                                          │
│                                                                                                                                          │
│                                                         ... 2 frames hidden ...                                                          │
│                                                                                                                                          │
│ X:\AI\automatic\modules\images.py:231 in resize                                                                                          │
│                                                                                                                                          │
│   230 │   │   │   │   upscaler = upscalers[0]                                                                                            │
│ ❱ 231 │   │   │   im = upscaler.scaler.upscale(im, scale, upscaler.data_path)                                                            │
│   232 │   │   if im.width != w or im.height != h:                                                                                        │
│                                                                                                                                          │
│ X:\AI\automatic\modules\upscaler.py:60 in upscale                                                                                        │
│                                                                                                                                          │
│    59 │   │   │   shape = (img.width, img.height)                                                                                        │
│ ❱  60 │   │   │   img = self.do_upscale(img, selected_model)                                                                             │
│    61 │   │   │   if shape == (img.width, img.height):                                                                                   │
│                                                                                                                                          │
│ X:\AI\automatic\modules\esrgan_model.py:150 in do_upscale                                                                                │
│                                                                                                                                          │
│   149 │   │   model.to(devices.device_esrgan)                                                                                            │
│ ❱ 150 │   │   img = esrgan_upscale(model, img)                                                                                           │
│   151 │   │   return img                                                                                                                 │
│                                                                                                                                          │
│ X:\AI\automatic\modules\esrgan_model.py:224 in esrgan_upscale                                                                            │
│                                                                                                                                          │
│   223 │   │   │                                                                                                                          │
│ ❱ 224 │   │   │   output = upscale_without_tiling(model, tile)                                                                           │
│   225 │   │   │   scale_factor = output.width // tile.width                                                                              │
│                                                                                                                                          │
│ X:\AI\automatic\modules\esrgan_model.py:204 in upscale_without_tiling                                                                    │
│                                                                                                                                          │
│   203 │   │   output = model(img)                                                                                                        │
│ ❱ 204 │   output = output.squeeze().float().cpu().clamp_(0, 1).numpy()                                                                   │
│   205 │   output = 255. * np.moveaxis(output, 0, 2)                                                                                      │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
RuntimeError: Can't call numpy() on Tensor that requires grad. Use tensor.detach().numpy() instead.

Backend

Original

Model

SD 1.5


vladmandic commented 1 year ago

i haven't seen this issue before, but the fix seems easy. fixed.
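
for reference, the usual one-line fix for this class of error is to detach the tensor before converting it - a sketch of the idea only, the actual commit may differ:

```python
# modules/esrgan_model.py, upscale_without_tiling() - illustrative, not the actual diff
output = model(img)
output = output.detach().squeeze().float().cpu().clamp_(0, 1).numpy()
```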

mart-hill commented 1 year ago

This happened just on this commit; the previous one was fine. I'll test it ASAP! :) BTW, I like how you fixed the pixelated output of the second-pass images when the denoise is set to a low value, like 0.35, and the upscaler is a 'Latent' one. The images are all nice and sharp now. Previously, I was using ESRGAN upscalers to alleviate that.🙂

vladmandic commented 1 year ago

that's really strange, as there were pretty much zero changes anywhere near that code.

mart-hill commented 1 year ago

That's really interesting. Could it be one of the add-ons? I didn't see any numpy-related changes at startup, though.

vladmandic commented 1 year ago

ah, i got it. missing default value for the new experimental setting -> optimizations -> torch inference mode
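
for context, this is what that setting controls - with no default, the upscaler ran in plain grad mode, so its output still required grad (illustrative sketch, not the actual sd.next code):

```python
import torch

net = torch.nn.Conv2d(3, 3, 3, padding=1)  # stand-in for the esrgan network
x = torch.rand(1, 3, 8, 8)

print(net(x).requires_grad)       # True: plain grad mode, .numpy() would fail

with torch.no_grad():             # the older 'no-grad' option
    print(net(x).requires_grad)   # False

with torch.inference_mode():      # the new experimental option
    print(net(x).requires_grad)   # False, with a bit less overhead
```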

mart-hill commented 1 year ago

Sorry that I keep finding such obscure bugs and making more work for you... 🙂

vladmandic commented 1 year ago

naah, better you than someone else! but yeah, you're fast! 🥇

mart-hill commented 1 year ago

I noticed one strange thing about ESRGAN upscaler use in the second pass after the numpy fix - the second pass doesn't actually kick in at all! It's just the "raw" upscale (I set it to x2) that takes place now. Look, I have 40 steps for the first pass and 20 for the second, and this was the first generated image after a startup (I "unlocked" the options the UI hides by default, of course). 🙂

100%|██████████████████████████████████████████████████████████████████████████████████████████████████████| 40/40 [00:08<00:00,  4.50it/s]
Loading CLiP model ViT-L/14
02:03:14-335446 INFO     Processed: images=1 time=40.10s its=1.00 memory={'ram': {'used': 7.79, 'total': 63.68}, 'gpu': {'used': 4.77,
                         'total': 24.0}, 'retries': 0, 'oom': 0}

Latent upscalers do initiate the second pass correctly (something is choking the VRAM in the VAE phase; it's probably --no-half-vae):


100%|██████████████████████████████████████████████████████████████████████████████████████████████████████| 40/40 [00:06<00:00,  6.26it/s]
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████| 20/20 [00:20<00:00,  1.03s/it]
02:04:32-472946 INFO     GPU high memory utilization: 100% {'ram': {'used': 10.23, 'total': 63.68}, 'gpu': {'used': 24.0, 'total': 24.0},
                         'retries': 0, 'oom': 0}
02:04:37-233946 INFO     Processed: images=1 time=39.16s its=1.02 memory={'ram': {'used': 6.76, 'total': 63.68}, 'gpu': {'used': 3.89,
                         'total': 24.0}, 'retries': 0, 'oom': 0}

Edit: Oh. Maybe because of that? I had 'no-grad' checked when the inference mode option was first introduced.🙂 [screenshot of the setting]

vladmandic commented 1 year ago

I noticed one strange thing about ESRGAN upscaler use in the second pass after the numpy fix - the second pass doesn't actually kick in at all!

that is intentional, i just made that change. nothing to do with inference mode. if you've selected latent upscale, you'll use hires. if you've selected some standalone upscaler, then it's exactly what you selected and hires should not kick in. i understand that is a change in behavior, but what is the point of running something like esrgan+hires?

mart-hill commented 1 year ago

Could you restore the functionality with ESRGAN, where hires also works, like before? It's actually vital for pretty much every generation I do (and not only mine). The quality of hires-ed images treated with the ESRGAN models 4x_RealisticRescaler_100000_G or 4x_foolhardy_Remacri is better than even with the latent method, I think.🙂

For "raw" upscaling i would use either Topaz Photo AI (upscaler part, very good), or the Process tab. Pretty please? That's a HUGE change in behavior, and I'm sure it'll cause an uproar - many models rely on ERSGAN hires-ing, even if subtle with low 'denoise' value. Would it be possible to set it as optional in settings? :)

vladmandic commented 1 year ago

i'll add a "force hires" checkbox that runs hires regardless of upscaler. but hires was designed to work with the latent upscaler only, the rest is who-knows-what.
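
roughly, the new decision could look like this - a hypothetical sketch, names are illustrative and not the actual implementation:

```python
# hypothetical sketch of the second-pass decision; not actual sd.next code
from typing import Any, Callable

def second_pass(latent: Any, image: Any, scale: float,
                latent_upscaler: bool, force_hires: bool,
                upscale: Callable, hires: Callable) -> Any:
    if latent_upscaler:
        return hires(upscale(latent, scale))   # latent upscaler: hires always runs
    image = upscale(image, scale)              # standalone upscaler, e.g. esrgan
    if force_hires:
        return hires(image)                    # the new checkbox: run hires anyway
    return image
```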

mart-hill commented 1 year ago

Thank you 🙂 Actually, you can tell that I didn't know about it; I used the two ESRGAN models mentioned above practically every time with the hires option, especially after I noticed pixelated outputs with a low 'denoise' value (when I didn't want drastic changes to the hires-ed image, because the 'basic' image was already fine) in A1111 (and in your and the UX forks) while using latent upscalers back in the day.

mart-hill commented 1 year ago

Would this also change how the metadata from an image is "pasted" back into txt2img or img2img? Because hires would now be ignored for practically every image I've generated until now (because of hires+ESRGAN), I wonder if that part would ever be "pasted back" into the appropriate fields... 🙂

vladmandic commented 1 year ago

new hires logic is now live, check it out. details in changelog.

mart-hill commented 1 year ago

Thank you, gotta do it ASAP! I also got an Intel Arc, so I can test things on a separate machine :)

mart-hill commented 1 year ago

It works! Thank you. 🙂 There's a bit of "latency" between what's already done and what the progress bars in the UI and in the shell are showing - is that because of Gradio changes? Loading a model also doesn't show a progress bar while loading and while calculating the checksum; the bar just pops in when it's done. :)

vladmandic commented 1 year ago

loading model - which progress bar? which model type? you need to be more specific.

mart-hill commented 1 year ago

Oh, sorry - just loading an "original backend" model (not the VAE or the refiner; I didn't test the diffusers pipeline yet) that is yet to be checksum-ed, or when I change the model: the progress bar for calculating the checksum now just pops in after it ends. :) I think a similar thing happens now with the progress bar while rendering a UniPC-sampled image. All the progress bars I'm mentioning are in PowerShell. But the UI one also behaves a bit differently now (when a second pass is used and the second sampler is UniPC).

vladmandic commented 1 year ago

i cannot reproduce the command line progress bar issues when using ms terminal. regarding the browser progress bar, yeah, there are some tweaks needed there, but it's not the highest priority.

mart-hill commented 1 year ago

That's okay, it's just a very minor quirk :)