MysticDaedra opened 5 months ago
that's not a sampler thing, that's an overflow in the vae - looks like the vae baked into that model is not the fp16-fixed vae. set settings -> diffusers -> vae upcasting -> true, or load the fp16 vae explicitly.
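For reference, a minimal sketch of the "load the fp16 vae explicitly" option done with diffusers directly, outside the SD.Next UI. The `madebyollin/sdxl-vae-fp16-fix` weights are the commonly used fp16-fixed SDXL VAE; the base model id below is just an example, not the Pony checkpoint from this report:

```python
import torch
from diffusers import AutoencoderKL, StableDiffusionXLPipeline

# fp16-fixed SDXL VAE that does not overflow when decoding in half precision
vae = AutoencoderKL.from_pretrained(
    "madebyollin/sdxl-vae-fp16-fix", torch_dtype=torch.float16
)

pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",  # example model id
    vae=vae,
    torch_dtype=torch.float16,
)
pipe.to("cuda")
```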
after long conversation in discord, i still cannot reproduce this. i do believe there is something strange there, but cannot do much without reproducing it locally first.
@vladmandic same error with 'Euler a' sampler and it seems to happen randomly.
- Open SD.Next, set VAE Model = None, load the model "PonyDiffusionV6XL", configure generation settings ("Smiling girl" positive prompt, "Euler a" sampler, 20 steps, 5 CFG scale), run generation - ok.
- Load a testing model "Stable Diffusion", click Generate - ok.
- Change steps from 20 to 40, click Generate - "functional.py:282: RuntimeWarning: invalid value encountered in cast npimg = (npimg * 255).astype(np.uint8)" at ~4/40 steps progress.
- Change steps from 40 back to 20, click Generate - same error at 4/20 steps progress.
- Load the Pony model again, change steps to 75, generate - ok.
- Load the test model, click Generate - ok.
- Check Full Quality, click Generate - ok.
- Check HiDiffusion, click Generate - ok.
- Add Adapter "Full Face", add a PNG as Input Image - ok.
- Check Face Restore, click Generate - same RuntimeWarning at ~3/20 (why 20???) steps progress.
- Uncheck "Face Restore", click Generate - same error at 4/75 steps.
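As a side note on the quoted warning itself: it is numpy complaining about NaN/Inf values reaching the final uint8 cast, which is also why the resulting image comes out blank. A tiny stand-alone reproduction of that mechanism (an illustration, not SD.Next code):

```python
import numpy as np

# simulate pixel data that went NaN somewhere upstream (e.g. broken latents)
npimg = np.full((64, 64, 3), np.nan, dtype=np.float32)

# same cast as torchvision's functional.py:282
out = (npimg * 255).astype(np.uint8)  # -> RuntimeWarning: invalid value encountered in cast
print(out.min(), out.max())           # resulting values are meaningless; the image is blank/garbage
```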
@AznamirWoW that's a different issue, as those are warnings coming from the live preview, which never runs at the same precision due to the performance impact. they can be ignored, but if you want to pursue it further, create a new issue for that.
Well, it is not a live preview issue. The result of the error is a blank image generated at the end, or if the 'face restore' fails, then a black square over the face.
that is not a direct result of the error above at all. if you have a blank image at the end, fine, then leave it here. i'm saying that the specific error you've quoted comes from the live preview.
Hello, I'm using DirectML. I got a blank image - it's all white. After the progress finished, I saw something that looks like an error: E:\automatic\venv\lib\site-packages\torchvision\transforms\functional.py:282: RuntimeWarning: invalid value encountered in cast npimg = (npimg * 255).astype(np.uint8)
Using the anything-v4.5 model, with Euler a or DPM++ 2M.
After enabling "skip generation if NaN found in latents", whatever model or sampler I use, the console outputs "a NaNs is detected at step 0". I wonder why it produces NaNs even though I have used full precision.
Today I used the original backend; it loads more slowly, but it produced images normally.
Maybe it's a bug in diffusers.
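For context, the setting mentioned above amounts to a finiteness check on the latents before they are decoded; a minimal sketch of that idea (assumed, not SD.Next's actual implementation):

```python
import torch

def latents_are_finite(latents: torch.Tensor) -> bool:
    # True only if no element is NaN or +/-Inf
    return bool(torch.isfinite(latents).all())

# hypothetical use inside a denoising loop:
# if not latents_are_finite(latents):
#     raise RuntimeError(f"NaNs detected at step {step}")
```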
Just wanted to update that UniPC on my installation is still bugged, I am still getting the same error in console:
D:\automatic\venv\Lib\site-packages\torchvision\transforms\functional.py:282: RuntimeWarning: invalid value encountered in cast
npimg = (npimg * 255).astype(np.uint8)
Autocast is enabled, set to FP16.
my comment from earlier still stands - i cannot reproduce. which means i need exact steps to reproduce and as many details as possible. here i don't even know which model or platform or gpu we're talking about.
It's been a while now and the bug/issue still exists (I've even seen other folks experiencing this issue on Discord).
I wish I knew what I could do to aid in cornering this bug. There are so many variables with all the different settings that I don't know how to be "precise", except to give you my config file or something. It doesn't matter what the prompt is, and it doesn't seem to matter what the model is (afaik; tested with SDXL and SD3.5 Medium at least); with LoRA, without LoRA, with extensions, without extensions...
BTW, and I don't mean to be snarky or anything, but my hardware and model and all that are in the logs above... But to make it simple: it's an RTX 3070 8GB, Windows 11 Professional, 32GB system RAM, Ryzen 7 5700X @ 4.8GHz, dev version 59cd08f5 now.
Here's a snippet of my latest log with the "relevant" bit. No noticeable errors that I can see. Full log also attached.
17:47:22-967784 INFO Applying hypertile: unet=448
17:47:22-993790 INFO XYZ grid start: images=135 grid=1 shape=27x5 cells=1 steps=1080
17:47:22-995792 DEBUG XYZ grid process: x=1/27 y=1/5 z=1/1 total=0.01
17:47:22-998792 DEBUG XYZ grid apply sampler: "UniPC"
17:47:22-999792 DEBUG XYZ grid apply field: steps=4
17:47:23-000792 INFO Applying hypertile: unet=448
Load network: D:\Stable Diffusion Files\Models\Loras\Microwaist_XL_v01.safetensors ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 0.0/103.7 MB -:--:--
17:47:23-852588 DEBUG LoRA name="Microwaist_XL_v01" type={'ModuleTypeLora'} keys=788
17:47:24-201199 DEBUG GC: utilization={'gpu': 55, 'ram': 10, 'threshold': 25} gc={'collected': 27009, 'saved': 0.03} before={'gpu': 4.4, 'ram': 3.21} after={'gpu': 4.37, 'ram': 3.21, 'retries': 0, 'oom': 0}
device=cuda fn=activate:load_networks time=0.34
17:47:24-203201 INFO Load network: type=LoRA apply=['Microwaist_XL_v01'] te=[1.5] unet=[[1.5, 1.5, 1.5]] dims=[None] load=1.17
17:47:24-209202 INFO Base: class=StableDiffusionXLPipeline
17:47:24-210202 DEBUG Sampler: sampler="UniPC" class="UniPCMultistepScheduler config={'num_train_timesteps': 1000, 'beta_start': 0.00085, 'beta_end': 0.012, 'beta_schedule': 'scaled_linear',
'prediction_type': 'epsilon', 'predict_x0': True, 'sample_max_value': 1.0, 'solver_order': 2, 'solver_type': 'bh2', 'thresholding': False, 'use_beta_sigmas': False,
'use_exponential_sigmas': False, 'use_karras_sigmas': False, 'lower_order_final': False, 'timestep_spacing': 'leading', 'final_sigmas_type': 'zero', 'rescale_betas_zero_snr': True}
17:47:24-536289 DEBUG GC: utilization={'gpu': 55, 'ram': 10, 'threshold': 25} gc={'collected': 127, 'saved': 0.0} before={'gpu': 4.37, 'ram': 3.21} after={'gpu': 4.37, 'ram': 3.21, 'retries': 0, 'oom': 0}
device=cuda fn=__init__:prepare_model time=0.32
17:47:25-684725 DEBUG GC: utilization={'gpu': 63, 'ram': 11, 'threshold': 25} gc={'collected': 2737, 'saved': 0.03} before={'gpu': 5.06, 'ram': 3.35} after={'gpu': 5.03, 'ram': 3.35, 'retries': 0, 'oom': 0}
device=cuda fn=encode:prepare_model time=0.3
17:47:25-687726 DEBUG Torch generator: device=cuda seeds=[1969483135]
17:47:25-688727 DEBUG Diffuser pipeline: StableDiffusionXLPipeline task=DiffusersTaskType.TEXT_2_IMAGE batch=1/1x1 set={'prompt_embeds': torch.Size([1, 77, 2048]), 'pooled_prompt_embeds': torch.Size([1,
1280]), 'negative_prompt_embeds': torch.Size([1, 77, 2048]), 'negative_pooled_prompt_embeds': torch.Size([1, 1280]), 'guidance_scale': 3, 'num_inference_steps': 4, 'eta': 1.0,
'guidance_rescale': 0.7, 'denoising_end': None, 'output_type': 'latent', 'width': 896, 'height': 1024, 'parser': 'native'}
Progress 2.01s/it ███████████████████████████████████ 100% 4/4 00:08 00:00 Base
17:47:34-160425 DEBUG GC: utilization={'gpu': 67, 'ram': 10, 'threshold': 25} gc={'collected': 248, 'saved': 0.91} before={'gpu': 5.34, 'ram': 3.24} after={'gpu': 4.43, 'ram': 3.24, 'retries': 0, 'oom': 0}
device=cuda fn=process_base:nextjob time=0.31
17:47:34-162425 DEBUG Init hires: upscaler="ESRGAN 4x Ultrasharp" sampler="DPM++ 3M" resize=1523x1740 upscale=1523x1740
17:47:34-163425 INFO Upscale: mode=1 upscaler="ESRGAN 4x Ultrasharp" context="Add with forward" resize=1523x1740 upscale=1523x1740
17:47:35-221920 DEBUG VAE decode: vae name="default" dtype=torch.bfloat16 device=cuda:0 upcast=False slicing=True tiling=True latents shape=torch.Size([1, 4, 128, 112]) dtype=torch.bfloat16 device=cuda:0
time=1.057
17:47:35-593443 DEBUG GC: utilization={'gpu': 52, 'ram': 19, 'threshold': 25} gc={'collected': 127, 'saved': 2.12} before={'gpu': 4.12, 'ram': 5.93} after={'gpu': 2.0, 'ram': 5.93, 'retries': 0, 'oom': 0}
device=cuda fn=resize_hires:vae_decode time=0.35
17:47:35-896024 DEBUG GC: utilization={'gpu': 25, 'ram': 19, 'threshold': 25} gc={'collected': 127, 'saved': 0.0} before={'gpu': 2.0, 'ram': 5.93} after={'gpu': 2.0, 'ram': 5.93, 'retries': 0, 'oom': 0}
device=cuda fn=upscale:begin time=0.3
17:47:35-947098 INFO Upscaler loaded: type=ESRGAN model=D:\Stable Diffusion Files\Models\ESRGAN\ESRGAN-UltraSharp-4x.pth
Upscaling ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 0:00:00 0:00:07
17:47:43-222081 DEBUG Upscaler unloaded: type=ESRGAN model=D:\Stable Diffusion Files\Models\ESRGAN\ESRGAN-UltraSharp-4x.pth
17:47:43-566105 DEBUG GC: utilization={'gpu': 30, 'ram': 19, 'threshold': 25} gc={'collected': 454, 'saved': 0.37} before={'gpu': 2.37, 'ram': 6.0} after={'gpu': 2.0, 'ram': 6.0, 'retries': 0, 'oom': 0}
device=cuda fn=upscale:do_upscale time=0.34
17:47:44-006723 DEBUG GC: utilization={'gpu': 25, 'ram': 19, 'threshold': 25} gc={'collected': 162, 'saved': 0.0} before={'gpu': 2.0, 'ram': 5.96} after={'gpu': 2.0, 'ram': 5.96, 'retries': 0, 'oom': 0}
device=cuda fn=upscale:end time=0.32
17:47:44-008724 DEBUG Image resize: input=<PIL.Image.Image image mode=RGB size=896x1024 at 0x22693193F50> width=1523 height=1740 mode="Fixed" upscaler="ESRGAN 4x Ultrasharp" context="Add with forward"
type=image result=<PIL.Image.Image image mode=RGB size=1523x1740 at 0x226CF436BD0> time=8.41 fn=process_hires:resize_hires
17:47:44-330322 DEBUG GC: utilization={'gpu': 25, 'ram': 19, 'threshold': 25} gc={'collected': 127, 'saved': 0.0} before={'gpu': 2.0, 'ram': 5.96} after={'gpu': 2.0, 'ram': 5.96, 'retries': 0, 'oom': 0}
device=cuda fn=process_hires:resize_hires time=0.32
17:47:44-753000 DEBUG GC: utilization={'gpu': 34, 'ram': 19, 'threshold': 25} gc={'collected': 162, 'saved': 0.75} before={'gpu': 2.75, 'ram': 5.96} after={'gpu': 2.0, 'ram': 5.96, 'retries': 0, 'oom': 0}
device=cuda fn=process_hires:nextjob time=0.33
0: 640x576 (no detections), 247.1ms
Speed: 2.0ms preprocess, 247.1ms inference, 0.0ms postprocess per image at shape (1, 3, 640, 576)
[-] ADetailer: nothing detected on image 1 with 1st settings.
0: 640x576 (no detections), 186.1ms
Speed: 2.0ms preprocess, 186.1ms inference, 0.0ms postprocess per image at shape (1, 3, 640, 576)
[-] ADetailer: nothing detected on image 1 with 2nd settings.
W0000 00:00:1731808065.771732 29784 inference_feedback_manager.cc:114] Feedback manager requires a model with a single signature inference. Disabling support for feedback tensors.
W0000 00:00:1731808065.776372 23816 inference_feedback_manager.cc:114] Feedback manager requires a model with a single signature inference. Disabling support for feedback tensors.
[-] ADetailer: nothing detected on image 1 with 3rd settings.
0: 640x576 (no detections), 528.4ms
Speed: 3.0ms preprocess, 528.4ms inference, 0.0ms postprocess per image at shape (1, 3, 640, 576)
[-] ADetailer: nothing detected on image 1 with 4th settings.
17:47:46-940206 INFO Save: image="D:\Stable Diffusion Files\Outputs\text\12826-mklanRealistic_mklanRealxlV1HSD-a full body photorealistic photograph of a young.png" type=PNG width=1523 height=1740
size=10756
17:47:47-283735 DEBUG GC: utilization={'gpu': 25, 'ram': 21, 'threshold': 25} gc={'collected': 8835, 'saved': 0.0} before={'gpu': 2.0, 'ram': 6.57} after={'gpu': 2.0, 'ram': 6.57, 'retries': 0, 'oom': 0}
device=cuda fn=process_images:process_images_inner time=0.34
17:47:47-307741 INFO Processed: images=1 its=0.16 time=24.28 timers={'gc': 3.57, 'init': 1.2, 'encode': 1.47, 'args': 1.49, 'move': 0.02, 'pipeline': 8.03, 'hires': 11.0, 'post': 2.55} memory={'ram':
{'used': 6.57, 'total': 31.9}, 'gpu': {'used': 2.0, 'total': 8.0}, 'retries': 0, 'oom': 0}
EDIT: FWIW, the black appears (via live preview) immediately after inference, when VAE decoding starts. Is something causing a problem with the decoding? VAE is set to Automatic, and I have both the "fixed" VAE and the distilled VAE in the VAE folder as options for it to pick. I might try using different VAEs in a future test; running a grid atm.
On SDXL, I tested the base, fixed, and "low memory" VAEs; the black image still appears at the beginning of VAE decode.
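One way to narrow down whether the decode itself is the culprit is to decode the final latents manually with the VAE upcast to fp32. A hedged sketch, assuming a diffusers SDXL pipeline `pipe` and the final `latents` tensor are available:

```python
import torch

def decode_with_fp32_vae(pipe, latents: torch.Tensor):
    vae = pipe.vae
    orig_dtype = vae.dtype
    vae.to(dtype=torch.float32)                      # upcast VAE weights for the decode
    scaled = latents.to(torch.float32) / vae.config.scaling_factor
    with torch.no_grad():
        image = vae.decode(scaled).sample
    vae.to(dtype=orig_dtype)                         # restore the original dtype
    if not torch.isfinite(image).all():
        print("still NaN/Inf after fp32 decode -> latents were already broken")
    return image
```

If the fp32 decode is clean while the normal path produces a black image, the overflow is happening inside the VAE decode; if it is still NaN, the latents were already bad before decoding.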
i believe there is an issue, but it's not a general one and it's not something i can reproduce. and without reproduction, i cannot fix it.
so, start with an absolute minimal reproduction.
I'm not sure I know enough to even be able to do a bare minimum workflow, but here goes:
I can't really think of what else I can do to "bare minimum" on my GPU; if I turn anything else off (or on?), I'll start having issues running SDXL at all. Again, only 8GB VRAM. Many of the settings I'm using are there explicitly because without them I get OOM or massively degraded performance.
Not 100% sure which precision type is being used; I just looked in my settings and it is set to "auto".
from your log:
2024-11-16 11:59:02,387 | sd | INFO | devices | Torch parameters: backend=cuda device=cuda config=Auto dtype=torch.bfloat16 vae=torch.bfloat16 unet=torch.bfloat16 context=no_grad nohalf=False nohalfvae=False upscast=False deterministic=True test-fp16=True test-bf16=True optimization="Scaled-Dot-Product"
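(Side note: with dtype=Auto resolving to bfloat16, a quick check of what the card actually reports can help rule out a capability mismatch; a small sketch:)

```python
import torch

print(torch.cuda.get_device_name(0))        # e.g. "NVIDIA GeForce RTX 3070"
print(torch.cuda.get_device_capability(0))  # (8, 6) on Ampere
print(torch.cuda.is_bf16_supported())       # True on Ampere and newer
```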
Quantization (NNCF) is currently enabled, but the black image occurred without quantization as well.
i believe you, but please try to understand my point of view - i need a clean log. if i see hypertile or nncf or detailer or hires in the log and they have no relevance to the issue, it just makes any kind of analysis that much harder. so once again, please reproduce without anything that is not relevant - just to have as simple a log as possible. if you need medvram, that's fine. i never said disable everything - i said disable everything that is not relevant and/or needed.
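In that spirit, a minimal reproduction can even skip the UI entirely. A hedged sketch using diffusers directly (the checkpoint path is a placeholder - substitute whichever model fails for you), which isolates model + UniPC + fp16 from hires, detailers, LoRA and extensions:

```python
import torch
from diffusers import StableDiffusionXLPipeline, UniPCMultistepScheduler

pipe = StableDiffusionXLPipeline.from_single_file(
    "path/to/failing-sdxl-checkpoint.safetensors",  # placeholder path
    torch_dtype=torch.float16,
)
pipe.scheduler = UniPCMultistepScheduler.from_config(pipe.scheduler.config)
pipe.to("cuda")

latents = pipe(
    prompt="smiling girl",
    num_inference_steps=20,
    guidance_scale=5.0,
    generator=torch.Generator(device="cuda").manual_seed(42),
    output_type="latent",   # stop before VAE decode so the raw latents can be inspected
).images

print("NaN in latents:", bool(torch.isnan(latents).any()))
```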
Issue Description
Preview shows an image forming, but after finishing generation it turns black, and an error message is posted in console:
Version Platform Description
Relevant log output
Backend: Diffusers
Branch: Dev
Model: SD-XL
Acknowledgements