Open StratholmeBurns opened 10 months ago
i've never seen this before and cannot reproduce it (not to mention that's the function that executes on each and every image ever generated, so it's near-impossible that it would never have surfaced until now).
and you haven't provided platform information?
Sorry, forgot. Here: Win10, Firefox, AMD Radeon RX 7900 XT, AMD Ryzen 7 5800X 8-Core Processor
If it helps, I was successfully using the automatic1111 repo on an Nvidia 1070. Started using your repo after I switched to AMD. Had similar problems with inpainting just turning out grey pictures as well, which I brute-force fixed by reinstalling 2 or 3 times. Hasn't worked for this issue yet sadly x(
If you've never seen this issue, that is quite worrying I'm afraid; I do wonder where I messed up then. Sometimes I can generate 2-4 images and then it breaks at 20, 40 or 90%, pretty much randomly. And the moment it does, all generations stop working until I restart.
I have basically the same issue in processing.py:733 with the exact same error: if I train a new lora with Kohya_ss on my Nvidia 3060 12gb, Windows 10, Opera GX, Intel i7-11700, all of my images are just black with no change.
@StratholmeBurns the fact that it works but breaks after a couple of generations suggests you did nothing wrong. amd on windows is a painful combo - rocm was historically linux-only, so windows users are forced to rely on directml, which is 3x slower (at best) and not well maintained (last update was in april).
amd finally released rocm for windows recently, so i'm hoping we'll soon have torch-rocm and actual native support for amd gpus.
@Accy587 maybe in the same place, but for a very different reason. something in the lora is causing a pixel value during generation to become a non-number - likely division by zero or overflow - basically it's bad values in the lora itself. try changing alpha or some other related parameters to adjust the overall normalization inside the lora.
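To illustrate the overflow path described above (a hypothetical sketch, not the actual webui or Kohya_ss code): fp16 has a small dynamic range, so a single over-scaled LoRA weight can turn the merged tensor into inf, and any subsequent inf - inf or 0 * inf produces NaN that then poisons every pixel downstream.

```python
import numpy as np

# Hypothetical: merging an over-scaled LoRA delta into fp16 base weights.
# 70000 exceeds float16's maximum (~65504), so the cast produces inf.
base = np.ones((4, 4), dtype=np.float16)
lora_delta = np.full((4, 4), 70000.0).astype(np.float16)  # -> inf

merged = base + lora_delta    # inf everywhere
activation = merged - merged  # inf - inf -> NaN

print(np.isinf(merged).all(), np.isnan(activation).all())  # True True
```

Once a single layer produces NaN, every matmul after it is NaN too, which is why the whole image comes out black rather than just a patch.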
I dont understand how it could be lora since i am using stable-diffusion-v1-5 as my base.
doesn't matter what model you're using as base; a lora can still contain invalid values if it's trained with bad params.
I'm also having this issue with SDXL 1.0, no Lora, on a 4090.
@flowerdealer which issue? I've already noted two different possible root causes with totally different paths. Please provide some additional information, just stating "me too" almost never helps.
Getting black images too. The error is always: processing.py:732: RuntimeWarning: invalid value encountered in cast x_sample = x_sample.astype(np.uint8)
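For reference, that warning is easy to reproduce in isolation on recent NumPy: any NaN or inf left in the float image buffer triggers it during the uint8 cast. A minimal sketch (the `x_sample` here is just a stand-in array, not the webui's actual buffer):

```python
import warnings
import numpy as np

# Float image buffer containing a NaN, e.g. from a failed denoising step.
x_sample = np.array([[0.0, 127.5], [np.nan, 255.0]], dtype=np.float32)

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    out = x_sample.astype(np.uint8)

# Recent NumPy emits "RuntimeWarning: invalid value encountered in cast";
# the NaN lands in the output as an arbitrary/undefined uint8 value,
# while the finite entries cast normally (127.5 truncates to 127).
print(len(caught), out.dtype)
```

Note the cast still completes and returns a uint8 array, which is why generation keeps going and silently produces a black image instead of failing.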
My problem disappeared after setting these options: if you intend to use the Diffusers backend and/or SDXL, go to Setup/User Interface and add sd_model_refiner, diffusers_pipeline, and sdbackend to the Quicksettings; this will make controlling SDXL much easier.
So, after playing around a bit, I can get it to work with all the bells and whistles using the following settings:
precision type - full
fp32
--no-half & --no-half-vae
use fixed unet precision
disable nan check
Euler a sampler
I did not do anything to the backend settings as pilat66 wrote. I can use extensions without problems with these settings. Tested with ControlNet and After Detailer.
It seems the main cause of the images breaking at some point is the selected sampler. With DDIM, UniPC, PLMS and all 4 Karras samplers the image breaks at some point, some sooner than others. Best performance by a mile is DDIM, so it's sad having to switch back to Euler. But I managed to consistently generate images again without having to restart the server/pc every 10 minutes.
that indicates the issue is exactly as expected - it's a math rounding issue resulting in invalid pixel values.
btw, i've added an additional exception handler that will try to replace invalid values with dead pixels (note that this is not a solution, simply a stop-gap to prevent runtime errors).
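A stop-gap like the one described presumably amounts to something along these lines (a sketch assuming `np.nan_to_num` semantics; the function name and exact clamp values are illustrative, not the actual commit):

```python
import numpy as np

def to_uint8_safe(x_sample: np.ndarray) -> np.ndarray:
    """Sketch of a stop-gap: clamp NaN/inf to fixed values ("dead pixels")
    before the uint8 cast so numpy has nothing invalid left to warn about.
    This hides the symptom; the upstream NaNs are still a bug."""
    x_sample = np.nan_to_num(x_sample, nan=0.0, posinf=255.0, neginf=0.0)
    return np.clip(x_sample, 0, 255).astype(np.uint8)

bad = np.array([np.nan, -50.0, 300.0, 128.0])
print(to_uint8_safe(bad))  # [  0   0 255 128]
```

The clip is needed as well as the NaN replacement, since finite out-of-range floats also cast to garbage uint8 values.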
anyone had a chance to update & check?
Tried it, error happened on the very first generation :D
C:\Users\xxxxxxxx\Desktop\AI Gen\stable-diffusion-webui-automatic\modules\processing.py:584: RuntimeWarning: invalid value encountered in cast
sample = sample.astype(np.uint8)
same settings as I posted earlier, nothing changed setup-wise; always up to date. Your effort is appreciated btw, not sure you're getting the gratitude you deserve as a dev :)
thanks for the comments!
i've just pushed another update - I failed to realize that numpy throws a warning, not an error.
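In other words, `astype` emits a `RuntimeWarning` and keeps going, so a plain `try/except` around the cast never fires. A minimal sketch of promoting the warning so a handler can actually catch it, using the stdlib `warnings` module (recent NumPy may also support `np.errstate(invalid="raise")` for casts, but that is version-dependent):

```python
import warnings
import numpy as np

x = np.array([np.nan, 1.0], dtype=np.float32)

# By default the bad cast only warns; execution continues and a
# black image is returned as if nothing went wrong.
out = x.astype(np.uint8)

# Promote RuntimeWarning to an exception so an except clause can fire.
caught = False
try:
    with warnings.catch_warnings():
        warnings.simplefilter("error", RuntimeWarning)
        x.astype(np.uint8)
except RuntimeWarning:
    caught = True
print("promoted warning caught:", caught)
```

With the filter scoped inside `catch_warnings()`, the promotion only applies to that one cast and does not turn unrelated warnings elsewhere in the app into crashes.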
Nope, same still
C:\Users\ARCHER\Desktop\AI Gen\stable-diffusion-webui-automatic\modules\processing.py:584: RuntimeWarning: invalid value encountered in cast
sample = sample.astype(np.uint8)
app: SD.next
updated: 2023-08-26
hash: 0f10a9ce
url: https://github.com/vladmandic/automatic/tree/master
OK, I'll have to rethink this a bit next week.
Enabling the option at Settings/Compute Settings: Enable model compile (experimental) is what made the difference here.
Small update from my side:
Samplers that cause the error: UniPC, DDIM, PLMS, any DPM Karras sampler up to 2M
Samplers that work without error: Euler a, DPM++ 3M SDE Karras, DPM fast, LMS Karras
So if anyone else is having these issues, use any of these samplers, and if it helps here are my settings with which i can generate without errors:
Doesn't really solve it, but hey, if it works it works :D
Using DPM++ 3M SDE Karras atm with 65+ steps and getting fairly good results in terms of speed and quality.
I'm using DPM SDE Karras and currently having this exact problem; it was perfectly fine before updating to the latest version. Unrelated, but I'm also unable to use xFormers in the latest version.
Edit: Seems like it's much more frequent when using Euler A / default. DPM SDE only breaks every 12-20 images for me, here's what broken images may look like (using RTX 40 series cards if that helps):
In case it's helpful, I can reproduce this issue reliably by: 1) creating a controlnet and generating an image with a 1.5 model (Model A); 2) changing to any other 1.5 model (Model B); 3) generating an image with Model B; 4) observing the black squares + log error: sd_webui_forge\webui_forge_cu121_torch21\webui\modules\processing.py:968: RuntimeWarning: invalid value encountered in cast x_sample = x_sample.astype(np.uint8)
The issue persists until I deactivate the controlnet, switch to any other 1.5 model (Model C), generate an image without controlnet, then return to Model B and set up the controlnet again. At that point I can run the exact same settings as step 3 above without issues.
Sometimes I get the black images issue randomly as well, but when I do I can always resolve it the same way.
This is also on Windows 11, latest sd web ui forge.
latest sd web ui forge.
This is a different app.
ah, my fault for getting confused. hopefully it's maybe still helpful knowledge.
I ran into basically the same issue in processing.py:733 with the same error: if I train a new lora with Kohya_ss on my Nvidia 3060 12gb, Windows 10, Opera GX, Intel i7-11700, all of my images come out black.
Have you solved this problem?
english only pls
I'm using DPM SDE Karras and currently having this exact problem; it was perfectly fine before updating to the latest version. Unrelated, but I'm also unable to use xFormers in the latest version.
Edit: Seems like it's much more frequent when using Euler A / default. DPM SDE only breaks every 12-20 images for me, here's what broken images may look like (using RTX 40 series cards if that helps):
I don't know if this helps, but I can reliably reproduce this (like in the quoted post) every 12 to 20 images using an RTX 4070 SUPER and an RTX 4090, the first one from Gainward and the second from ASUS ROG Strix. The GPUs are in different systems, one of which has an i7-14700K and the other an i9-14900K. Hope this helps in diagnosing the problem. It's not a problem with the systems themselves, because I've been using ComfyUI on both and there has never been a glitch.
English only please
Okay, I want to know if this issue has been resolved. I don't think it's a problem with your code; I've tried many SD UIs and this problem occurs in all of them. At first it shows a black image after rendering. I added many parameters but it didn't work, so I plan to try a different system. Now it always prompts: modules/processing.py:968: RuntimeWarning: invalid value encountered in cast x_sample = x_sample.astype(np.uint8) and displays a black image.
Issue Description
When running any prompt with any model and using the --safe flag, at some point the generation breaks and just produces a black image. It will run for a few steps, then the preview turns black and the final image is just black. After restarting the server it can work again for 1 or 2 generations, then it breaks again.
Error is always :
processing.py:732: RuntimeWarning: invalid value encountered in cast x_sample = x_sample.astype(np.uint8)
Fiddling with settings didn't help: no-half, fp16 or fp32, tried most of them. Same with the different options like Use fixed UNet precision, Disable NaN check in produced images/latent spaces, Attempt VAE roll back when produced NaN values (experimental), etc.
Version Platform Description
Win10, Firefox, AMD Radeon RX 7900 XT, AMD Ryzen 7 5800X 8-Core Processor
Startup Log
Relevant log output
Acknowledgements