TheLastBen / fast-stable-diffusion

fast-stable-diffusion + DreamBooth
MIT License

weird errors on colab #248

Open hundkillen opened 2 years ago

hundkillen commented 2 years ago

Can someone help me figure this out? Since the updates, things are acting really weird. Sometimes I can generate 1-2 images before the errors begin; other times they begin at once. If I get the error, I have to rerun every cell again. This is really annoying, since I just bought Colab Pro+ to use SD for a project, and now nothing is working.

Sometimes it doesn't even help to rerun the cells; the only choice I have is to delete the SD folder and start from scratch again. The error I'm getting is this:

Traceback (most recent call last):
  File "/content/gdrive/MyDrive/sd/stable-diffusion-webui/modules/ui.py", line 221, in f
    res = list(func(*args, **kwargs))
  File "/content/gdrive/MyDrive/sd/stable-diffusion-webui/webui.py", line 63, in f
    res = func(*args, **kwargs)
  File "/content/gdrive/MyDrive/sd/stable-diffusion-webui/modules/txt2img.py", line 48, in txt2img
    processed = process_images(p)
  File "/content/gdrive/MyDrive/sd/stable-diffusion-webui/modules/processing.py", line 359, in process_images
    res = process_images_inner(p)
  File "/content/gdrive/MyDrive/sd/stable-diffusion-webui/modules/processing.py", line 452, in process_images_inner
    samples_ddim = p.sample(conditioning=c, unconditional_conditioning=uc, seeds=seeds, subseeds=subseeds, subseed_strength=p.subseed_strength)
  File "/content/gdrive/MyDrive/sd/stable-diffusion-webui/modules/processing.py", line 600, in sample
    samples = self.sampler.sample(self, x, conditioning, unconditional_conditioning, image_conditioning=self.create_dummy_mask(x))
  File "/content/gdrive/MyDrive/sd/stable-diffusion-webui/modules/sd_samplers.py", line 463, in sample
    samples = self.launch_sampling(steps, lambda: self.func(self.model_wrap_cfg, x, extra_args={
  File "/content/gdrive/MyDrive/sd/stable-diffusion-webui/modules/sd_samplers.py", line 365, in launch_sampling
    return func()
  File "/content/gdrive/MyDrive/sd/stable-diffusion-webui/modules/sd_samplers.py", line 468, in <lambda>
    }, disable=False, callback=self.callback_state, **extra_params_kwargs))
  File "/usr/local/lib/python3.7/dist-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/content/gdrive/MyDrive/sd/stable-diffusion/src/k-diffusion/k_diffusion/sampling.py", line 173, in sample_dpm_2_ancestral
    denoised_2 = model(x_2, sigma_mid * s_in, **extra_args)
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/content/gdrive/MyDrive/sd/stable-diffusion-webui/modules/sd_samplers.py", line 298, in forward
    x_out[a:b] = self.inner_model(x_in[a:b], sigma_in[a:b], cond={"c_crossattn": [tensor[a:b]], "c_concat": [image_cond_in[a:b]]})
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/content/gdrive/MyDrive/sd/stable-diffusion/src/k-diffusion/k_diffusion/external.py", line 112, in forward
    eps = self.get_eps(input * c_in, self.sigma_to_t(sigma), **kwargs)
  File "/content/gdrive/MyDrive/sd/stable-diffusion/src/k-diffusion/k_diffusion/external.py", line 138, in get_eps
    return self.inner_model.apply_model(*args, **kwargs)
  File "/content/gdrive/MyDrive/sd/stable-diffusion-webui/ldm/models/diffusion/ddpm.py", line 987, in apply_model
    x_recon = self.model(x_noisy, t, **cond)
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/content/gdrive/MyDrive/sd/stable-diffusion-webui/ldm/models/diffusion/ddpm.py", line 1410, in forward
    out = self.diffusion_model(x, t, context=cc)
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/content/gdrive/MyDrive/sd/stable-diffusion-webui/ldm/modules/diffusionmodules/openaimodel.py", line 732, in forward
    h = module(h, emb, context)
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/content/gdrive/MyDrive/sd/stable-diffusion-webui/ldm/modules/diffusionmodules/openaimodel.py", line 85, in forward
    x = layer(x, context)
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/content/gdrive/MyDrive/sd/stable-diffusion-webui/ldm/modules/attention.py", line 418, in forward
    x = block(x, context=context)
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/content/gdrive/MyDrive/sd/stable-diffusion-webui/ldm/modules/attention.py", line 254, in forward
    hidden_states = self.ff(self.norm3(hidden_states)) + hidden_states
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/content/gdrive/MyDrive/sd/stable-diffusion-webui/ldm/modules/attention.py", line 71, in forward
    return self.net(x)
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/container.py", line 139, in forward
    input = module(input)
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/linear.py", line 114, in forward
    return F.linear(input, self.weight, self.bias)
RuntimeError: CUDA error: CUBLAS_STATUS_INTERNAL_ERROR when calling cublasGemmEx( handle, opa, opb, m, n, k, &falpha, a, CUDA_R_16F, lda, b, CUDA_R_16F, ldb, &fbeta, c, CUDA_R_16F, ldc, CUDA_R_32F, CUBLAS_GEMM_DFALT_TENSOR_OP)

Interrupted with signal 2 in <frame at 0x7f49da602e50, file '/content/gdrive/MyDrive/sd/stable-diffusion-webui/webui.py', line 110, code wait_on_server>
Traceback (most recent call last):
  File "/content/gdrive/MyDrive/sd/stable-diffusion-webui/webui.py", line 171, in <module>
    webui()
  File "/content/gdrive/MyDrive/sd/stable-diffusion-webui/webui.py", line 152, in webui
    wait_on_server(demo)
  File "/content/gdrive/MyDrive/sd/stable-diffusion-webui/webui.py", line 110, in wait_on_server
    time.sleep(0.5)
  File "/content/gdrive/MyDrive/sd/stable-diffusion-webui/webui.py", line 97, in sigint_handler
    print(f'Interrupted with signal {sig} in {frame}')
  File "/content/gdrive/MyDrive/sd/stable-diffusion-webui/webui.py", line 97, in sigint_handler
    print(f'Interrupted with signal {sig} in {frame}')
RuntimeError: reentrant call inside <_io.BufferedWriter name=''>
Total progress: 3% 1/36 [00:11<06:26, 11.03s/it]

TheLastBen commented 2 years ago

Dreambooth colab ?

hundkillen commented 2 years ago

nope fast_stable_diffusion_AUTOMATIC1111.ipynb

TheLastBen commented 2 years ago

Is your GPU an A100? Run !nvidia-smi to check.

hundkillen commented 2 years ago

i have Colab Pro+ with GPU class premium

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.32.03    Driver Version: 460.32.03    CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  A100-SXM4-40GB      Off  | 00000000:00:04.0 Off |                    0 |
| N/A   33C    P0    43W / 400W |      0MiB / 40536MiB |      0%      Default |
|                               |                      |             Disabled |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

TheLastBen commented 2 years ago

Yep, it's the A100. I'll fix it soon; try getting another Colab GPU in the meantime.
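
For anyone who wants to run the same check from a notebook cell without reading the full nvidia-smi table, here is a minimal sketch. The helper names `gpu_name` and `is_a100` are hypothetical (not part of the notebook), and matching on the substring "A100" is an assumption based on the name shown in the output above.

```python
import shutil
import subprocess
from typing import Optional


def gpu_name() -> Optional[str]:
    """Return the first GPU name reported by nvidia-smi, or None if unavailable."""
    if shutil.which("nvidia-smi") is None:
        return None  # nvidia-smi is not installed on this runtime
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name", "--format=csv,noheader"],
        capture_output=True,
        text=True,
        check=False,
    )
    lines = result.stdout.strip().splitlines()
    if result.returncode != 0 or not lines:
        return None  # the tool is present but reported no usable GPU
    return lines[0].strip()


def is_a100(name: Optional[str]) -> bool:
    """Assumption: any reported name containing 'A100' counts as an A100."""
    return name is not None and "A100" in name.upper()


name = gpu_name()
if name is None:
    print("No NVIDIA GPU detected.")
elif is_a100(name):
    print(f"{name}: A100 detected, consider switching to another GPU for now.")
else:
    print(f"{name}: not an A100.")
```

On a runtime without a GPU this only prints the "No NVIDIA GPU detected." message, so it is safe to run before starting the web UI.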

hundkillen commented 2 years ago

OK, so at the moment I'm better off not using premium?

TheLastBen commented 2 years ago

Yes, the T4 should be enough

hundkillen commented 2 years ago

cool thanks

eliohead commented 1 year ago

I got this error again today.