comfyanonymous / ComfyUI

The most powerful and modular diffusion model GUI, api and backend with a graph/nodes interface.
https://www.comfy.org/
GNU General Public License v3.0

[ERROR] RuntimeError: CUDA error: the launch timed out and was terminated (Driving me nuts) #3380

Open PurpleBlueAloeVera opened 4 months ago

PurpleBlueAloeVera commented 4 months ago

Ok, so here's the new bug that I'm getting and that I can't for the life of me fix. I reinstalled CUDA, reinstalled ComfyUI, and updated my drivers (which are always up to date). I just do NOT understand this error. It's driving me crazy. Can somebody help, please?


```
!!! Exception during processing!!! CUDA error: the launch timed out and was terminated
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.

Traceback (most recent call last):
  File "F:\AI_repos\ComfyUI\execution.py", line 151, in recursive_execute
    output_data, output_ui = get_output_data(obj, input_data_all)
  File "F:\AI_repos\ComfyUI\execution.py", line 81, in get_output_data
    return_values = map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True)
  File "F:\AI_repos\ComfyUI\custom_nodes\ComfyUI_ezXY\autoCastPatch.py", line 299, in map_node_over_list
    return _map_node_over_list(obj, input_data_all, func, allow_interrupt)
  File "F:\AI_repos\ComfyUI\execution.py", line 74, in map_node_over_list
    results.append(getattr(obj, func)(**slice_dict(input_data_all, i)))
  File "F:\AI_repos\ComfyUI\nodes.py", line 1344, in sample
    return common_ksampler(model, seed, steps, cfg, sampler_name, scheduler, positive, negative, latent_image, denoise=denoise)
  File "F:\AI_repos\ComfyUI\nodes.py", line 1314, in common_ksampler
    samples = comfy.sample.sample(model, noise, steps, cfg, sampler_name, scheduler, positive, negative, latent_image,
  File "F:\AI_repos\ComfyUI\custom_nodes\ComfyUI-Impact-Pack\modules\impact\sample_error_enhancer.py", line 22, in informative_sample
    raise e
  File "F:\AI_repos\ComfyUI\custom_nodes\ComfyUI-Impact-Pack\modules\impact\sample_error_enhancer.py", line 9, in informative_sample
    return original_sample(*args, **kwargs)  # This code helps interpret error messages that occur within exceptions but does not have any impact on other operations.
  File "F:\AI_repos\ComfyUI\custom_nodes\ComfyUI-AnimateDiff-Evolved\animatediff\sampling.py", line 279, in motion_sample
    return orig_comfy_sample(model, noise, *args, **kwargs)
  File "F:\AI_repos\ComfyUI\custom_nodes\ComfyUI-Advanced-ControlNet\adv_control\control_reference.py", line 47, in refcn_sample
    return orig_comfy_sample(model, *args, **kwargs)
  File "F:\AI_repos\ComfyUI\comfy\sample.py", line 37, in sample
    samples = sampler.sample(noise, positive, negative, cfg=cfg, latent_image=latent_image, start_step=start_step, last_step=last_step, force_full_denoise=force_full_denoise, denoise_mask=noise_mask, sigmas=sigmas, callback=callback, disable_pbar=disable_pbar, seed=seed)
  File "F:\AI_repos\ComfyUI\comfy\samplers.py", line 755, in sample
    return sample(self.model, noise, positive, negative, cfg, self.device, sampler, sigmas, self.model_options, latent_image=latent_image, denoise_mask=denoise_mask, callback=callback, disable_pbar=disable_pbar, seed=seed)
  File "F:\AI_repos\ComfyUI\comfy\samplers.py", line 657, in sample
    return cfg_guider.sample(noise, latent_image, sampler, sigmas, denoise_mask, callback, disable_pbar, seed)
  File "F:\AI_repos\ComfyUI\comfy\samplers.py", line 644, in sample
    output = self.inner_sample(noise, latent_image, device, sampler, sigmas, denoise_mask, callback, disable_pbar, seed)
  File "F:\AI_repos\ComfyUI\comfy\samplers.py", line 623, in inner_sample
    samples = sampler.sample(self, sigmas, extra_args, callback, noise, latent_image, denoise_mask, disable_pbar)
  File "F:\AI_repos\ComfyUI\comfy\samplers.py", line 534, in sample
    samples = self.sampler_function(model_k, noise, sigmas, extra_args=extra_args, callback=k_callback, disable=disable_pbar, **self.extra_options)
  File "F:\AI_repos\stable-diffusion-webui-forge\venv\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "F:\AI_repos\ComfyUI\comfy\k_diffusion\sampling.py", line 707, in sample_dpmpp_sde_gpu
    return sample_dpmpp_sde(model, x, sigmas, extra_args=extra_args, callback=callback, disable=disable, eta=eta, s_noise=s_noise, noise_sampler=noise_sampler, r=r)
  File "F:\AI_repos\stable-diffusion-webui-forge\venv\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "F:\AI_repos\ComfyUI\comfy\k_diffusion\sampling.py", line 559, in sample_dpmpp_sde
    denoised_2 = model(x_2, sigma_fn(s) * s_in, **extra_args)
  File "F:\AI_repos\ComfyUI\comfy\samplers.py", line 272, in __call__
    out = self.inner_model(x, sigma, model_options=model_options, seed=seed)
  File "F:\AI_repos\ComfyUI\comfy\samplers.py", line 610, in __call__
    return self.predict_noise(*args, **kwargs)
  File "F:\AI_repos\ComfyUI\comfy\samplers.py", line 613, in predict_noise
    return sampling_function(self.inner_model, x, timestep, self.conds.get("negative", None), self.conds.get("positive", None), self.cfg, model_options=model_options, seed=seed)
  File "F:\AI_repos\ComfyUI\comfy\samplers.py", line 258, in sampling_function
    out = calc_cond_batch(model, conds, x, timestep, model_options)
  File "F:\AI_repos\ComfyUI\custom_nodes\ComfyUI-TiledDiffusion\.patches.py", line 4, in calc_cond_batch
    return calc_cond_batch_original_tiled_diffusion_d5cd7809(model, conds, x_in, timestep, model_options)
  File "F:\AI_repos\ComfyUI\comfy\samplers.py", line 218, in calc_cond_batch
    output = model.apply_model(input_x, timestep_, **c).chunk(batch_chunks)
  File "F:\AI_repos\ComfyUI\comfy\model_base.py", line 97, in apply_model
    model_output = self.diffusion_model(xc, t, context=context, control=control, transformer_options=transformer_options, **extra_conds).float()
  File "F:\AI_repos\stable-diffusion-webui-forge\venv\lib\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "F:\AI_repos\stable-diffusion-webui-forge\venv\lib\site-packages\torch\nn\modules\module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "F:\AI_repos\ComfyUI\comfy\ldm\modules\diffusionmodules\openaimodel.py", line 885, in forward
    h = forward_timestep_embed(module, h, emb, context, transformer_options, output_shape, time_context=time_context, num_video_frames=num_video_frames, image_only_indicator=image_only_indicator)
  File "F:\AI_repos\ComfyUI\comfy\ldm\modules\diffusionmodules\openaimodel.py", line 44, in forward_timestep_embed
    x = layer(x, context, transformer_options)
  File "F:\AI_repos\stable-diffusion-webui-forge\venv\lib\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "F:\AI_repos\stable-diffusion-webui-forge\venv\lib\site-packages\torch\nn\modules\module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "F:\AI_repos\ComfyUI\comfy\ldm\modules\attention.py", line 633, in forward
    x = block(x, context=context[i], transformer_options=transformer_options)
  File "F:\AI_repos\stable-diffusion-webui-forge\venv\lib\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "F:\AI_repos\stable-diffusion-webui-forge\venv\lib\site-packages\torch\nn\modules\module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "F:\AI_repos\ComfyUI\comfy\ldm\modules\attention.py", line 460, in forward
    return checkpoint(self._forward, (x, context, transformer_options), self.parameters(), self.checkpoint)
  File "F:\AI_repos\ComfyUI\comfy\ldm\modules\diffusionmodules\util.py", line 191, in checkpoint
    return func(*inputs)
  File "F:\AI_repos\ComfyUI\comfy\ldm\modules\attention.py", line 557, in _forward
    n = attn2_replace_patch[block_attn2](n, context_attn2, value_attn2, extra_options)
  File "F:\AI_repos\ComfyUI\custom_nodes\ComfyUI_IPAdapter_plus\CrossAttentionPatch.py", line 43, in __call__
    sigma = extra_options["sigmas"].detach().cpu()[0].item() if 'sigmas' in extra_options else 999999999.9
RuntimeError: CUDA error: the launch timed out and was terminated
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.

Prompt executed in 51.20 seconds
Exception in thread Thread-22 (prompt_worker):
Traceback (most recent call last):
  File "D:\python\lib\threading.py", line 1016, in _bootstrap_inner
    self.run()
  File "D:\python\lib\threading.py", line 953, in run
    self._target(*self._args, **self._kwargs)
  File "F:\AI_repos\ComfyUI\main.py", line 144, in prompt_worker
    comfy.model_management.soft_empty_cache()
  File "F:\AI_repos\ComfyUI\comfy\model_management.py", line 840, in soft_empty_cache
    torch.cuda.empty_cache()
  File "F:\AI_repos\stable-diffusion-webui-forge\venv\lib\site-packages\torch\cuda\memory.py", line 159, in empty_cache
    torch._C._cuda_emptyCache()
RuntimeError: CUDA error: the launch timed out and was terminated
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
```

If anyone wants to comment on my venv: yes, I initially used a fresh ComfyUI venv and had the same issue. Here I tried linking an existing venv instead, but that didn't help either.
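For what it's worth, the line the traceback ends on (the `.cpu()` call in `CrossAttentionPatch.py`) is probably not where the failing kernel actually ran: CUDA kernel launches are asynchronous, so the error often only surfaces at the next synchronizing call, exactly as the "might be asynchronously reported at some other API call" message says. A minimal sketch of that reporting behaviour (not from this workflow; it deliberately triggers a different CUDA error, an out-of-bounds index, but it surfaces the same way):

```python
import torch

# CUDA kernels run asynchronously: this out-of-bounds gather is queued
# without any error being raised on the Python side...
x = torch.zeros(4, device="cuda")
bad_idx = torch.tensor([10], device="cuda")  # deliberately out of range
y = x[bad_idx]          # kernel launch, nothing reported yet

# ...and the device-side error only shows up here, at the first call that
# synchronizes with the GPU, so the traceback points at this line instead
# of the real culprit.
print(y.cpu())
```

Running with CUDA_LAUNCH_BLOCKING=1 forces every launch to be checked immediately, which is why the error message suggests it: the next traceback should then point much closer to the kernel that actually failed.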
Declan-Bracken commented 4 months ago

I'm having the same issue when trying to run inference on a quantized Mistral model. Drivers and PyTorch version are all up to date.

ghost commented 4 months ago

I have the same issue too. It drives me nuts every single time I get this error.

Here's my setup, running ComfyUI from Stability Matrix:

All I have installed in Stability Matrix for ComfyUI is the Impact & Inspire Pack, along with Prompt Control.

I also downloaded the EasyFluff V11.2 safetensors and YAML files, as well as sampler_rescalecfg.py.

In the end, the console tells me to set CUDA_LAUNCH_BLOCKING=1 before I can make AI art. But where do I set CUDA_LAUNCH_BLOCKING=1 for ComfyUI?
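CUDA_LAUNCH_BLOCKING isn't a ComfyUI setting; it's an environment variable that the CUDA runtime reads when it initializes. You can set it in the same terminal or batch file you launch ComfyUI from (on Windows, `set CUDA_LAUNCH_BLOCKING=1` on the line before the `python main.py` call), or from Python before anything touches the GPU. A minimal sketch of the Python route, assuming ComfyUI is started with `python main.py` from its folder (the wrapper script itself is made up for illustration):

```python
# debug_launch.py -- hypothetical wrapper, placed next to ComfyUI's main.py.
import os

# Must be set before torch initializes CUDA, otherwise it has no effect.
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"

# Then start ComfyUI as if it had been run directly.
import runpy
runpy.run_path("main.py", run_name="__main__")
```

Note that the variable only makes kernel launches synchronous (and slower), so it won't fix the timeout itself; it just makes the next traceback point at the kernel that actually failed instead of a later `.cpu()` call.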