kijai / ComfyUI-SUPIR

SUPIR upscaling wrapper for ComfyUI

Crash when using SUPIR Model Loader v2 #87

Closed ArkhamInsanity closed 6 months ago

ArkhamInsanity commented 6 months ago

Hello,

System: Ubuntu 22.04.4 LTS, AMD 6800 XT (16 GB VRAM), 32 GB RAM, Python 3.10.12

Problem: When I load my SUPIR model and my SDXL model, ComfyUI crashes at the SDXL loading step. I'm using the Q model with the SDXL base model or JuggernautXL and the most basic workflow (no upscale, just the SUPIR node for the first stage, plus the sampler) on 512*512 images, with nothing running in the background. I have no problem with the legacy node, so I'm sticking with it for now. What could be the issue?

Thanks for the hard work, SUPIR is amazing :)

kijai commented 6 months ago

This has not happened to me yet, so I'm gonna need the error log to figure this one out.
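For anyone hitting the same hard crash with nothing shown in the UI: capturing ComfyUI's stdout/stderr to a file usually preserves the last lines before the process dies (the invocation below is illustrative; adjust the path to your install):

```shell
# Run ComfyUI while mirroring everything it prints into a log file.
# main.py is ComfyUI's standard entry point; the path is illustrative.
cd ~/ComfyUI
python main.py 2>&1 | tee comfyui.log

# If the process was killed by the Linux OOM killer (out of system RAM),
# the kernel log usually records it:
dmesg | grep -i "out of memory"
```

A silent exit with an OOM-killer entry in `dmesg` points at system RAM rather than VRAM.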

ArkhamInsanity commented 6 months ago

```
[2024-03-24 20:48] got prompt
[2024-03-24 20:48] [rgthree] Using rgthree's optimized recursive execution.
[2024-03-24 20:48] model_type EPS
[2024-03-24 20:48] Using split attention in VAE
[2024-03-24 20:48] Using split attention in VAE
[2024-03-24 20:49] clip missing: ['clip_l.logit_scale', 'clip_l.transformer.text_projection.weight']
[2024-03-24 20:49] loaded straight to GPU
[2024-03-24 20:49] Requested to load SDXL
[2024-03-24 20:49] Loading 1 new model
[2024-03-24 20:49] Diffusion using fp16
[2024-03-24 20:49] making attention of type 'vanilla' with 512 in_channels
[2024-03-24 20:49] Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
[2024-03-24 20:49] making attention of type 'vanilla' with 512 in_channels
[2024-03-24 20:49] Attempting to load SUPIR model: [/media/blayo/74b2f6fa-be5c-452e-8cee-4e9d28a57cd9/yohan/ComfyUI/models/checkpoints/Supir/SUPIR-v0Q.ckpt]
[2024-03-24 20:49] Loaded state_dict from [/media/blayo/74b2f6fa-be5c-452e-8cee-4e9d28a57cd9/yohan/ComfyUI/models/checkpoints/Supir/SUPIR-v0Q.ckpt]
[2024-03-24 20:49] Attempting to load SDXL model from node inputs
[2024-03-24 20:49] Requested to load SDXLClipModel
[2024-03-24 20:49] Loading 1 new model
[2024-03-24 20:49] Requested to load SDXL
[2024-03-24 20:49] Loading 1 new model
```

kijai commented 6 months ago

So no error, just a crash? That would suggest running out of system RAM. I updated the node a bit today for smarter loading, which could help.
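One way to check whether system RAM is the bottleneck before loading a multi-GB checkpoint is to look at `MemAvailable` in `/proc/meminfo` on Linux. A minimal sketch (a hypothetical helper, not part of ComfyUI-SUPIR; the 12 GB threshold is illustrative):

```python
# Hypothetical helper, not part of ComfyUI-SUPIR: estimate whether enough
# free system RAM remains before loading a large checkpoint on Linux.

def parse_available_mem_gb(meminfo_text: str) -> float:
    """Parse the MemAvailable field (reported in kB) out of /proc/meminfo."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemAvailable:"):
            kb = int(line.split()[1])
            return kb / (1024 ** 2)  # kB -> GB
    raise ValueError("MemAvailable not found")

def has_headroom(meminfo_text: str, needed_gb: float) -> bool:
    """True if at least `needed_gb` of system RAM is currently available."""
    return parse_available_mem_gb(meminfo_text) >= needed_gb
```

Usage would be `has_headroom(open("/proc/meminfo").read(), 12.0)` before kicking off the load; SUPIR-v0Q plus an SDXL checkpoint can easily need over 10 GB of system RAM during loading.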

ArkhamInsanity commented 6 months ago

Exactly. I tried a bit with a Lightning model and it seemed to work with the new node; if I keep the legacy node, no crash. I will update the node tomorrow and tell you whether it still crashes.

kijai commented 6 months ago

> I too crash on the SUPIR model loader, with both the legacy and the v2 loader. Fresh install of Comfy (I believe the logs say its release is 24 Mar 24, which is today. Might that be an issue?)
>
> ```
> ComfyUI startup time: 2024-03-25 00:33:00.354442
> Platform: Linux
> Python version: 3.10.13 | packaged by conda-forge | (main, Dec 23 2023, 15:36:39) [GCC 12.3.0]
> Python executable: /opt/conda/bin/python
> ComfyUI Revision: 2082 [c6de09b0] | Released on '2024-03-24'
> ```
>
> comfyui.log
>
> Error occurred when executing SUPIR_model_loader_v2:
>
> Expected a cuda device, but got: cpu
>
> ```
> File "/kaggle/working/ComfyUI/execution.py", line 151, in recursive_execute
>   output_data, output_ui = get_output_data(obj, input_data_all)
> File "/kaggle/working/ComfyUI/execution.py", line 81, in get_output_data
>   return_values = map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True)
> File "/kaggle/working/ComfyUI/execution.py", line 74, in map_node_over_list
>   results.append(getattr(obj, func)(slice_dict(input_data_all, i)))
> File "/kaggle/working/ComfyUI/custom_nodes/ComfyUI-SUPIR/nodes_v2.py", line 802, in process
>   self.model = instantiate_from_config(config.model).cpu()
> File "/kaggle/working/ComfyUI/custom_nodes/ComfyUI-SUPIR/sgm/util.py", line 175, in instantiate_from_config
>   return get_obj_from_str(config["target"])(config.get("params", dict()))
> File "/kaggle/working/ComfyUI/custom_nodes/ComfyUI-SUPIR/sgm/util.py", line 187, in get_obj_from_str
>   return getattr(importlib.import_module(module, package=package_directory_name), cls)
> File "/opt/conda/lib/python3.10/importlib/__init__.py", line 126, in import_module
>   return _bootstrap._gcd_import(name[level:], package, level)
> File "", line 1050, in _gcd_import
> File "", line 1027, in _find_and_load
> File "", line 1006, in _find_and_load_unlocked
> File "", line 688, in _load_unlocked
> File "", line 883, in exec_module
> File "", line 241, in _call_with_frames_removed
> File "/kaggle/working/ComfyUI/custom_nodes/ComfyUI-SUPIR/SUPIR/models/SUPIR_model_v2.py", line 9, in
>   from ...SUPIR.utils.tilevae import VAEHook
> File "/kaggle/working/ComfyUI/custom_nodes/ComfyUI-SUPIR/SUPIR/utils/tilevae.py", line 128, in
>   DEFAULT_ENCODER_TILE_SIZE = get_recommend_encoder_tile_size()
> File "/kaggle/working/ComfyUI/custom_nodes/ComfyUI-SUPIR/SUPIR/utils/tilevae.py", line 88, in get_recommend_encoder_tile_size
>   total_memory = torch.cuda.get_device_properties(
> File "/opt/conda/lib/python3.10/site-packages/torch/cuda/__init__.py", line 450, in get_device_properties
>   device = _get_device_index(device, optional=True)
> File "/opt/conda/lib/python3.10/site-packages/torch/cuda/_utils.py", line 35, in _get_device_index
>   raise ValueError(f"Expected a cuda device, but got: {device}")
> ```

This is a different error, and I have no idea what could cause it... what device are you using, exactly?
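For reference, the traceback in the quote fails because tilevae.py computes its default tile sizes from GPU memory at import time via `torch.cuda.get_device_properties`, which raises when no CUDA device is available. A hedged sketch of that idea with a CPU fallback (function names and thresholds are illustrative, not the repo's actual code):

```python
# Illustrative sketch only; names and thresholds are hypothetical and do
# not reproduce ComfyUI-SUPIR's actual tilevae.py logic.

def recommend_encoder_tile_size(total_memory_bytes: int) -> int:
    """Pick a VAE-encoder tile size from the device's total memory."""
    gb = total_memory_bytes / (1024 ** 3)
    if gb >= 16:
        return 3072
    if gb >= 8:
        return 2048
    return 1024

def safe_total_memory() -> int:
    """Query GPU memory, but fall back to a conservative default instead of
    raising like torch.cuda.get_device_properties('cpu') does."""
    try:
        import torch
        if torch.cuda.is_available():
            return torch.cuda.get_device_properties(0).total_memory
    except ImportError:
        pass
    return 4 * 1024 ** 3  # assume 4 GB when no CUDA device is present

DEFAULT_ENCODER_TILE_SIZE = recommend_encoder_tile_size(safe_total_memory())
```

Guarding the query with `torch.cuda.is_available()` is what lets the module import cleanly on CPU-only setups such as the Kaggle environment in the quoted log.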

ArkhamInsanity commented 6 months ago

> So no error, just a crash? That would suggest running out of system RAM. I updated the node a bit today for smarter loading, which could help.

Yep, did the update and tried 1024*1024 with the SDXL base model, running the denoiser then the sampler. It works now! I'll try other checkpoints, upscaling, etc. later, but I can use the v2 loader now. Thank you!