Closed imD-5 closed 9 months ago
Can you run `python3 -m torch.utils.collect_env`?
I can't reproduce it.
The environment settings I got are as follows:
Collecting environment information...
PyTorch version: 2.1.0+cu121
Is debug build: False
CUDA used to build PyTorch: 12.1
ROCM used to build PyTorch: N/A

OS: Ubuntu 20.04.6 LTS (x86_64)
GCC version: (Ubuntu 9.4.0-1ubuntu1~20.04.2) 9.4.0
Clang version: Could not collect
CMake version: version 3.16.3
Libc version: glibc-2.31

Python version: 3.10.9 | packaged by conda-forge | (main, Feb 2 2023, 20:20:04) [GCC 11.3.0] (64-bit runtime)
Python platform: Linux-5.15.0-1049-aws-x86_64-with-glibc2.31
Is CUDA available: True
CUDA runtime version: 12.1.105
CUDA_MODULE_LOADING set to: LAZY
GPU models and configuration: GPU 0: NVIDIA A10G
Nvidia driver version: 535.104.12
cuDNN version: Could not collect
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True
CPU:
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
Address sizes: 48 bits physical, 48 bits virtual
CPU(s): 8
On-line CPU(s) list: 0-7
Thread(s) per core: 2
Core(s) per socket: 4
Socket(s): 1
NUMA node(s): 1
Vendor ID: AuthenticAMD
CPU family: 23
Model: 49
Model name: AMD EPYC 7R32
Stepping: 0
CPU MHz: 2799.998
BogoMIPS: 5599.99
Hypervisor vendor: KVM
Virtualization type: full
L1d cache: 128 KiB
L1i cache: 128 KiB
L2 cache: 2 MiB
L3 cache: 16 MiB
NUMA node0 CPU(s): 0-7
Vulnerability Gather data sampling: Not affected
Vulnerability Itlb multihit: Not affected
Vulnerability L1tf: Not affected
Vulnerability Mds: Not affected
Vulnerability Meltdown: Not affected
Vulnerability Mmio stale data: Not affected
Vulnerability Retbleed: Mitigation; untrained return thunk; SMT enabled with STIBP protection
Vulnerability Spec rstack overflow: Mitigation; safe RET
Vulnerability Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl and seccomp
Vulnerability Spectre v1: Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Vulnerability Spectre v2: Mitigation; Retpolines, IBPB conditional, STIBP always-on, RSB filling, PBRSB-eIBRS Not affected
Vulnerability Srbds: Not affected
Vulnerability Tsx async abort: Not affected
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid aperfmperf tsc_known_freq pni pclmulqdq ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm cmp_legacy cr8_legacy abm sse4a misalignsse 3dnowprefetch topoext ssbd ibrs ibpb stibp vmmcall fsgsbase bmi1 avx2 smep bmi2 rdseed adx smap clflushopt clwb sha_ni xsaveopt xsavec xgetbv1 clzero xsaveerptr rdpru wbnoinvd arat npt nrip_save rdpid
Versions of relevant libraries:
[pip3] numpy==1.26.2
[pip3] stable-fast==0.0.11+torch210cu121
[pip3] torch==2.1.0+cu121
[pip3] torchaudio==2.1.0+cu121
[pip3] torchsde==0.2.6
[pip3] torchvision==0.16.0+cu121
[pip3] triton==2.1.0
[conda] numpy 1.26.2 pypi_0 pypi
[conda] stable-fast 0.0.11+torch210cu121 pypi_0 pypi
[conda] torch 2.1.0+cu121 pypi_0 pypi
[conda] torchaudio 2.1.0+cu121 pypi_0 pypi
[conda] torchsde 0.2.6 pypi_0 pypi
[conda] torchvision 0.16.0+cu121 pypi_0 pypi
[conda] triton 2.1.0 pypi_0 pypi
Looks like your environment is OK. Can you install the latest stable-fast and retry? Also can you share the model you use?
Thanks for taking the time to look at this! I downloaded and installed the latest version ("stable_fast-0.0.12.post3+torch210cu121-cp310-cp310-manylinux2014_x86_64.whl") and I still get the same error. The models I have tried are the two below. Do fp16 models not work? Given the nature of your optimization, I don't think the data type makes a difference, though.
https://huggingface.co/gsdf/Counterfeit-V3.0/blob/main/Counterfeit-V3.0_fix_fp16.safetensors
https://huggingface.co/Lykon/AnyLoRA/blob/main/AnyLoRA_noVae_fp16.safetensors
To give more context, the problem might originate from using the node classes directly in Python code for execution. I used the extension https://github.com/pydn/ComfyUI-to-Python-Extension.git as the baseline for building my code. From an optimization point of view, it reduces the additional overhead caused by setting up the internal server, but still preserves the customizability and ease of use that ComfyUI provides. For example, the usage of stable-fast in my implementation looks like this. Are there any other imports or arguments I have to add to make this work?
applystablefastunet = NODE_CLASS_MAPPINGS["ApplyStableFastUnet"]()
applystablefastunet_80 = applystablefastunet.apply_stable_fast(
    enable_cuda_graph=True,
    model=get_value_at_index(loraloader_58, 0),
)
I can't figure it out😥
Help is needed.
OK, after a lot of trying I solved the problem: it was the checkpoint loader I was using. stable-fast does not work with the "checkpointLoader" node; it only works with the "CheckpointLoaderSimple" node.
ComfyUI is really complex. The best experience is with pure Hugging Face diffusers.
You can use the checkpointLoader node with a configuration file whose property `model.params.unet_config.params.use_checkpoint` is set to False.
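As a rough illustration of that override, here is a minimal sketch that assumes the configuration file has already been parsed into nested dicts (for example by a YAML loader, not shown). The helper `set_dotted` and the trimmed `example_config` are hypothetical names made up for this example; only the dotted property path comes from the comment above.

```python
import copy


def set_dotted(config, dotted_key, value):
    """Return a deep copy of `config` with the nested `dotted_key` set to `value`."""
    updated = copy.deepcopy(config)
    node = updated
    *parents, leaf = dotted_key.split(".")
    for key in parents:
        # Create intermediate dicts if the path does not exist yet.
        node = node.setdefault(key, {})
    node[leaf] = value
    return updated


# Hypothetical, heavily trimmed stand-in for a parsed model configuration.
example_config = {
    "model": {"params": {"unet_config": {"params": {"use_checkpoint": True}}}}
}

patched = set_dotted(
    example_config,
    "model.params.unet_config.params.use_checkpoint",
    False,
)
print(patched["model"]["params"]["unet_config"]["params"]["use_checkpoint"])
```

The patched dict would then need to be written back to the configuration file that the loader node reads; that step depends on the file format and is left out here.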
@imD-5 @gameltb The checkpoint feature in PyTorch could be incompatible with many optimization solutions.
Yeah, I thought that as well, so I opted to use the diffusers format for my project. I eventually got it down to 3 sec/gen for 1024x1024 images and less than 1 sec for 514! I appreciate your work very much, thanks!
Hi, I was trying this out for maximum optimization on an AWS G5 instance running Ubuntu (it's just an NVIDIA A10G). I was using ComfyUI by calling the nodes directly in Python code, and I kept getting this error message that I couldn't solve. How can I resolve this? All the dependencies ('diffusers>=0.19.3', 'xformers>=0.0.20', 'triton>=2.1.0', 'torch>=1.12.0') were met, and it works on my desktop but not on Ubuntu.
/opt/conda/lib/python3.10/site-packages/sfast/utils/flat_tensors.py:157: TracerWarning: Converting a tensor to a Python number might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
obj_type = tensors[start].item()
/opt/conda/lib/python3.10/site-packages/sfast/utils/flat_tensors.py:216: TracerWarning: Converting a tensor to a Python number might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
size = tensors[start].item()
/opt/conda/lib/python3.10/site-packages/sfast/utils/flat_tensors.py:226: TracerWarning: Converting a tensor to a Python number might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
size = tensors[start].item()
/opt/conda/lib/python3.10/site-packages/sfast/utils/flat_tensors.py:212: TracerWarning: Converting a tensor to a Python list might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
return bytes(tensors[start].tolist()), start + 1
/opt/conda/lib/python3.10/site-packages/sfast/utils/flat_tensors.py:203: TracerWarning: Converting a tensor to a Python number might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
return int(tensors[start].item()), start + 1
0%| | 0/12 [00:01<?, ?it/s]
Traceback (most recent call last):
File "/home/ubuntu/test/workflow_clip_sdxl2.py", line 320, in <module>
main()
File "/home/ubuntu/test/workflow_clip_sdxl2.py", line 229, in main
ksampler_3 = ksampler.sample(
File "/home/ubuntu/ComfyUI/nodes.py", line 1286, in sample
return common_ksampler(model, seed, steps, cfg, sampler_name, scheduler, positive, negative, latent_image, denoise=denoise)
File "/home/ubuntu/ComfyUI/nodes.py", line 1256, in common_ksampler
samples = comfy.sample.sample(model, noise, steps, cfg, sampler_name, scheduler, positive, negative, latent_image,
File "/home/ubuntu/ComfyUI/custom_nodes/ComfyUI-Impact-Pack/modules/impact/sample_error_enhancer.py", line 22, in informative_sample
raise e
File "/home/ubuntu/ComfyUI/custom_nodes/ComfyUI-Impact-Pack/modules/impact/sample_error_enhancer.py", line 9, in informative_sample
return original_sample(*args, **kwargs)
File "/home/ubuntu/ComfyUI/comfy/sample.py", line 100, in sample
samples = sampler.sample(noise, positive_copy, negative_copy, cfg=cfg, latent_image=latent_image, start_step=start_step, last_step=last_step, force_full_denoise=force_full_denoise, denoise_mask=noise_mask, sigmas=sigmas, callback=callback, disable_pbar=disable_pbar, seed=seed)
File "/home/ubuntu/ComfyUI/comfy/samplers.py", line 711, in sample
return sample(self.model, noise, positive, negative, cfg, self.device, sampler, sigmas, self.model_options, latent_image=latent_image, denoise_mask=denoise_mask, callback=callback, disable_pbar=disable_pbar, seed=seed)
File "/home/ubuntu/ComfyUI/comfy/samplers.py", line 617, in sample
samples = sampler.sample(model_wrap, sigmas, extra_args, callback, noise, latent_image, denoise_mask, disable_pbar)
File "/home/ubuntu/ComfyUI/comfy/samplers.py", line 556, in sample
samples = self.sampler_function(model_k, noise, sigmas, extra_args=extra_args, callback=k_callback, disable=disable_pbar, **self.extra_options)
File "/opt/conda/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "/home/ubuntu/ComfyUI/comfy/k_diffusion/sampling.py", line 137, in sample_euler
denoised = model(x, sigma_hat * s_in, **extra_args)
File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
File "/home/ubuntu/ComfyUI/comfy/samplers.py", line 277, in forward
out = self.inner_model(x, sigma, cond=cond, uncond=uncond, cond_scale=cond_scale, model_options=model_options, seed=seed)
File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
File "/home/ubuntu/ComfyUI/comfy/samplers.py", line 267, in forward
return self.apply_model(*args, **kwargs)
File "/home/ubuntu/ComfyUI/comfy/samplers.py", line 264, in apply_model
out = sampling_function(self.inner_model, x, timestep, uncond, cond, cond_scale, model_options=model_options, seed=seed)
File "/home/ubuntu/ComfyUI/comfy/samplers.py", line 252, in sampling_function
cond, uncond = calc_cond_uncond_batch(model, cond, uncond, x, timestep, model_options)
File "/home/ubuntu/ComfyUI/comfy/samplers.py", line 228, in calc_cond_uncond_batch
output = model_options['model_function_wrapper'](model.apply_model, {"input": inputx, "timestep": timestep, "c": c, "cond_or_uncond": cond_or_uncond}).chunk(batch_chunks)
File "/home/ubuntu/ComfyUI/custom_nodes/ComfyUI_stable_fast/node.py", line 69, in __call__
return self.stable_fast_model.get_traced_module(inputx, timestep, c)[0](
File "/home/ubuntu/ComfyUI/custom_nodes/ComfyUI_stable_fast/module/stable_diffusion_pipeline_compiler.py", line 62, in get_traced_module
traced_m, call_helper = trace_with_kwargs(
File "/opt/conda/lib/python3.10/site-packages/sfast/jit/trace_helper.py", line 23, in trace_with_kwargs
traced_module = better_trace(TraceablePosArgOnlyModuleWrapper(func),
File "/opt/conda/lib/python3.10/site-packages/sfast/jit/utils.py", line 29, in better_trace
script_module = torch.jit.trace(func, *args, **kwargs)
File "/opt/conda/lib/python3.10/site-packages/torch/jit/_trace.py", line 798, in trace
return trace_module(
File "/opt/conda/lib/python3.10/site-packages/torch/jit/_trace.py", line 1065, in trace_module
module._c._create_method_from_trace(
File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1508, in _slow_forward
result = self.forward(*input, **kwargs)
File "/opt/conda/lib/python3.10/site-packages/sfast/jit/trace_helper.py", line 127, in forward
outputs = self.module(*orig_args, **orig_kwargs)
File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1508, in _slow_forward
result = self.forward(*input, **kwargs)
File "/opt/conda/lib/python3.10/site-packages/sfast/jit/trace_helper.py", line 77, in forward
return self.func(*args, **kwargs)
File "/home/ubuntu/ComfyUI/comfy/model_base.py", line 68, in apply_model
model_output = self.diffusion_model(xc, t, context=context, control=control, transformer_options=transformer_options, **extra_conds).float()
File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1508, in _slow_forward
result = self.forward(*input, **kwargs)
File "/home/ubuntu/ComfyUI/comfy/ldm/modules/diffusionmodules/openaimodel.py", line 619, in forward
h = forward_timestep_embed(module, h, emb, context, transformer_options)
File "/home/ubuntu/ComfyUI/comfy/ldm/modules/diffusionmodules/openaimodel.py", line 35, in forward_timestep_embed
x = layer(x, emb)
File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1508, in _slow_forward
result = self.forward(*input, **kwargs)
File "/home/ubuntu/ComfyUI/comfy/ldm/modules/diffusionmodules/openaimodel.py", line 210, in forward
return checkpoint(
File "/home/ubuntu/ComfyUI/comfy/ldm/modules/diffusionmodules/util.py", line 121, in checkpoint
return CheckpointFunction.apply(func, len(inputs), *args)
File "/opt/conda/lib/python3.10/site-packages/torch/autograd/function.py", line 539, in apply
return super().apply(*args, **kwargs)  # type: ignore[misc]
RuntimeError: _Map_base::at
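
The dependency minimums quoted in this report can be checked against the active environment with a short script. The following is only a sketch: it uses the standard-library `importlib.metadata`, compares just the leading numeric release (so local tags such as `+cu121` and suffixes such as `.post3` are ignored, and pre-releases are not handled), and the helper names `release_tuple`, `meets_minimum` and `check_requirements` are made up for this example.

```python
import re
from importlib import metadata

# Minimum versions as quoted in the report above.
REQUIREMENTS = {
    "diffusers": "0.19.3",
    "xformers": "0.0.20",
    "triton": "2.1.0",
    "torch": "1.12.0",
}


def release_tuple(version):
    """Return the leading numeric release of a version string as a tuple of ints."""
    public = version.split("+", 1)[0]  # drop local tags such as "+cu121"
    match = re.match(r"\d+(?:\.\d+)*", public)
    if match is None:
        raise ValueError(f"unrecognized version: {version!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def meets_minimum(installed, minimum):
    """Compare numeric releases only, padding the shorter one with zeros."""
    a, b = release_tuple(installed), release_tuple(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def check_requirements(requirements=REQUIREMENTS):
    """Map each package name to (installed version or None, requirement satisfied)."""
    report = {}
    for name, minimum in requirements.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = (None, False)
            continue
        report[name] = (installed, meets_minimum(installed, minimum))
    return report


if __name__ == "__main__":
    for name, (installed, ok) in check_requirements().items():
        status = "ok" if ok else "MISSING or too old"
        print(f"{name}: installed={installed} -> {status}")
```

A passing check only confirms the version constraints; as the rest of the thread shows, the failure here came from the checkpoint loader rather than from the installed packages.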