comfyanonymous / ComfyUI_bitsandbytes_NF4

GNU Affero General Public License v3.0

'ForgeParams4bit' object has no attribute 'module' #12

Open caustiq opened 1 month ago

caustiq commented 1 month ago
Error occurred when executing KSampler:

'ForgeParams4bit' object has no attribute 'module'

File "/home/user/Ai/ComfyUI/execution.py", line 152, in recursive_execute
output_data, output_ui = get_output_data(obj, input_data_all)
File "/home/user/Ai/ComfyUI/execution.py", line 82, in get_output_data
return_values = map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True)
File "/home/user/Ai/ComfyUI/execution.py", line 75, in map_node_over_list
results.append(getattr(obj, func)(**slice_dict(input_data_all, i)))
File "/home/user/Ai/ComfyUI/nodes.py", line 1382, in sample
return common_ksampler(model, seed, steps, cfg, sampler_name, scheduler, positive, negative, latent_image, denoise=denoise)
File "/home/user/Ai/ComfyUI/nodes.py", line 1352, in common_ksampler
samples = comfy.sample.sample(model, noise, steps, cfg, sampler_name, scheduler, positive, negative, latent_image,
File "/home/user/Ai/ComfyUI/comfy/sample.py", line 43, in sample
samples = sampler.sample(noise, positive, negative, cfg=cfg, latent_image=latent_image, start_step=start_step, last_step=last_step, force_full_denoise=force_full_denoise, denoise_mask=noise_mask, sigmas=sigmas, callback=callback, disable_pbar=disable_pbar, seed=seed)
File "/home/user/Ai/ComfyUI/comfy/samplers.py", line 829, in sample
return sample(self.model, noise, positive, negative, cfg, self.device, sampler, sigmas, self.model_options, latent_image=latent_image, denoise_mask=denoise_mask, callback=callback, disable_pbar=disable_pbar, seed=seed)
File "/home/user/Ai/ComfyUI/comfy/samplers.py", line 729, in sample
return cfg_guider.sample(noise, latent_image, sampler, sigmas, denoise_mask, callback, disable_pbar, seed)
File "/home/user/Ai/ComfyUI/comfy/samplers.py", line 706, in sample
self.inner_model, self.conds, self.loaded_models = comfy.sampler_helpers.prepare_sampling(self.model_patcher, noise.shape, self.conds)
File "/home/user/Ai/ComfyUI/comfy/sampler_helpers.py", line 66, in prepare_sampling
comfy.model_management.load_models_gpu([model] + models, memory_required=memory_required, minimum_memory_required=minimum_memory_required)
File "/home/user/Ai/ComfyUI/comfy/model_management.py", line 526, in load_models_gpu
cur_loaded_model = loaded_model.model_load(lowvram_model_memory, force_patch_weights=force_patch_weights)
File "/home/user/Ai/ComfyUI/comfy/model_management.py", line 323, in model_load
self.model.unpatch_model(self.model.offload_device)
File "/home/user/Ai/ComfyUI/comfy/model_patcher.py", line 614, in unpatch_model
self.model.to(device_to)
File "/home/user/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1340, in to
return self._apply(convert)
File "/home/user/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 900, in _apply
module._apply(fn)
File "/home/user/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 900, in _apply
module._apply(fn)
File "/home/user/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 927, in _apply
param_applied = fn(param)
File "/home/user/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1326, in convert
return t.to(
File "/home/user/Ai/ComfyUI/custom_nodes/ComfyUI_bitsandbytes_NF4/__init__.py", line 66, in to
module=self.module

I'm using the latest bitsandbytes 'multi-backend-refactor' branch, which supports AMD GPUs. PyTorch version: 2.5.0.dev20240811+rocm6.1, ROCm 6.2, GPU: AMD 6900XT.

fgdfgfthgr-fox commented 1 month ago

Same issue here. I am also using an AMD gpu (Radeon VII) with multi-backend-refactor bitsandbytes.

caustiq commented 1 month ago

@fgdfgfthgr-fox which version of pytorch and rocm are you using? I'm about to try rocm 6.1 instead of 6.2
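
For anyone answering the version question above, here is a minimal sketch that collects the relevant details in one go. It assumes `torch` and `bitsandbytes` are the installed distribution names and that, if PyTorch is importable, `torch.version.hip` and `torch.version.cuda` report the ROCm/CUDA toolkit the build targets. The function names are made up for illustration.

```python
import importlib.metadata
import importlib.util


def installed_version(package):
    """Return the installed version string of a package, or None if it is missing."""
    try:
        return importlib.metadata.version(package)
    except importlib.metadata.PackageNotFoundError:
        return None


def describe_environment():
    """Collect the version details that matter for this bug report."""
    report = {
        "torch": installed_version("torch"),
        "bitsandbytes": installed_version("bitsandbytes"),
        "hip": None,   # filled in only for ROCm builds of PyTorch
        "cuda": None,  # filled in only for CUDA builds of PyTorch
    }
    if importlib.util.find_spec("torch") is not None:
        try:
            import torch
        except ImportError:
            return report
        report["hip"] = getattr(torch.version, "hip", None)
        report["cuda"] = getattr(torch.version, "cuda", None)
    return report


if __name__ == "__main__":
    for name, value in describe_environment().items():
        print(f"{name}: {value}")
```

Run it with the same Python interpreter that ComfyUI uses, otherwise it may describe a different environment.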

CambridgeComputing commented 1 month ago

I ran into the same error and upgrading bitsandbytes fixed it for me: pip install bitsandbytes -U

caustiq commented 1 month ago

> I ran into the same error and upgrading bitsandbytes fixed it for me: pip install bitsandbytes -U

Nah, this issue is AMD-specific, so we need to build manually from a different branch (mentioned earlier). That command would just install the version without AMD support.

> I'm about to try rocm 6.1 instead of 6.2

I tried ROCm 6.1 with PyTorch 2.4.0 stable and ran into the same issue as in the OP.

CambridgeComputing commented 1 month ago

> > I ran into the same error and upgrading bitsandbytes fixed it for me: pip install bitsandbytes -U
>
> Nah, this issue is AMD-specific, so we need to build manually from a different branch (mentioned earlier). That command would just install the version without AMD support.
>
> > I'm about to try rocm 6.1 instead of 6.2
>
> I tried ROCm 6.1 with PyTorch 2.4.0 stable and ran into the same issue as in the OP.

Apologies, I didn't see the ROCm part of the original report. You are correct.

caustiq commented 1 month ago

I'm also getting #15 on startup and since it's related to 'types' and zope.interface I wonder if it's the root cause of this issue?

fgdfgfthgr-fox commented 1 month ago

> @fgdfgfthgr-fox which version of pytorch and rocm are you using? I'm about to try rocm 6.1 instead of 6.2

I honestly don't know anymore, because I just tried to upgrade ROCm and now my computer won't boot with the graphics card enabled, so I have to solve that first. I guess that's what you get for using ROCm. Edit: OK, I got it solved... Torch version is 2.5.0.dev20240811+rocm6.1, ROCm is 6.1.

fgdfgfthgr-fox commented 1 month ago

I suspect it's simply because the multi-backend-refactor branch of bitsandbytes isn't up to date yet. All we can do is wait, unless comfyanonymous switches to another implementation.
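
One way to test that suspicion is to check whether the installed `Params4bit` class knows about a `module` argument at all. The traceback shows the plugin passing `module=self.module`, so a build whose constructor has no such parameter would plausibly never set the attribute. The sketch below uses two stand-in classes (`OldParams` and `NewParams` are hypothetical, not real bitsandbytes classes). Pointing the helper at `bitsandbytes.nn.Params4bit` is an assumption about where the real class lives in your build.

```python
import inspect


def constructor_accepts(cls, param_name):
    """Return True if cls.__new__ or cls.__init__ declares a parameter named param_name."""
    for method_name in ("__new__", "__init__"):
        method = getattr(cls, method_name, None)
        if method is None:
            continue
        try:
            params = inspect.signature(method).parameters
        except (TypeError, ValueError):
            # Some built-in callables expose no introspectable signature.
            continue
        if param_name in params:
            return True
    return False


class OldParams:
    """Hypothetical stand-in for a build that predates the `module` argument."""

    def __new__(cls, data=None, quant_state=None):
        return super().__new__(cls)


class NewParams:
    """Hypothetical stand-in for a build that accepts and stores `module`."""

    def __new__(cls, data=None, quant_state=None, module=None):
        self = super().__new__(cls)
        self.module = module
        return self


print(constructor_accepts(OldParams, "module"))
print(constructor_accepts(NewParams, "module"))

# Against a real install (assumes bitsandbytes imports cleanly):
# import bitsandbytes as bnb
# print(constructor_accepts(bnb.nn.Params4bit, "module"))
```

If the check comes back false for the real class, that would be consistent with the branch being behind main, though it does not prove it.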

initialxy commented 1 month ago

I can confirm that if you merge main into multi-backend-refactor, resolve a couple of merge conflicts (which are pretty trivial), and then build and install from there, this error goes away and it works. I have an RX 7900 XTX and ROCm 6.1 on Arch Linux.

gabriel-filincowsky commented 1 month ago

Note: I am using an AMD CPU and an Nvidia GPU. Maybe the problem is not with the AMD GPU but with the AMD version of PyTorch or some other resource.

I already tried to:

Hardware:
CPU: AMD Ryzen 9 5900HX with Radeon Graphics - Arch: AMD64 - OS: Windows 10
GPU: NVIDIA GeForce RTX 3080 Laptop GPU
NVIDIA Driver: 560.81
Total VRAM 16384 MB, total RAM 32175 MB
pytorch version: 2.5.0.dev20240812+cu121
Set vram state to: NORMAL_VRAM
Device: cuda:0 NVIDIA GeForce RTX 3080 Laptop GPU : cudaMallocAsync

!!! Exception during processing!!! 'ForgeParams4bit' object has no attribute 'module'
Traceback (most recent call last):
  File "G:\_SD\Swarm\StableSwarmUI\dlbackend\comfy\ComfyUI\comfy\model_management.py", line 321, in model_load
    self.real_model = self.model.patch_model(device_to=patch_model_to, patch_weights=load_weights)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "G:\_SD\Swarm\StableSwarmUI\dlbackend\comfy\ComfyUI\comfy\model_patcher.py", line 352, in patch_model
    self.model.to(device_to)
  File "G:\_SD\Swarm\StableSwarmUI\dlbackend\comfy\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1340, in to
    return self._apply(convert)
           ^^^^^^^^^^^^^^^^^^^^
  File "G:\_SD\Swarm\StableSwarmUI\dlbackend\comfy\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 900, in _apply
    module._apply(fn)
  File "G:\_SD\Swarm\StableSwarmUI\dlbackend\comfy\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 900, in _apply
    module._apply(fn)
  File "G:\_SD\Swarm\StableSwarmUI\dlbackend\comfy\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 927, in _apply
    param_applied = fn(param)
                    ^^^^^^^^^
  File "G:\_SD\Swarm\StableSwarmUI\dlbackend\comfy\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1326, in convert
    return t.to(
           ^^^^^
  File "G:\_SD\Swarm\StableSwarmUI\dlbackend\comfy\ComfyUI\custom_nodes\ComfyUI_bitsandbytes_NF4\__init__.py", line 66, in to
    module=self.module
           ^^^^^^^^^^^
AttributeError: 'ForgeParams4bit' object has no attribute 'module'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "G:\_SD\Swarm\StableSwarmUI\dlbackend\comfy\ComfyUI\execution.py", line 152, in recursive_execute
    output_data, output_ui = get_output_data(obj, input_data_all)
                             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "G:\_SD\Swarm\StableSwarmUI\dlbackend\comfy\ComfyUI\execution.py", line 82, in get_output_data
    return_values = map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "G:\_SD\Swarm\StableSwarmUI\dlbackend\comfy\ComfyUI\execution.py", line 75, in map_node_over_list
    results.append(getattr(obj, func)(**slice_dict(input_data_all, i)))
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "G:\_SD\Swarm\StableSwarmUI\dlbackend\comfy\ComfyUI\custom_nodes\ComfyUI_bitsandbytes_NF4\__init__.py", line 178, in load_checkpoint
    out = comfy.sd.load_checkpoint_guess_config(ckpt_path, output_vae=True, output_clip=True, embedding_directory=folder_paths.get_folder_paths("embeddings"), model_options={"custom_operations": OPS})
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "G:\_SD\Swarm\StableSwarmUI\dlbackend\comfy\ComfyUI\comfy\sd.py", line 511, in load_checkpoint_guess_config
    out = load_state_dict_guess_config(sd, output_vae, output_clip, output_clipvision, embedding_directory, output_model, model_options)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "G:\_SD\Swarm\StableSwarmUI\dlbackend\comfy\ComfyUI\comfy\sd.py", line 588, in load_state_dict_guess_config
    model_management.load_model_gpu(model_patcher)
  File "G:\_SD\Swarm\StableSwarmUI\dlbackend\comfy\ComfyUI\comfy\model_management.py", line 540, in load_model_gpu
    return load_models_gpu([model])
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "G:\_SD\Swarm\StableSwarmUI\dlbackend\comfy\ComfyUI\comfy\model_management.py", line 526, in load_models_gpu
    cur_loaded_model = loaded_model.model_load(lowvram_model_memory, force_patch_weights=force_patch_weights)
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "G:\_SD\Swarm\StableSwarmUI\dlbackend\comfy\ComfyUI\comfy\model_management.py", line 323, in model_load
    self.model.unpatch_model(self.model.offload_device)
  File "G:\_SD\Swarm\StableSwarmUI\dlbackend\comfy\ComfyUI\comfy\model_patcher.py", line 618, in unpatch_model
    self.model.to(device_to)
  File "G:\_SD\Swarm\StableSwarmUI\dlbackend\comfy\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1340, in to
    return self._apply(convert)
           ^^^^^^^^^^^^^^^^^^^^
  File "G:\_SD\Swarm\StableSwarmUI\dlbackend\comfy\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 900, in _apply
    module._apply(fn)
  File "G:\_SD\Swarm\StableSwarmUI\dlbackend\comfy\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 900, in _apply
    module._apply(fn)
  File "G:\_SD\Swarm\StableSwarmUI\dlbackend\comfy\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 927, in _apply
    param_applied = fn(param)
                    ^^^^^^^^^
  File "G:\_SD\Swarm\StableSwarmUI\dlbackend\comfy\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1326, in convert
    return t.to(
           ^^^^^
  File "G:\_SD\Swarm\StableSwarmUI\dlbackend\comfy\ComfyUI\custom_nodes\ComfyUI_bitsandbytes_NF4\__init__.py", line 66, in to
    module=self.module
           ^^^^^^^^^^^
AttributeError: 'ForgeParams4bit' object has no attribute 'module'

initialxy commented 1 month ago

I can add that it needs ROCm 6.0 or greater. Otherwise bitsandbytes will just crash when you validate it with python -m bitsandbytes.
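
The validation step can also be scripted, which helps when you want to record whether the self-check crashed or merely printed warnings. This is a minimal sketch that runs `python -m <module>` with the current interpreter. The helper name and the timeout value are arbitrary choices, not anything from bitsandbytes.

```python
import subprocess
import sys


def run_module_self_check(module_name, timeout=120):
    """Run `python -m <module_name>` with the current interpreter.

    Returns (ok, output), where ok is True only if the process exits with code 0.
    A hard crash (for example a segfault) shows up as a non-zero return code.
    """
    try:
        result = subprocess.run(
            [sys.executable, "-m", module_name],
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return False, f"timed out after {timeout} s"
    return result.returncode == 0, result.stdout + result.stderr


if __name__ == "__main__":
    ok, output = run_module_self_check("bitsandbytes")
    print("self-check passed" if ok else "self-check failed")
    print(output)
```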

gabriel-filincowsky commented 1 month ago

Hi @initialxy,

I've spent considerable time trying to implement your suggestion with the ComfyUI_bitsandbytes_NF4 repository, but I ran into a roadblock. It seems there is no multi-backend-refactor branch in that repository—only master is available. You are probably talking about something else. Sorry, I'm a newbie here 🥺.

Could you please clarify if you're referring to a different repository or perhaps a fork? I would greatly appreciate any additional details to help me move forward.

Thanks for your guidance!

hartmark commented 1 month ago

> I can confirm that if you merge main into multi-backend-refactor, resolve a couple of merge conflicts (which are pretty trivial), and then build and install from there, this error goes away and it works. I have an RX 7900 XTX and ROCm 6.1 on Arch Linux.

Care to share a fork with a branch with main and multi-backend-refactor merged?

gabriel-filincowsky commented 1 month ago

I'm trying to brute-force my way into learning how all this works. This is what I discussed with gpt-4o. Please let me know if any of this makes sense.

I'm not sure, but I think that the ComfyUI_bitsandbytes_NF4 repository is only a small adaptation for bitsandbytes to run in the comfyui (like a plug-in).

I think that what that user was referring to is the original repository for bitsandbytes. When looking for it on GitHub I found the following repository.
https://github.com/bitsandbytes-foundation/bitsandbytes/tree/multi-backend-refactor

Notice that this repository has the mentioned branch. Now, the question is how this relates to ComfyUI. Is bitsandbytes installed in another directory as a requirement, maybe in the python_embeded folder? Or was the ComfyUI_bitsandbytes_NF4 repository created using main from https://github.com/bitsandbytes-foundation/bitsandbytes, so that it would require creating a fork of ComfyUI_bitsandbytes_NF4 using the multi-backend-refactor branch?

What do you think about this? Can you please offer a better insight and direction?

I am asking it to you. I'm not trying to review the previous post text.
It sounds like you've uncovered an important piece of the puzzle. The `ComfyUI_bitsandbytes_NF4` repository is indeed likely a plugin or adaptation to get `bitsandbytes` working specifically within ComfyUI. The branch `multi-backend-refactor` you mentioned from the original `bitsandbytes` repository is indeed a major branch in the official repository for `bitsandbytes` (https://github.com/bitsandbytes-foundation/bitsandbytes).

Given this, here’s what could be happening:

1. **ComfyUI_bitsandbytes_NF4 as an Adapter**: The `ComfyUI_bitsandbytes_NF4` repository is a wrapper or plugin to make the `bitsandbytes` library compatible with ComfyUI. It likely pulls in `bitsandbytes` as a dependency.

2. **How It Relates to ComfyUI**:
    - **`bitsandbytes` in the Python Environment**: The actual `bitsandbytes` library is likely installed as a dependency within the `python_embeded` folder or another location where ComfyUI's Python environment manages packages.
    - **Updating or Modifying `bitsandbytes`**: To work with the `multi-backend-refactor` branch, you might need to manually clone the `bitsandbytes` repository from the official source (https://github.com/bitsandbytes-foundation/bitsandbytes/tree/multi-backend-refactor), build it, and replace the existing `bitsandbytes` installation in your environment.

3. **Potential Approaches**:
    - **Direct Update**: Clone the `multi-backend-refactor` branch from the official `bitsandbytes` repository, build it, and replace the installed `bitsandbytes` in your environment (`python_embeded`).
    - **Fork and Modify**: If the `ComfyUI_bitsandbytes_NF4` repository is tightly coupled with a specific branch from the original `bitsandbytes`, you may consider forking `ComfyUI_bitsandbytes_NF4`, updating it to work with `multi-backend-refactor`, and then integrating it back into ComfyUI.

### Steps to Proceed:

1. **Clone the `multi-backend-refactor` Branch**:
   ```bash
   git clone -b multi-backend-refactor https://github.com/bitsandbytes-foundation/bitsandbytes.git
   cd bitsandbytes
   ```
   Build it according to the instructions provided in the repository.

2. **Replace Existing Installation**: Identify where `bitsandbytes` is currently installed (likely in `python_embeded/lib/site-packages` or similar) and replace the existing files with your new build.

3. **Test**: Run the setup and see if the issue is resolved within ComfyUI.

This approach should help integrate the updates from the multi-backend-refactor branch into your ComfyUI environment. If you encounter any specific issues during this process, further adjustments might be needed.
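
To find out where `bitsandbytes` is actually installed, as the steps above suggest, a small sketch like the following can help. It looks the package up without importing it, so it still works when the package itself is broken. The helper name is hypothetical, and the result is only meaningful when the script is run with the interpreter ComfyUI uses (for example the one in `python_embeded`).

```python
import importlib.util
from pathlib import Path


def find_package_dir(package_name):
    """Return the directory a top-level package would be imported from, or None."""
    spec = importlib.util.find_spec(package_name)
    if spec is None or spec.origin is None:
        return None
    return Path(spec.origin).parent


if __name__ == "__main__":
    location = find_package_dir("bitsandbytes")
    if location is None:
        print("bitsandbytes is not installed for this interpreter")
    else:
        print(f"bitsandbytes would be imported from: {location}")
```

Rather than copying files into that directory by hand, uninstalling with pip and installing the newly built wheel is usually less error-prone.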

initialxy commented 1 month ago

@gabriel-filincowsky Oh, I was referring to manually replacing the bitsandbytes dependency with a build from its original repo https://github.com/bitsandbytes-foundation/bitsandbytes/tree/multi-backend-refactor

Even though I mentioned that merging main to multi-backend-refactor is pretty trivial, I don't feel comfortable submitting a PR since I'm not sure if maintainers prefer to merge multi-backend-refactor to main instead. I'd recommend just waiting for the official bitsandbytes repo to merge and release a new version.

In the meantime, if you really want to try, I forked it and performed a merge (though I will delete this repo when upstream is updated). https://github.com/initialxy/bitsandbytes/tree/multi-backend-refactor

# activate your local python venv
source venv/bin/activate
# uninstall the current installation of bitsandbytes
pip uninstall bitsandbytes
# clone the above
git clone https://github.com/initialxy/bitsandbytes.git
cd bitsandbytes
git checkout multi-backend-refactor
# build it, assuming you have all the build tools. I just followed AUR's PKGBUILD script https://aur.archlinux.org/cgit/aur.git/tree/PKGBUILD?h=python-bitsandbytes-rocm-git
cmake -DCOMPUTE_BACKEND=hip -S .
make
# I believe `pip install .` will do the same, but I just copied from AUR
pip install build installer wheel
python -m build --wheel --no-isolation
python -m installer dist/*.whl
# check that it installed successfully
python -m bitsandbytes

And that worked for me. Again, I'd recommend just waiting for the official repo to merge. Things are moving really fast right now. Also, reading your prior comment, you mentioned that you have both AMD and Nvidia hardware, so I'm not sure this will work for you. This ROCm build is only meant for AMD GPUs.

hartmark commented 1 month ago

Thanks for the fork, I'll check it out when I get the time.

Regarding mixing AMD and Nvidia: I don't know if it's supported, because for Nvidia you use -DCOMPUTE_BACKEND=cuda
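
Since a given PyTorch build targets either ROCm or CUDA, the backend flag can be derived from what PyTorch reports. The sketch below is a hypothetical helper covering only the two values that appear in this thread (`hip` and `cuda`). It takes the version strings as plain arguments so that it can be tried without PyTorch installed.

```python
def pick_compute_backend(hip_version, cuda_version):
    """Map PyTorch build info to a -DCOMPUTE_BACKEND value mentioned in this thread.

    Returns "hip" for ROCm builds, "cuda" for CUDA builds, and None otherwise.
    """
    if hip_version:
        return "hip"
    if cuda_version:
        return "cuda"
    return None


# With PyTorch available, the arguments would come from torch.version:
# import torch
# backend = pick_compute_backend(torch.version.hip, torch.version.cuda)
# print(f"cmake -DCOMPUTE_BACKEND={backend} -S .")
```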

hartmark commented 4 weeks ago

I tried using your branch @initialxy but I get this error now instead:

Error log

Error occurred when executing KSampler:

'NoneType' object has no attribute 'cdequantize_blockwise_bf16_nf4'

File "/comfyui/execution.py", line 152, in recursive_execute
output_data, output_ui = get_output_data(obj, input_data_all)
File "/comfyui/execution.py", line 82, in get_output_data
return_values = map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True)
File "/comfyui/execution.py", line 75, in map_node_over_list
results.append(getattr(obj, func)(**slice_dict(input_data_all, i)))
File "/comfyui/nodes.py", line 1418, in sample
return common_ksampler(model, seed, steps, cfg, sampler_name, scheduler, positive, negative, latent_image, denoise=denoise)
File "/comfyui/nodes.py", line 1385, in common_ksampler
samples = comfy.sample.sample(model, noise, steps, cfg, sampler_name, scheduler, positive, negative, latent_image,
File "/comfyui/custom_nodes/ComfyUI-Impact-Pack/modules/impact/sample_error_enhancer.py", line 9, in informative_sample
return original_sample(*args, **kwargs) # This code helps interpret error messages that occur within exceptions but does not have any impact on other operations.
File "/comfyui/comfy/sample.py", line 43, in sample
samples = sampler.sample(noise, positive, negative, cfg=cfg, latent_image=latent_image, start_step=start_step, last_step=last_step, force_full_denoise=force_full_denoise, denoise_mask=noise_mask, sigmas=sigmas, callback=callback, disable_pbar=disable_pbar, seed=seed)
File "/comfyui/comfy/samplers.py", line 829, in sample
return sample(self.model, noise, positive, negative, cfg, self.device, sampler, sigmas, self.model_options, latent_image=latent_image, denoise_mask=denoise_mask, callback=callback, disable_pbar=disable_pbar, seed=seed)
File "/comfyui/comfy/samplers.py", line 729, in sample
return cfg_guider.sample(noise, latent_image, sampler, sigmas, denoise_mask, callback, disable_pbar, seed)
File "/comfyui/comfy/samplers.py", line 716, in sample
output = self.inner_sample(noise, latent_image, device, sampler, sigmas, denoise_mask, callback, disable_pbar, seed)
File "/comfyui/comfy/samplers.py", line 695, in inner_sample
samples = sampler.sample(self, sigmas, extra_args, callback, noise, latent_image, denoise_mask, disable_pbar)
File "/comfyui/comfy/samplers.py", line 600, in sample
samples = self.sampler_function(model_k, noise, sigmas, extra_args=extra_args, callback=k_callback, disable=disable_pbar, **self.extra_options)
File "/root/venv/lib/python3.12/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
return func(*args, **kwargs)
File "/comfyui/comfy/k_diffusion/sampling.py", line 143, in sample_euler
denoised = model(x, sigma_hat * s_in, **extra_args)
File "/comfyui/comfy/samplers.py", line 299, in __call__
out = self.inner_model(x, sigma, model_options=model_options, seed=seed)
File "/comfyui/comfy/samplers.py", line 682, in __call__
return self.predict_noise(*args, **kwargs)
File "/comfyui/comfy/samplers.py", line 685, in predict_noise
return sampling_function(self.inner_model, x, timestep, self.conds.get("negative", None), self.conds.get("positive", None), self.cfg, model_options=model_options, seed=seed)
File "/comfyui/comfy/samplers.py", line 279, in sampling_function
out = calc_cond_batch(model, conds, x, timestep, model_options)
File "/comfyui/comfy/samplers.py", line 228, in calc_cond_batch
output = model.apply_model(input_x, timestep_, **c).chunk(batch_chunks)
File "/comfyui/comfy/model_base.py", line 145, in apply_model
model_output = self.diffusion_model(xc, t, context=context, control=control, transformer_options=transformer_options, **extra_conds).float()
File "/root/venv/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/root/venv/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1747, in _call_impl
return forward_call(*args, **kwargs)
File "/comfyui/comfy/ldm/flux/model.py", line 150, in forward
out = self.forward_orig(img, img_ids, context, txt_ids, timestep, y, guidance, control)
File "/comfyui/comfy/ldm/flux/model.py", line 104, in forward_orig
img = self.img_in(img)
File "/root/venv/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/root/venv/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1747, in _call_impl
return forward_call(*args, **kwargs)
File "/comfyui/custom_nodes/ComfyUI_bitsandbytes_NF4/__init__.py", line 151, in forward
return functional_linear_4bits(x, self.weight, self.bias)
File "/comfyui/custom_nodes/ComfyUI_bitsandbytes_NF4/__init__.py", line 15, in functional_linear_4bits
out = bnb.matmul_4bit(x, weight.t(), bias=bias, quant_state=weight.quant_state)
File "/root/venv/lib/python3.12/site-packages/bitsandbytes/autograd/_functions.py", line 591, in matmul_4bit
return MatMul4Bit.apply(A, B, out, bias, quant_state)
File "/root/venv/lib/python3.12/site-packages/torch/autograd/function.py", line 575, in apply
return super().apply(*args, **kwargs) # type: ignore[misc]
File "/root/venv/lib/python3.12/site-packages/bitsandbytes/autograd/_functions.py", line 520, in forward
output = torch.nn.functional.linear(A, F.dequantize_4bit(B, quant_state).to(A.dtype).t(), bias)
File "/root/venv/lib/python3.12/site-packages/bitsandbytes/functional.py", line 1067, in dequantize_4bit
return backends[A.device.type].dequantize_4bit(
File "/root/venv/lib/python3.12/site-packages/bitsandbytes/backends/cuda.py", line 644, in dequantize_4bit
lib.cdequantize_blockwise_bf16_nf4(
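
The message `'NoneType' object has no attribute 'cdequantize_blockwise_bf16_nf4'` says that `lib` was `None` when the function was looked up. That would be consistent with the compiled native library never having loaded, though this is an assumption about bitsandbytes internals that the thread does not confirm. The sketch below reproduces the pattern with `ctypes` and a deliberately nonexistent, hypothetical library path, and shows a guard that fails with a clearer message.

```python
import ctypes


def load_native_library(path):
    """Try to load a shared library; return None instead of raising on failure."""
    try:
        return ctypes.CDLL(path)
    except OSError:
        return None


def get_native_symbol(lib, symbol_name):
    """Look up a symbol, failing clearly if the library never loaded."""
    if lib is None:
        raise RuntimeError(
            f"native library was not loaded, so {symbol_name!r} is unavailable"
        )
    return getattr(lib, symbol_name)


# Hypothetical path that does not exist, so loading fails and lib is None.
lib = load_native_library("/nonexistent/libexample_hypothetical.so")

try:
    lib.cdequantize_blockwise_bf16_nf4  # same failure shape as in the traceback
except AttributeError as exc:
    print(f"unguarded access: {exc}")

try:
    get_native_symbol(lib, "cdequantize_blockwise_bf16_nf4")
except RuntimeError as exc:
    print(f"guarded access: {exc}")
```

If this is what happened here, the output of `python -m bitsandbytes` is the place to look for the reason the library failed to load.
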
gabriel-filincowsky commented 4 weeks ago

@initialxy, Thank you! I really appreciate you taking the time to reply.

Jeremy8776 commented 4 weeks ago

So we all still waiting on a fix?

hartmark commented 4 weeks ago

> So we all still waiting on a fix?

It's a bit outside my coding skills. Someone with more technical insight could raise a ticket at the bitsandbytes repo and see if someone could help out.

hartmark commented 3 weeks ago

Any news on this one?

Jeremy8776 commented 3 weeks ago

> Any news on this one?

I've been using Forge for now. It's also a lot quicker.

hartmark commented 3 weeks ago

> > Any news on this one?
>
> I've been using Forge for now. It's also a lot quicker.

Interesting, do you have any details on how much faster? And what are your hardware specs?

Jeremy8776 commented 3 weeks ago

Specs: 3090FE [24gb], 32gb RAM, Ryzen 5 3600x

On Comfy, a generation takes me between 1 and 2 minutes. I'm not using the NF4 model there because of the above bug.

On Forge, a generation takes 30-40 s, or 2 minutes including hires fix and 2x upscale. That's with the NF4 model.

Both with 35 steps, Euler, Normal
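
As a rough worked example of "how much faster", the figures above (60 to 120 s on Comfy, 30 to 40 s on Forge without hires fix) can be turned into a speedup range. Bear in mind that the two runs use different models (NF4 only on Forge), so this is not a like-for-like benchmark. The helper name is made up for illustration.

```python
def speedup_range(slow_range_s, fast_range_s):
    """Return (lowest, highest) speedup given (min, max) timings in seconds."""
    slow_min, slow_max = slow_range_s
    fast_min, fast_max = fast_range_s
    # Lowest speedup: best slow time vs worst fast time, and the reverse for highest.
    return slow_min / fast_max, slow_max / fast_min


low, high = speedup_range((60, 120), (30, 40))
print(f"{low:.1f}x to {high:.1f}x")  # prints "1.5x to 4.0x"
```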

hartmark commented 3 weeks ago

Thanks for the details. I'm going to try out Forge a bit and see how it works.

I have an AMD 7800 XT 16GB, a Ryzen 9 5900 (12c), and 32GB RAM.

I'm having issues with lockups due to VRAM not getting swapped out correctly.

I got Flux schnell working with GGUF, and it takes 30-50 s to generate a 1024x1024 image.

https://github.com/city96/ComfyUI-GGUF

I have tried getting ControlNet working, but I get consistent lockups with it: https://www.reddit.com/r/StableDiffusion/comments/1euz2a9/comment/lio2fte/?utm_source=share&utm_medium=web3x&utm_name=web3xcss&utm_term=1&utm_content=share_button