kijai / ComfyUI-KJNodes

Various custom nodes for ComfyUI
GNU General Public License v3.0

I'm trying to get working patch model patcher + torch compile #130

Open · elen07zz opened this issue 3 weeks ago

elen07zz commented 3 weeks ago

I'm trying to get patch model patcher + torch compile working, but I'm getting this error (using Flux dev). GGUF also does not work; the only model that seems to work is Flux dev fp8.

FileExistsError: [WinError 183] Cannot create a file when that file already exists: 'C:\\Users\\user\\AppData\\Local\\Temp\\torchinductor_user\\cache\\.15576.4520.tmp' -> 'C:\\Users\\user\\AppData\\Local\\Temp\\torchinductor_user\\cache\\7be1381fc4436b17a13ee997bde4d412222077323aee7972cc9a2d6c30a94e17'

Set TORCH_LOGS="+dynamo" and TORCHDYNAMO_VERBOSE=1 for more information

You can suppress this exception and fall back to eager by setting:
    import torch._dynamo
    torch._dynamo.config.suppress_errors = True

With GGUF I'm getting a different error: Dimension out of range (expected to be in range of [-1, 0], but got 1)

Ratinod commented 3 weeks ago

FileExistsError: [WinError 183] fix: https://github.com/pytorch/pytorch/issues/138211#issuecomment-2422975123

GGUF for some reason has many problems with Triton on Windows, like here: https://github.com/kijai/ComfyUI-CogVideoXWrapper/issues/200, and where exactly the problem lies is currently unknown.
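For context, WinError 183 comes from Windows-specific `os.rename` semantics: renaming onto an existing file raises `FileExistsError` on Windows (on POSIX it silently overwrites), while `os.replace` overwrites atomically on every platform. A minimal sketch of the condition behind TorchInductor's failing cache rename (not the TorchInductor code itself):

```python
import os
import tempfile

def overwrite_move(src: str, dst: str) -> None:
    # os.rename() raises FileExistsError on Windows if dst exists;
    # os.replace() is the portable overwrite.
    try:
        os.rename(src, dst)
    except FileExistsError:
        os.replace(src, dst)

with tempfile.TemporaryDirectory() as d:
    src = os.path.join(d, "cache.tmp")       # illustrative names only
    dst = os.path.join(d, "cache.final")
    for p, text in ((src, "new"), (dst, "old")):
        with open(p, "w") as f:
            f.write(text)
    overwrite_move(src, dst)
    with open(dst) as f:
        result = f.read()
    print(result)   # the temp file's contents win
```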

elen07zz commented 3 weeks ago

> FileExistsError: [WinError 183] fix: pytorch/pytorch#138211 (comment)
>
> gguf for some reason have many troubles with triton on windows like here kijai/ComfyUI-CogVideoXWrapper#200 and where exactly the problem is currently unknown

It seemed to work at first, but then it recreated the cache (i.e. long waits again), and then another error appeared.

After modifying codecache.py and adding the fix:

# replace the failing rename with copy-and-delete
shutil.copy2(src=tmp_path, dst=path)
os.remove(tmp_path)
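Put in context, that two-line patch amounts to a fallback in codecache.py's write path. A self-contained sketch of the same idea (`finalize_cache_file` is a hypothetical stand-in, not TorchInductor's actual function):

```python
import os
import shutil
import tempfile

def finalize_cache_file(tmp_path: str, path: str) -> None:
    # Keep the original fast rename, and only fall back to the
    # patch's copy-and-delete when the destination already exists
    # (the Windows FileExistsError case).
    try:
        os.rename(tmp_path, path)
    except FileExistsError:
        shutil.copy2(src=tmp_path, dst=path)
        os.remove(tmp_path)

with tempfile.TemporaryDirectory() as d:
    tmp_path = os.path.join(d, "entry.tmp")
    path = os.path.join(d, "entry")
    with open(tmp_path, "w") as f:
        f.write("kernel")           # freshly compiled artifact
    with open(path, "w") as f:
        f.write("stale")            # pre-existing cache entry
    finalize_cache_file(tmp_path, path)
    tmp_gone = not os.path.exists(tmp_path)
    with open(path) as f:
        cached = f.read()
    print(cached, tmp_gone)
```

Either branch leaves the final file in place with the new contents and removes the temp file, which is why the patch unblocks the cache write.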

I'm getting this error:

File "F:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\_inductor\codecache.py", line 1132, in __call__
    return self.current_callable(inputs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "F:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\_inductor\compile_fx.py", line 944, in run
    return model(new_inputs)
           ^^^^^^^^^^^^^^^^^
File "C:\Users\User\AppData\Local\Temp\torchinductor_User\ap\capak3fjfe6h3meijenplulmcimxpwcb4lldpdgeqwqxygoi4syj.py", line 4538, in call
    extern_kernels.mm(buf14, reinterpret_tensor(arg6_1, (256, 3072), (1, 256), 0), out=buf15)
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu! (when checking argument for argument mat2 in method wrapper_CUDA_mm_out_out)

kijai commented 3 weeks ago

This is all very experimental and probably not compatible with lots of other stuff, GGUF models included. I've only tested on fp8 (fast mode) while using my compile node which limits the compilation, and it worked well even with multiple LoRAs.

[workflow screenshot]

codexq123 commented 3 weeks ago

Same issue here, but it's not working for fp8 either.

scottmudge commented 3 weeks ago

Just commenting that I get this error as well, with FP8 models:

... venv\Lib\site-packages\torch\_functorch\_aot_autograd\functional_utils.py", line 251, in gen_alias_from_base

out = torch._functionalize_apply_view_metas(

      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
IndexError: Dimension out of range (expected to be in range of [-1, 0], but got 1)

Using Torch 2.5.1 with Triton 3.1.0 on Windows 11, with the SamplerCustomAdvanced node plus BasicScheduler and BasicGuider.
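For what it's worth, that IndexError is PyTorch's standard dimension check: something asked for dim 1 of a tensor that only has one dimension (valid dims are -1..0). A pure-Python sketch of the same check, just to show what the message means (`normalize_dim` is a hypothetical helper, not the actual torch internals):

```python
def normalize_dim(dim: int, ndim: int) -> int:
    # Mirror PyTorch's wrap-around dim validation: dims in
    # [-ndim, ndim - 1] are valid, negatives count from the end.
    if not -ndim <= dim <= ndim - 1:
        raise IndexError(
            f"Dimension out of range (expected to be in range of "
            f"[{-ndim}, {ndim - 1}], but got {dim})"
        )
    return dim % ndim

print(normalize_dim(-1, 1))  # 0: valid for a 1-D tensor
try:
    normalize_dim(1, 1)      # reproduces the reported error text
except IndexError as e:
    print(e)
```

So the GGUF path is presumably handing a 1-D tensor to code that expects at least two dimensions.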