Closed: patientx closed this issue 7 months ago
@laksjdjf it's your code I believe, can you take a look at it?
We can revert that change without issues.
@patientx can you post your startup message from comfyui?
sorry I misunderstood... so it's 1ed4f93 that is not working.
did you update comfyui?
I always use the latest version of ComfyUI; I update at startup with git pull. By the way, at first I tried previous ComfyUI commits, and it was around 30 commits back that the latest version of the extension still worked. But ComfyUI is the main app and its latest additions are more important, so I'd rather fix the problem in the node. So here we are :)
Here is the ComfyUI startup log: https://pastebin.com/a6AUGPNu
For the record I tried deleting all the other custom nodes except this one, still had the same problem.
ComfyUI updated, and they say they resolved the ipadapter issue, so I updated both ComfyUI and this node to the latest. Nope, still the same issue.
the latest version overrides timestepping if it's not supported, so I really don't know how to help. If it doesn't give any error I would tend to think it's maybe a driver issue, or maybe you also need to upgrade the python libraries. Can you try with the latest nightly instead of the stable one?
Hi there, I can vouch that this is not a driver problem, as today I was still using your nodes fine until I updated the nodes, including ComfyUI, to the latest version. What basically happens is the cmd window shows that inference is starting, but then it literally sits there on the first step until, after a while, ComfyUI just crashes.
If it helps, I am on this release: ComfyUI Revision: 1758 [6b769bca] | Released on '2023-11-30'. I am also using an AMD GPU. As I said before, everything was still working fine today until I updated to the latest commit.
I feel this is more of a ComfyUI issue than IPAdapter, but I'll check with comfyanonymous
do you get any error at all?
No, no error. Everything loads fine until the part where it starts to generate the image, then it just hangs on step 0 of 40. But as soon as I bypass the IPAdapter node, everything works fine again. It also doesn't matter whether I use timestepping or not, nor whether batch unfold is on or off. It does the same thing.
if you do not use timestepping or batch unfold the code is unchanged. It's just an IF statement, so if you don't use either, everything is the same as before.
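A minimal illustration of "it's just an IF statement": when neither option is enabled the new branch never runs and behaviour is identical to the old code. The function and flag names here are illustrative, not the node's real ones.

```python
# Illustrative only: the timestepping commit adds a guarded branch;
# with both flags off, only the pre-existing path executes.
def patch(use_timestepping=False, batch_unfold=False):
    if use_timestepping or batch_unfold:
        return "new code path"   # added by the timestepping commit
    return "old code path"       # unchanged behaviour

print(patch())  # old code path
```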
It is important that you refresh the browser cache (or use an incognito window), delete the IPAdapter nodes and recreate them.
If the execution halts and then crashes without errors it is very likely a driver issue.
I talked to Comfyanonymous this morning, who confirmed my suspicions:
Unfortunately DirectML is very unstable on Windows and it would require a lot of debugging to reach the source of the problem and it could be even related to a specific GPU.
Anyway I made a small change just now that could help (don't hold your breath though). You can give it a try.
I deleted the node manually and used git clone to copy the repository with the change you made. Sadly, no dice, still the same thing. One question: is it normal that it also doesn't work in CPU mode? Seems like AMD users are getting the short end of the stick again. But even so, thank you for this great node you made. It was fun to work with. Keep up the good work. Thumbs up to you, sir.
in CPU mode it could be just some kind of overflow, I'll see if I can replicate it
In CPU mode, when I use the standard quad cross-attention, it hangs the same as in DirectML mode. When I try to use split cross-attention, it complains about an autocast 'cuda' or 'cpu' error unless I force unet-bf16, but then it hangs the same again lol. Just for info, my CPU is a Ryzen 5600G.
just use the old commit, everything is working there
Hi there, do you mean the commit of ipadapter plus before the timestepping feature? If so, how do I go back to the version before that commit?
Never mind, I found out how to do it. Thanks for the suggestion. It's still weird though how that commit works fine, but ever since the timestepping commit inference just refuses to work. More so since a ControlNet node that has a timestepping feature works pretty much fine.
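For anyone else wondering how to roll back: something like the following should work, assuming the extension lives under custom_nodes/ComfyUI_IPAdapter_plus and its default branch is named main (both are assumptions; adjust to your setup).

```shell
# Pin the extension to the last known-good commit from this thread.
cd custom_nodes/ComfyUI_IPAdapter_plus   # path is an assumption
git checkout c28a044                     # detached HEAD at the old commit

# Later, to return to the latest version:
git checkout main                        # branch name is an assumption
git pull
```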
does 1ed4f93 work?
if you are on c28a044 can you place this code at line 388 and tell me what it says in the command prompt? (delete them afterwards)
print(model.model.model_sampling.percent_to_sigma(0.0))
print(model.model.model_sampling.percent_to_sigma(1.0))
Thanks (sorry I don't know how else to test this)
Hi there, in which file should I paste the code?
IPAdapterPlus.py
please note you need to keep the indentation of the original file!
So this is what I see at line 388:
def apply_ipadapter(self, ipadapter, model, weight, clip_vision=None, image=None, weight_type="original", noise=None, embeds=None, attn_mask=None, start_at=0.0, end_at=1.0):
Should I overwrite it?
I don't know what version you are on... just put them above the line
self.dtype = model.model.diffusion_model.dtype
at the same indentation level, do not remove anything.
I am testing commit c28a044 that doesn't work.
if you can add those lines it would help me understand, otherwise we can try on discord
PS: you need to stop comfy and restart
Oh sorry, c28a044 does work, yes, but 1ed4f93 does not. I will go back to the working commit and paste the code. Sorry for the confusion.
you can put them in any commit you are on, it doesn't matter. I just need to know what values are returned. You should see two long float numbers in the command window
Should I also do one for the commit that doesn't work?
no, that's fine. thanks. the numbers are correct, so I'm really out of ideas here.
Haha, I understand. Even so, thanks again. At least I can still use your node, even if I have to stay on an older version for now. It gets the job done. So again, thumbs up to you and keep up the good work.
unless anyone is willing to give me access to their PC remotely...
I'd like to, but I can't really risk it as I also use my PC for work, not just for hobbies. So I hope you understand.
no worries, I wouldn't trust a rando on my own PC either
Hey there, thanks for posting which commit works; I had the same issue, also on an AMD GPU on Windows.
In case it's helpful or related, I get this error when using it on AMD too:
ComfyUI\comfy\model_sampling.py:74: UserWarning: The operator 'aten::frac.out' is not currently supported on the DML backend and will fall back to run on the CPU. This may have performance implications. (Triggered internally at D:\a_work\1\s\pytorch-directml-plugin\torch_directml\csrc\dml\dml_cpu_fallback.cpp:17.)
w = t.frac()
thanks for the additional info, but I already fall back to CPU so that shouldn't be a problem.
the only way for me to fix this issue is if someone gives me access to a Windows machine with AMD, or at the very least we could try a Discord screen share with someone who knows how to use a text editor and at least a little Python.
Echoing the others: I'm on AMD, same issue right after updating IPAdapter_plus; bypassing it works fine (no error messages, just a hang on step 0).
rolling back to c28a044 temporarily fixed it for me.
Incidentally, AMD are getting ready to bring ROCm to Windows (just saw it on AMD's homepage), so hopefully no more DirectML restrictions on the Windows platform, and AMD-based AI should fly like NVIDIA!
Here is a weird one for you guys:
I updated all extensions as usual with the manager and, before rolling back to c28..., I decided to try once again. Behold, it worked with the workflow I had at the time.
And then I tried to build a default workflow from a clean sheet and... it didn't work??? What I had in the other workflow: I was experimenting with kohya's hires fix AND hypertile (to be able to generate images at higher quality somewhat faster).
Weird thing 1: I can't generate bigger than 512x960. I normally generate at 512x768 or 512x512, and with kohya I can double that very easily, but with this workflow it doesn't generate when I enter, for example, 768x960, 512x1024, or 1024x1536 (two times), just like in a1111, as if my memory isn't enough to generate higher than that.
Weird thing 2: BOTH have to be used; if I disable one, the usual happens, no generation at the start.
Weird thing 3: kohya doesn't seem to be working at all because there is no speed change in generation, but hypertile is definitely working; if I change the tile size, speed and final quality change (128 and low lowers the quality).
So this combo somehow lets me use ipadapter again.
Maybe someone more informed could get something out of this situation for solving the problem.
weird workload that enables ipadapter on amd.json
TL;DR: when we use both kohya's hires fix and hypertile, something happens and the latest ipadapter version works with AMD.
Unfortunately MIOpen (the part of ROCm that PyTorch depends on) is further behind than ROCm in general. There is active work going on, but I think we'll be lucky to get a ROCm-enabled Windows PyTorch before 2025.
can any of you try the latest update?
This error comes up now:
Requested to load CLIPVisionModelProjection
Loading 1 new model
ERROR:root:!!! Exception during processing !!!
ERROR:root:Traceback (most recent call last):
File "D:\ComfyUI\execution.py", line 153, in recursive_execute
output_data, output_ui = get_output_data(obj, input_data_all)
File "D:\ComfyUI\execution.py", line 83, in get_output_data
return_values = map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True)
File "D:\ComfyUI\execution.py", line 76, in map_node_over_list
results.append(getattr(obj, func)(**slice_dict(input_data_all, i)))
File "D:\ComfyUI\custom_nodes\ComfyUI_IPAdapter_plus\IPAdapterPlus.py", line 464, in apply_ipadapter
image_prompt_embeds, uncond_image_prompt_embeds = self.ipadapter.get_image_embeds(clip_embed.to(self.device, dtype=self.embeds_dtype), clip_embed_zeroed.to(self.device, dtype=self.embeds_dtype))
File "D:\ComfyUI\venv\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "D:\ComfyUI\custom_nodes\ComfyUI_IPAdapter_plus\IPAdapterPlus.py", line 200, in get_image_embeds
image_prompt_embeds = self.image_proj_model(clip_embed)
File "D:\ComfyUI\venv\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "D:\ComfyUI\custom_nodes\ComfyUI_IPAdapter_plus\IPAdapterPlus.py", line 41, in forward
clip_extra_context_tokens = self.proj(image_embeds)
File "D:\ComfyUI\venv\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "D:\ComfyUI\venv\lib\site-packages\torch\nn\modules\container.py", line 217, in forward
input = module(input)
File "D:\ComfyUI\venv\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "D:\ComfyUI\venv\lib\site-packages\torch\nn\modules\linear.py", line 114, in forward
return F.linear(input, self.weight, self.bias)
RuntimeError: mat1 and mat2 must have the same dtype
Even with a newly built, basic workflow. Previous versions at least worked with the trick I mentioned above (kohya hires fix + hypertile makes ipadapter work; I explained it a few posts up).
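The last frame of that traceback (F.linear) fails because the image embeds and the projection weights arrive in different dtypes. A minimal reproduction and the usual fix, casting the input to the layer's dtype, could look like this; it is a sketch, not the actual patch (bfloat16 is used here only because fp32/bf16 mixing reproduces the error on CPU):

```python
import torch

proj = torch.nn.Linear(4, 8).to(torch.bfloat16)  # weights in a reduced-precision dtype
embeds = torch.randn(1, 4)                       # fp32 input -> mismatched mat dtypes

try:
    proj(embeds)                                 # raises RuntimeError
except RuntimeError as err:
    print("mismatch:", err)

out = proj(embeds.to(proj.weight.dtype))         # cast input to the layer's dtype
print(out.dtype)                                 # torch.bfloat16
```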
what about now?
That mat1... error disappeared. Same problem as before, but it still works with my method at least.
Just gave it a try with https://github.com/cubiq/ComfyUI_IPAdapter_plus/commit/cb1c5e8a1af794a605092da78218ddaa06615235, seeing the same issue where the IPAdapter step runs fine, but when it gets to the sampler it just gets stuck at 0%.
Requested to load BaseModel
Loading 1 new model
0%| | 0/30 [00:00<?, ?it/s]
I ran some tests, edited the py script, and came to one conclusion: DirectML doesn't like the "sigma_start" and "sigma_end" checks. Once I removed them from the script (latest commit, 2023-12-17) it started to work again.
Obviously, changing start_at and end_at in the webui then doesn't do anything.
Like all of them? Can you share the edited file?
without the code this doesn't help much; what if you just rename the variables?
Okay, I pinpointed it to two lines. It's these two:
After removing them, like in the screenshot, it works fine. The "start_at" and "end_at" parameters get ignored, but it generates the image, and I assume the adapter applies to the whole generation from 0 to 100%.
mh I believe the problem is the hardcoded value.
please add print(extra_options["sigmas"][0].item()) at line 236, then print(sigma_start, sigma_end) just before line 250
PC: Windows 10, 16 GB DDR4-3000, RX 6600, using DirectML with no additional command line parameters.
I updated today with the manager and tried my usual workflow, which includes ipadapter for faces. When it comes to actually generating the image, it just stops there and nothing happens. I have to close the command prompt and relaunch the app.
Loading 1 new model
0%| | 0/8 [00:00<?, ?it/s]
I was mainly using LCM, so I tried normal SD1.5 models: same results. Tried different samplers: same. If I take ipadapter out of the workflow, everything works correctly. I don't update daily, so I thought something changed in the meantime; I tried every commit and found out it works up until the "timestepping, fix" commit, and after that the problem appears. So now I am on c28a04466b17d760a345aea41d6a593c0a312c95 (the last one before that fix) and everything works as before.