Open tyDiffusion opened 1 month ago
Alright, I've been diving into the codebase to track down the cause of these problems and there are a few issues at play:
1) The improper call to the ImageEmbed constructor traces back to this earlier PR: https://github.com/Mikubill/sd-webui-controlnet/pull/2725
ImageEmbed.bypass_average was removed, but not all calls to ImageEmbed were updated, so the fix there is as simple as removing the now-unnecessary bool argument from the problematic calls.
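To illustrate the failure mode in point 1, here is a minimal sketch. It assumes ImageEmbed behaves like a two-field NamedTuple once bypass_average is gone; the class below is a hypothetical stand-in and not the extension's actual definition, and `build_embed` is a helper invented purely for this example.

```python
from typing import NamedTuple


class ImageEmbed(NamedTuple):
    """Hypothetical stand-in: two fields only, no bypass_average."""
    cond_emb: list
    uncond_emb: list


def build_embed(cond, uncond, legacy_call=False):
    """Illustrative helper that mimics a stale vs. an updated call site."""
    if legacy_call:
        # Stale call site: still passes the removed bool as a third argument.
        return ImageEmbed(cond, uncond, True)
    # Updated call site: the extra argument is simply dropped.
    return ImageEmbed(cond, uncond)


try:
    build_embed([1.0], [0.0], legacy_call=True)
    legacy_error = None
except TypeError as exc:
    # The constructor rejects the unexpected extra positional argument.
    legacy_error = exc

fixed = build_embed([1.0], [0.0])
```

Under that assumption, the stale call fails at construction time with a TypeError, while the two-argument call succeeds.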
2) In plugable_ipadapter.py, the preprocessor_outputs argument of the hook function has types that are not accounted for in the function logic.
In my experimentation, depending on my API inputs and whether or not I'm using AnimateDiff and/or multi-image input, preprocessor_outputs can have at least 3 different types: ImageEmbed, dict and tuple. However, the current function logic does not account for preprocessor_outputs having type ImageEmbed, resulting in an error when the average_of function is called on it.
I'm not sure whether the core issue here is the conversion conditional in this function or something upstream (why is preprocessor_outputs already an ImageEmbed object when it is passed to that function?). I will continue to dig and report my findings, assuming @Mikubill doesn't chime in sooner.
Ah, in line 958 of controlnet.py, in the ad_process_control function, the value returned by that function (which ends up as "preprocessor_outputs" in the hook function) is being assigned as:
c = ImageEmbed(c_cond, c.uncond_emb)  # trailing ", True" removed: the ImageEmbed constructor no longer accepts the bool argument
But that only happens if AnimateDiff is enabled and the ControlNet is IP-Adapter. That explains why it's not being passed in as a dict/tuple in the case where it's causing a downstream error in the hook function.
Same goes for further down in the ad_process_control function, if keyframes are found. So that function is the main source of the type mismatch.
Ok, with the fixes to the ImageEmbed calls listed above, and the following change to the conditional logic of the hook function in plugable_ipadapter.py, IP-Adapter/AnimateDiff keyframe-based prompt travel is working, as well as single/multi-image inputs in regular image generation:
    if isinstance(preprocessor_outputs, ImageEmbed):
        self.image_emb = preprocessor_outputs
    elif isinstance(preprocessor_outputs, dict):
        self.image_emb = self.ipadapter.get_image_emb(preprocessor_outputs)
    elif isinstance(preprocessor_outputs, tuple):
        self.image_emb = ImageEmbed.average_of(*[self.ipadapter.get_image_emb(o) for o in preprocessor_outputs])
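For anyone who wants to try the dispatch logic outside the extension, here is a self-contained sketch. Everything in it is a stand-in: ImageEmbed is modeled as a NamedTuple of plain floats (an assumption), `get_image_emb` is a hypothetical replacement for self.ipadapter.get_image_emb, and the "cond"/"uncond" dict keys are invented for the example.

```python
from typing import NamedTuple


class ImageEmbed(NamedTuple):
    """Hypothetical stand-in that uses floats instead of tensors."""
    cond_emb: float
    uncond_emb: float

    @staticmethod
    def average_of(*embeds):
        # Element-wise mean over all supplied embeddings.
        n = len(embeds)
        return ImageEmbed(
            sum(e.cond_emb for e in embeds) / n,
            sum(e.uncond_emb for e in embeds) / n,
        )


def get_image_emb(output: dict) -> ImageEmbed:
    # Hypothetical stand-in for self.ipadapter.get_image_emb;
    # the "cond"/"uncond" keys are made up for this sketch.
    return ImageEmbed(output["cond"], output["uncond"])


def resolve_image_emb(preprocessor_outputs):
    """Return one ImageEmbed for each of the three observed input types."""
    # Checked first: under the NamedTuple assumption an ImageEmbed is also
    # a tuple, so it would otherwise fall into the tuple branch below.
    if isinstance(preprocessor_outputs, ImageEmbed):
        return preprocessor_outputs
    if isinstance(preprocessor_outputs, dict):
        return get_image_emb(preprocessor_outputs)
    if isinstance(preprocessor_outputs, tuple):
        return ImageEmbed.average_of(
            *[get_image_emb(o) for o in preprocessor_outputs]
        )
    raise TypeError(f"unsupported type: {type(preprocessor_outputs).__name__}")


already_embedded = resolve_image_emb(ImageEmbed(1.0, 0.0))
single_image = resolve_image_emb({"cond": 2.0, "uncond": 1.0})
multi_image = resolve_image_emb(
    ({"cond": 2.0, "uncond": 0.0}, {"cond": 4.0, "uncond": 2.0})
)
```

If ImageEmbed really is a tuple subclass, the order of the checks matters: the ImageEmbed branch has to come before the tuple branch. The explicit TypeError at the end is an addition in this sketch so that any other unexpected type fails loudly.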
Is there an existing issue for this?
What happened?
I am testing AnimateDiff and IP-Adapter, and experimenting with multiple IP-Adapter inputs (in my experimentation, I am adding X images to IP-Adapter, where X is the total number of AnimateDiff output frames) - all this is done through the API only.
For the command line API, I am adding images to my IP-Adapter ControlNet like this:
If I add multiple IP-Adapter images, I satisfy the following condition in controlnet.py
and then
That's fine and good - it tells me the code flow is correct as my inputs are being recognized as an AnimateDiff batch process for a ControlNet that accepts multiple inputs (IP-Adapter).
However, the following line in controlnet.py generates an error:
Looking into ipadapter_model.py, I think I see why the error occurs: the ImageEmbed class has no field for the boolean value passed in the argument list. However, when I modify the above line like this:
The code execution progresses further, but shortly after I get another error:
So something is clearly not working. However, maybe I don't understand the types of inputs required for a multi-image IP-Adapter setup with AnimateDiff?
The goal here is to have one IP-Adapter input image per frame, so that the result of each frame in AnimateDiff is tuned to the corresponding IP-Adapter input image. The same works with other ControlNets (e.g. if I provide identical batch image input to a Depth ControlNet, the AnimateDiff result will use each depth input for the corresponding animation frame). It's just IP-Adapter that fails.
Steps to reproduce the problem
See above
What should have happened?
See above
Commit where the problem happens
N/A
What browsers do you use to access the UI?
No response
Command Line Arguments
List of enabled extensions
N/A
Console logs
Additional information
No response