Kosinkadink / ComfyUI-Advanced-ControlNet

ControlNet scheduling and masking nodes with sliding context support
GNU General Public License v3.0

The current Advanced ControlNet backend code is incomplete: there is no corresponding KSampler handling of the pose or of the positive prompt (in the source code named "data_api_packing.py"), which is equivalent to it having no effect. In the current version, is it not possible to specify multiple skeleton pose images so that the protagonist follows that pose control in a video or photo? If this code does not exist yet, roughly when will it be available? #26

Closed. DanielWGAnimateDiff closed this issue 11 months ago.

DanielWGAnimateDiff commented 1 year ago

The current Advanced ControlNet backend code is incomplete: there is no corresponding KSampler handling of the pose or of the positive prompt (in the source code named "data_api_packing.py"), which is equivalent to it having no effect. In the current version, is it not possible to specify multiple skeleton pose images so that the protagonist follows that pose control in a video or photo? If this code does not exist yet, roughly when will it be available?

DanielWGAnimateDiff commented 1 year ago

src/custom_nodes/comfyui_controlnet_aux/src/custom_mmpkg/custom_mmcv/cnn/bricks/wrappers.py - this file rewrites the positive prompt and accesses the pose-configuration-related classes. Worth attention are the subclasses class ConvTranspose2d(nn.ConvTranspose2d): and class ConvTranspose3d(nn.ConvTranspose3d):. Each of these two subclasses has two methods that generate the forward and reverse prompts respectively: forward() generates the forward (positive) prompt, and backward() generates the backward (negative) prompt.

Author, please take a look at this information and see whether it is useful to you. I found it by searching the code myself; maybe you can call it directly.

DanielWGAnimateDiff commented 1 year ago

I eagerly hope that the code for parsing the openpose skeleton and filling in the positive prompt can be completed and uploaded as soon as possible. This feature is extremely valuable; otherwise the node is only half done. Thank you for your hard work.

Kosinkadink commented 1 year ago

Hey, thanks for the report. As far as I'm aware, the control net object itself does not modify conditioning at all; if you were to replace the Advanced ControlNet nodes with vanilla ControlNet nodes, you should get the exact same results (given a simple workflow, and not using a sliding window context for AnimateDiff).

Unless something in the vanilla controlnet code changed, I'm guessing the issue lies elsewhere, although I'd appreciate it if you provided screenshots of what exactly in the code you mean, and also screenshots of your workflow - especially the controlnet node arrangement, any related controlnet nodes, and your AnimateDiff node arrangement.

For AnimateDiff, when the input latents exceed the context_length, it begins to use a sliding context window, which requires that the sampling code properly selects the sub-idxs of all conditioning-related tensors. When no sliding context is used (probably 16 frames or less in your case), everything related to sampling is mostly vanilla. The AnimateDiff repo for comfy that I manage is called "ComfyUI-AnimateDiff-Evolved." The sampling code there should, hypothetically, divide the conditioning properly for sliding window purposes.
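To illustrate the sub-idx idea, here is a simplified sketch (not the actual AnimateDiff-Evolved sampling code; the window length, stride, and per-frame hint shapes are just assumptions for the example) of how a sliding context could slice the latents and the controlnet hint so that each window samples with the matching conditioning:

```python
import torch

def sliding_context_windows(total_frames, context_length=16, stride=4):
    """Yield lists of frame indexes (sub-idxs), one list per sliding window."""
    if total_frames <= context_length:
        # No sliding context needed - everything is sampled in one pass.
        yield list(range(total_frames))
        return
    for start in range(0, total_frames - context_length + 1, stride):
        yield list(range(start, start + context_length))

latents = torch.randn(32, 4, 64, 64)       # 32 frames of latents
cond_hint = torch.randn(32, 3, 512, 512)   # one pose image per frame

for idxs in sliding_context_windows(latents.shape[0]):
    sub_latents = latents[idxs]   # latents for this window
    sub_hint = cond_hint[idxs]    # matching slice of the controlnet hint
    # ... sample this window using sub_latents and sub_hint ...
```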

So, to verify whether it does or not, here are some things I'd like you to try:

  1. Which AnimateDiff nodes are you using in your workflow? If they are not from AnimateDiff-Evolved, I'd recommend switching to AnimateDiff-Evolved.
  2. If you are using AnimateDiff-Evolved already, run your workflow with just 16 frames and no context options attached, and see if the controlnet works as intended. If it works properly at 16 frames but not with more than that when using context options, then there might be an issue with how the AnimateDiff-Evolved sampling code subdivides the conditioning, and we can go from there. (Note: currently, when using sliding windows, the only available context scheduler attempts to loop the latents.)

DanielWGAnimateDiff commented 1 year ago

Thank you for your reply. Following your suggestions, I checked carefully and repeatedly tested many times, and can confirm the following: 1) I used ComfyUI-AnimateDiff-Evolved - all AnimateDiff-related nodes are from ComfyUI-AnimateDiff-Evolved - and no error was reported, but the pose of the main character was not controlled by the pose image; 2) I switched to the plain KSampler, and again no error was reported, yet the pose of the protagonist was still not controlled by the pose image. Because I can't upload images to the GitHub page, I don't know how to send you a screenshot of my workflow. The bottom line: no matter what type of KSampler I use, it runs fine, but the pose of the main character is not controlled by the pose image.

DanielWGAnimateDiff commented 1 year ago

I have examined the KSampler and AnimateDiff source code carefully, and there is no special treatment of the pose skeleton image. It only controls the posture of the protagonist based on the positive prompt and the negative prompt; there is no source code that calls DWpose or anything pose-related.

Another thing I don't understand: you say ControlNet doesn't modify the positive and negative prompts, so why does your nodes.py method def apply_controlnet(self, positive, negative, control_net, image, strength, start_percent, end_percent, mask_optional=None): need to take positive and negative as input parameters and also output them both? And look at this code about the mask (reproduced below): if mask_optional is not None and is_advanced_controlnet(c_net), then, if the mask is not in the form of a batch, it makes it so, and it calls c_net.set_cond_hint_mask(mask_optional) and c_net.set_previous_controlnet(prev_cnet). Obviously, this is changing the positive prompt.

There's one more thing I don't understand: this apply_controlnet(...) method does not do any meaningful processing of the image, so why pass it in? A third thing I don't understand: the return value (out[0], out[1]) of apply_controlnet(...) is actually just the positive that was passed in, and nothing about the negative can be changed. From the current source code, apply_controlnet(...) in nodes.py performs no meaningful manipulation for the pose control approach. That's why I feel there is code missing here. The above is just my analysis and understanding; please take a look - how can I solve the problem that pose control is not effective? This problem has been stuck for a long time, forcing me to read the code line by line to analyze it.
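The mask-handling snippet quoted above, reformatted for readability (reconstructed from the fragment in this comment, so indentation and surrounding context may not match the repository exactly):

```python
# Reconstructed from the quoted fragment, not copied verbatim from the repo.
if mask_optional is not None:
    if is_advanced_controlnet(c_net):
        # if not in the form of a batch, make it so
        if len(mask_optional.shape) < 3:
            mask_optional = mask_optional.unsqueeze(0)
        c_net.set_cond_hint_mask(mask_optional)
c_net.set_previous_controlnet(prev_cnet)
```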

Kosinkadink commented 1 year ago

Before we get into the nitty gritty of controlnet code, let's get back to your initial claim of a specific controlnet not working.

You mentioned that you can't get a subject to follow the pose of the character, but on my end (and on the end of everyone else that is using it), you can. Here is an example workflow that loads in existing pose data per frame: https://github.com/Kosinkadink/ComfyUI-AnimateDiff-Evolved#txt2img-w-initial-controlnet-input-using-openpose-images--latent-upscale-w-full-denoise

Just to verify that nothing broke, I tried running those workflows again, and got the expected results. The code is working as intended, and so is the vanilla ControlNet code. Something might just be wrong with your workflow, or with how you are extracting the pose, etc., so it would be best if you just show your workflow here. You can just take a screenshot and directly upload the image into your comment, like this (NOTE: this screenshot does not contain the PNGinfo, unlike the ones in the README): [screenshot]

It will automatically upload the image into the comment. You can also save the json of the image, and attach it into your comment.

DanielWGAnimateDiff commented 1 year ago

One more point: in my testing I also specifically tested AnimateDiff-Evolved with more than 16 frames, exactly 16 frames, and fewer than 16 frames. In all three cases the result was the same - the posture of the protagonist could not be controlled, and the pose controlnet had no effect.

Kosinkadink commented 1 year ago

And to follow up on your comment, the reason that the KSampler does not care about the skeleton diagram is because that's not how controlnets work. Controlnets take in input images, and at sampling time, use those input images to apply additional conditioning onto the latents. The Apply nodes just attach the necessary information onto the conditioning inputs, so that they can be used during sampling.

And ComfyUI has two options for adding the controlnet conditioning. If using the simple controlnet node, it sets 'control_apply_to_uncond'=True, meaning the exact same controlnet should be applied to whatever gets passed into the sampler (so only the positive cond needs to be passed in and changed). If using the advanced controlnet nodes, it applies the controlnets to both positive and negative conds and sets 'control_apply_to_uncond'=False, since it does not need to forcefully copy controlnet info to the unconds at sampling time.
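As a rough sketch of what "attaching the necessary information onto the conditioning" means (simplified and written from memory, not copied from either repo; it only assumes ComfyUI's convention that each conditioning entry is a [tensor, dict] pair):

```python
# Simplified sketch: an Apply ControlNet-style node does not edit any prompt
# text; it stores the controlnet object and its hint image in each conditioning
# entry's dict, and the sampler uses that information at sampling time.
def apply_controlnet_sketch(positive, negative, control_net, image, strength):
    out = []
    for conditioning in (positive, negative):
        new_cond = []
        for tensor, extras in conditioning:
            extras = extras.copy()
            # Attach a copy of the controlnet with the hint image (NHWC -> NCHW).
            c_net = control_net.copy().set_cond_hint(image.movedim(-1, 1), strength)
            extras['control'] = c_net
            # The advanced nodes set this to False, since they already attach the
            # controlnet to the negative cond themselves; the simple node sets it
            # to True and only modifies the positive cond.
            extras['control_apply_to_uncond'] = False
            new_cond.append([tensor, extras])
        out.append(new_cond)
    return (out[0], out[1])
```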

Kosinkadink commented 1 year ago

If you show your workflow, I should be able to see if there are any issues with your setup.

DanielWGAnimateDiff commented 1 year ago

[Screenshot 1] [Screenshot 2]

Kosinkadink commented 1 year ago

Gotcha, thanks. A few things to change:

  1. Make sure the negative out of the controlnet also goes into the KSampler - the advanced ControlNet nodes specifically flag the conditioning to not be copied over to the negative, so you'll need to wire the negative prompt to the KSampler from there.
  2. You don't need to use the 'AnimateDiff Loader [DEPRECATED]' node at all - it is deprecated, meaning no additional development is happening to it, and you should just use the AnimateDiff Loader alone. Once you remove that node, you can plug the latents directly into the KSampler, and the animation length is determined by the number of latents passed into the sampler. Also, I'm guessing you were just checking non-AD behavior, but be sure to plug the model output from the AnimateDiff Loader node into the KSampler to use AD.
  3. This one is not going to cause or solve any issues, but you can use ComfyUI's built-in Load Checkpoint node instead of the w/ Noise Select one. 2 months ago, that was required, but for a long time now the beta_schedule to use at sampling time is determined in the AnimateDiff Loader, so the w/ Noise Select loader is unneeded.

DanielWGAnimateDiff commented 1 year ago

Thank you for your careful guidance. Following the modification points you proposed, I changed them one by one, but the result is still that the posture of the protagonist cannot be controlled. I don't think I can even get a full body. [Screenshot 3]

Kosinkadink commented 1 year ago

Hmm, a couple of weeks ago there was an update that required Advanced-ControlNet to be updated. I would double-check that you have ComfyUI, Advanced-ControlNet, and AnimateDiff-Evolved updated to the newest versions. Also, leave context_length at 16 - it ensures that even if the frame count exceeds 16, frames are passed into AD 16 at a time. Also, the clip you are passing into the conditioning is from the LoRA node, but you are not using the model output from the LoRA at all, which, while probably not the cause of the openpose not being applied the way you want, is not the intended way to do that. If you want to use the LoRA, you need to use the model + clip outputted from it.

For current debugging purposes, try to use the example workflow I linked earlier. The openpose input for that example is also in the readme as a gif - you can download it and use the Load Video node to get the frames from it. I know it works as intended, so that way we can see if something is wrong with your install.

DanielWGAnimateDiff commented 1 year ago

I manually updated ComfyUI, ComfyUI-Advanced-ControlNet, and ComfyUI-AnimateDiff-Evolved and confirmed they are the latest versions. I restarted and tested again, but still can't control the main character's posture. When I change the skeleton input to a GIF, the Load Images (Path) node reports an error directly. The error message is as follows: File "F:\ComfyUI\ComfyUI\custom_nodes\ComfyUI-VideoHelperSuite\videohelpersuite\load_images_nodes.py", line 44, in load_images raise FileNotFoundError(f"No files in directory '{directory}'.")

Summary: I still haven't solved the problem of making the main character's pose follow the skeleton image, even though I'm using the latest versions. I also don't know how to download the workflow JSON file from this address: https://github.com/Kosinkadink/ComfyUI-AnimateDiff-Evolved#txt2img-w-initial-controlnet-input-using-openpose-images--latent-upscale-w-full-denoise [Screenshot 4]

DanielWGAnimateDiff commented 1 year ago

Author, following up on the above question: can you confirm that the latest version of the Load Video node supports the GIF format? It doesn't seem possible - the image types used for the check are clearly written in the latest code: class FolderOfImages(data.Dataset):, line 262: IMG_EXTENSIONS = {'.jpg', '.jpeg', '.png', '.ppm', '.bmp', '.pgm', '.tif', '.tiff', '.webp'}

Kosinkadink commented 11 months ago

Load Video node loads gifs just fine, and openpose works just fine with Advanced-ControlNet (and vanilla comfy ControlNets) - not sure if you've already solved your issue, but it must have been a workflow problem. Reopen if you still have a problem, but I'm closing the issue since there is no problem with the nodes as far as I am able to test.

DanielWGAnimateDiff commented 11 months ago

Thank you. My problem is solved

Jedrzej Kosinski @.***> wrote on Tue, Dec 5, 2023, 19:33:

Closed #26 https://github.com/Kosinkadink/ComfyUI-Advanced-ControlNet/issues/26 as completed.
