blepping / comfyui_jankhidiffusion

Janky implementation of HiDiffusion for ComfyUI
Apache License 2.0

error with controlNet #3

Closed zismylove closed 4 months ago

zismylove commented 4 months ago

I get an error when using it with ControlNet.

\raunet.py", line 60, in hd_forward_timestep_embed transformer_options = args[1] if args else {}

blepping commented 4 months ago

thanks for the report. i haven't done any testing with controlnet.

i will look into how difficult it will be to add that feature when i get a chance. (or maybe it's just a simple fix.)
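for what it's worth, one guess at the failure mode (the traceback snippet doesn't show the exception type, so this is speculation): if `hd_forward_timestep_embed` gets called with exactly one positional argument in `args`, then `args` is truthy but `args[1]` doesn't exist, and that line raises an IndexError. a minimal sketch of the pattern, with hypothetical names:

```python
def hd_forward_timestep_embed(ts, *args, **kwargs):
    # pattern from the traceback: blows up with IndexError when args
    # contains exactly one element (truthy, but there's no index 1).
    # transformer_options = args[1] if args else {}

    # defensive variant: only index args[1] when it actually exists.
    transformer_options = args[1] if len(args) > 1 else {}
    return transformer_options  # placeholder; the real function does more
```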

blepping commented 4 months ago

i believe this issue should be fixed now. note that controlnet still doesn't work properly because the conditioning size doesn't match the scaled tensor size. unfortunately, this issue also applies to Deep Shrink.

at least it doesn't crash now. i'll check to see if there's a reasonable way to fix the conditioning, but i'm not sure whether that will be possible.
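a sketch of the kind of conditioning fix i mean (hypothetical, not code from this repo): rescale the ControlNet residual to match the spatial size of the downscaled hidden state before it gets applied. whether naive interpolation actually produces usable guidance is the part i'm not sure about.

```python
import torch
import torch.nn.functional as F

def match_control_to_hidden(control: torch.Tensor, h: torch.Tensor) -> torch.Tensor:
    # hypothetical helper: rescale a ControlNet residual (NCHW) so its
    # spatial size matches a hidden state RAUNet/Deep Shrink has scaled.
    if control.shape[-2:] == h.shape[-2:]:
        return control
    return F.interpolate(control, size=h.shape[-2:], mode="bilinear", align_corners=False)
```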

setothegreat commented 4 months ago

> i believe this issue should be fixed now. note that controlnet still doesn't work properly because the conditioning size doesn't match the scaled tensor size. unfortunately, this issue also applies to Deep Shrink.
>
> at least it doesn't crash now. i'll check to see if there's a reasonable way to fix the conditioning, but i'm not sure whether that will be possible.

To preface, I really don't understand any of the intricacies of the subject since I am not a programmer, and I don't know if you'd already be aware of this or not.

That being said, ControlNets only seem to malfunction in the first half of a generation. You can see this in action by checking the console and adjusting the denoise value in the KSampler; once you set the denoising strength to 0.5 or lower, the "control could not be applied" warning stops popping up, and the ControlNet seems to apply without much issue*.

At least with some ControlNets, that is. The depth ControlNet doesn't produce many issues, if any, but the softedge ControlNet produces incredibly bad results. Not sure why that would be the case, but it might be worth mentioning.

*Oddly enough, if you instead set the denoise strength to 0.55 so that only a single "control could not be applied" warning pops up, and set the ControlNet to start after the step where this warning would pop up, the image seems to generate with far, far better results than it does with no warnings popping up. Again, I'm not a programmer so I can't say why this is the case, but any info should be valuable, right?

blepping commented 4 months ago

@setothegreat

> That being said, ControlNets only seem to malfunction in the first half of a generation.

this makes sense. if you have (for example) start_time 0.0, end_time 0.35 (the defaults) then that means the RAUNet scaling effects end at ~35% of sampling. at that point, the controlnet conditioning should work again. also, i'm not sure if it just doesn't function at all earlier than that or whether the conditioning just doesn't get applied for the scaled layers.
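roughly how the time window gates the effect (a sketch: percent_to_sigma is a real method on ComfyUI's model_sampling objects, but the helper around it is made up for illustration):

```python
def make_window_check(model_sampling, start_time: float = 0.0, end_time: float = 0.35):
    # convert sampling-percent times into sigma thresholds once, up front.
    start_sigma = model_sampling.percent_to_sigma(start_time)
    end_sigma = model_sampling.percent_to_sigma(end_time)

    def raunet_active(sigma: float) -> bool:
        # sigma decreases as sampling progresses, so "inside the window"
        # means sigma hasn't dropped below the end threshold yet.
        return end_sigma <= sigma <= start_sigma

    return raunet_active
```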

> once you set the denoising strength to 0.5 or lower, the "control could not be applied" warning stops popping up, and the ControlNet seems to apply without much issue*.

yep, but at that point you're starting to generate after you configured the RAUNet effect to end. so at that point, you might as well just not use RAUNet - it's not doing anything. also, in general you probably won't get very good results starting new generations with denoise much less than 1.

you can kind of think of it like an artist who has 10 minutes to draw something, and as time passes they get to make progressively smaller and smaller changes to the image. starting a new generation at 0.5 denoise is like giving them 5 minutes and saying they only get to make fairly small changes, and toward the end it would just be minor added details, touching up small stuff. this is a simplification, of course.
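back-of-envelope version of what denoise does when you start a fresh generation (simplified - comfy actually derives this from the sigma schedule rather than a raw step count):

```python
total_steps = 30
denoise = 0.5
steps_run = int(total_steps * denoise)  # only the last 15 steps get sampled
skipped = total_steps - steps_run       # the first 15 "big change" steps never happen
```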

the concept of deep shrink and RAUNet is based on the idea that the model sets up major details, like how many limbs a character has, in the initial stage of sampling, and the rest of sampling is mostly refinement and adding details. with controlnet guidance, you might not really need stuff like deepshrink/hidiffusion since controlnet is helping the model figure out major details.

setothegreat commented 4 months ago

@blepping It seems to be the case that the ControlNet is doing something to influence the image generation even though it's saying it could not be applied. The reasoning behind this is that increasing the denoise value in increments of 0.05 with a ControlNet applied steadily distorts the composition of the image and drastically impacts the output color.

In my testing with the depth ControlNet, at 0.6 denoising the color will start to be tinted orange; at 0.65 the image will have a heavy orange tint, and anything over 0.7 will be almost entirely some shade of dark orange.

By comparison, at 0.6 the composition of the image will start to distort a small amount compared to the ControlNet input, but will still largely retain the composition of the original; at 0.65 the composition will be distorted more but still legible, almost like you're looking at a caricature of the original ControlNet image input; at 0.7 it will be incredibly distorted, to the point that the original composition is nearly unrecognizable; and at anything above 0.75 the image will just be an abstract mess with no adherence to the ControlNet or the text prompt.

> also, in general you probably won't get very good results starting new generations with denoise much less than 1.

I should additionally clarify that my testing has been with a combination of Image2Image and Text2Image. Of course the lower denoise value doesn't cause generation issues with an Image2Image latent; it just means that more of the original image is retained. Along with this, even with a low denoise value like 0.55, including both the MSW-MSA and RAUNet nodes produces an image that is far more aesthetically pleasing than running the same workflow with the same parameters without those nodes enabled (in an Image2Image workflow, at least). And yes, I did make sure the nodes were disabled by restarting ComfyUI and checking the console for those warnings.

blepping commented 4 months ago

> It seems to be the case that the ControlNet is doing something to influence the image generation even though it's saying it could not be applied.

the error is likely just from when it tries to apply the controlnet conditioning to the blocks that are scaled. so it may still have an effect; it just won't be applied to all the blocks. i don't really know how this affects the results.

> In my testing with the depth ControlNet, at 0.6 denoising the color will start to be tinted orange; at 0.65 the image will have a heavy orange tint, and anything over 0.7 will be almost entirely some shade of dark orange.

hard to comment on this since i don't know what settings you're using. if you have the RAUNet node on the defaults then all the effects will end at ~45% (ca_end_time is 0.35, end_time is 0.45). note that's percentage of sampling, not necessarily percentage of steps.
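to illustrate the difference (a sketch: percent_to_sigma is a real ComfyUI model_sampling method, everything else here is illustrative; sigmas is the sampler's actual schedule, highest to lowest):

```python
def end_step_for_percent(model_sampling, sigmas, percent: float = 0.45) -> int:
    # convert "percent of sampling" into a sigma threshold, then find the
    # first step whose sigma falls below it. with a karras schedule the
    # sigmas are spaced non-uniformly, so this is usually NOT the step at
    # percent * len(sigmas).
    end_sigma = model_sampling.percent_to_sigma(percent)
    return next(i for i, s in enumerate(sigmas) if s <= end_sigma)
```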

> I should additionally clarify that my testing has been with a combination of Image2Image and Text2Image.

i see. you generally shouldn't need deep shrink/RAUNet type stuff when doing img2img because the image you're using will act as the guidance and determine major features like how many legs, eyes, etc. generally img2img is just refining an existing image.

the attention node can definitely still be useful though.

setothegreat commented 4 months ago

Wasn't aware that the MSW node wasn't causing the ControlNet issues, since I was using them together most of the time and forgot to restart when testing them individually; that's my bad. Last comment I'll post, just to correct something I said earlier, since I'm relatively fine just using the MSW node on its own:

The issue with the colors becoming more and more orange as the denoise parameter is increased seems to be an issue with the SDE sampler and Karras scheduler I was using, as it doesn't occur as strongly with the non-SDE samplers and doesn't occur at all when using a scheduler other than Karras. I've never had that issue with the SDE samplers or the Karras scheduler before, but it seems to be an issue with my particular generation and ControlNet parameters rather than your nodes, at least from what I can tell.