Kosinkadink / ComfyUI-AnimateDiff-Evolved

Improved AnimateDiff for ComfyUI and Advanced Sampling Support
Apache License 2.0

How to do img2img with this? #12

Closed Skrownerve closed 2 months ago

Skrownerve commented 1 year ago

Hey there, I love this! I could not find the workflow for the last example on the readme. I tried to recreate it but I do not have the option to specify frame_number in the current AnimateDiff Loader. This seems to prevent me from making animations using an existing image as a base.

If this is still possible, can you direct me on how to do it? If not, please let me know if/when this functionality will be available again!

Kosinkadink commented 1 year ago

The last img2img example is outdated and kept from the original repo (I put a TODO: replace this), but img2img still works. I'll post an example for you here in a bit, I'm currently working on a big feature that is eating up my time.

I will also update the README with updated workflows, including img2img options, hopefully within 36 hours if all goes to plan.

DevouredByBeef commented 1 year ago

I found my way around it by using the Repeat Latent Batch node to act as the frame count for a single encoded latent of the source image.

Be aware, though, that the less denoising you apply at the sampler stage, the less perceived movement you'll see in the result (besides general noise). If you're simply attempting to add motion to an existing image, you'll need high enough denoising and a matching prompt describing the subject itself.
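
Conceptually, that node just tiles one encoded image along the batch dimension so the sampler sees one copy of the latent per frame. A rough PyTorch sketch of the idea (not the actual node code; shapes and names are illustrative):

```python
import torch

def repeat_latent(latent: torch.Tensor, frame_count: int) -> torch.Tensor:
    """Tile one encoded latent along the batch dimension.

    Assumes the usual SD latent layout [1, 4, H/8, W/8]; the result has
    shape [frame_count, 4, H/8, W/8], one copy per animation frame.
    """
    return latent.repeat(frame_count, 1, 1, 1)

# Example: turn one encoded 512x512 image into a 16-frame batch.
single = torch.randn(1, 4, 64, 64)  # stand-in for a VAE-encoded image
frames = repeat_latent(single, 16)
print(frames.shape)  # torch.Size([16, 4, 64, 64])
```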

I do wonder if the init_latent from the original ArtVentureX repo functions in a similar way.

Kosinkadink commented 1 year ago

Yep, the ArtVentureX repo took the init_latent and just cloned the first latent as many times as the requested animation length.

The original animatediff repo's implementation (guoyww) of img2img was to apply an increasing amount of noise per frame at the very start. I'll soon have some extra nodes to help customize applied noise. And I will also add documentation for using tile and inpaint controlnets to basically do what img2img is supposed to be.
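
A rough sketch of that increasing-noise idea (purely illustrative, not the actual guoyww code; a real implementation would scale noise according to the sampler's schedule rather than a simple linear blend):

```python
import torch

def ramp_noise(init_latent: torch.Tensor, frame_count: int,
               max_strength: float = 1.0) -> torch.Tensor:
    """Blend progressively more noise into copies of an init latent.

    Frame 0 stays closest to the source image; later frames receive more
    noise, giving the sampler more freedom to introduce motion.
    """
    frames = init_latent.repeat(frame_count, 1, 1, 1)
    noise = torch.randn_like(frames)
    # Linear ramp from 0 (first frame) to max_strength (last frame).
    strengths = torch.linspace(0.0, max_strength, frame_count).view(-1, 1, 1, 1)
    return frames * (1.0 - strengths) + noise * strengths

init = torch.randn(1, 4, 64, 64)  # stand-in for an encoded init image
latents = ramp_noise(init, 16)
```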

Skrownerve commented 1 year ago

Thanks everyone!

Scholar01 commented 1 year ago

@Kosinkadink I'd like it to reference the initial image and then gradually change from there. How could this be done? For example, using the final frame of the first generated gif as the starting point to regenerate the next video, then joining the results in video editing software; that would surely look quite spectacular! Do you have any pointers I could look at?

Scholar01 commented 1 year ago

If you can offer some useful advice, I'm happy to contribute code. ☺

Qualzz commented 11 months ago

I'm having a hard time making this work with the BatchSchedulePrompt node. Has anyone had any success?

Kosinkadink commented 11 months ago

FizzNodes BatchSchedulePrompt was broken for a few days, but today's FizzNodes update fixed that, so try to update FizzNodes and try again!

Qualzz commented 11 months ago

> FizzNodes BatchSchedulePrompt was broken for a few days, but today's FizzNodes update fixed that, so try to update FizzNodes and try again!

Still the same. Maybe my setup is wrong, I don't know. Maybe using an image as input carries too much weight and doesn't allow a lot of freedom, even with high denoising.

ghost commented 11 months ago

Repeat Latent Batch works decently. So I wonder whether the way AnimateDiff works allows the first frame to be 0% noise, with the rest being 100%, and still remain temporally consistent. I think it's safe to assume that's not possible, since no implementation has set it up that way. I think what would actually happen is that the animation would simply switch instantly to a different scene on the second and subsequent frames, which is a shame.

drschwabe commented 10 months ago

Repeat Latent Batch does not seem to work with anything above 16 (edit: when a controlnet is used).

I can create a 48 batch_size animation when using Repeat Latent Batch, but when I add a controlnet (scribble) it crashes if I set that above 16. Or is this a performance / VRAM issue?

Error occurred when executing KSampler:

Control type ControlNet may not support required features for sliding context window; use Control objects from Kosinkadink/Advanced-ControlNet nodes, or make sure Advanced-ControlNet is updated.

....

and the console:


got prompt
[AnimateDiffEvo] - INFO - Loading motion module mm-Stabilized_mid.pth
[AnimateDiffEvo] - INFO - Using fp16, converting motion module to fp16
[AnimateDiffEvo] - INFO - Sliding context window activated - latents passed in (30) greater than context_length 16.
[AnimateDiffEvo] - INFO - Injecting motion module mm-Stabilized_mid.pth version v1.
Requested to load BaseModel
Loading 1 new model
  0%|                                                                                                                                                                          | 0/30 [00:00<?, ?it/s]
[AnimateDiffEvo] - INFO - Ejecting motion module mm-Stabilized_mid.pth version v1.
[AnimateDiffEvo] - INFO - Cleaning motion module from unet.
[AnimateDiffEvo] - INFO - Removing motion module mm-Stabilized_mid.pth from cache
ERROR:root:!!! Exception during processing !!!
ERROR:root:Traceback (most recent call last):
  File "/home/bob/software/ComfyUI/execution.py", line 153, in recursive_execute
    output_data, output_ui = get_output_data(obj, input_data_all)

Kosinkadink commented 10 months ago

You need to use Load ControlNet Model (Advanced) from the Adv-ControlNet submenu.

Scholar01 commented 10 months ago

https://github.com/Scholar01/ComfyUI-Keyframe

drschwabe commented 10 months ago

> https://github.com/Scholar01/ComfyUI-Keyframe

Looks promising, but after installing this one I am getting an error about a missing 'rich' module.

edit: was able to get past the error by installing the rich module globally with `pip install rich`

Kosinkadink commented 10 months ago

@Scholar01 ah, you made a node to do init image injection during sampling, including the sigma generation so samplers can select the denoise for individual latents? Nice! And in the style of the Advanced-ControlNet keyframes, very cool!

I recently pushed a major refactor to the Advanced-ControlNet repo, and one of the things I added was a fix for a hidden bug in the way I dealt with the keyframes. The TL;DR is that the KeyframeGroup should be cloned (a reference to a new object returned, filled with the same keyframes); otherwise, if you edit the batch_index values (or whatever acts as the 'key' for the Group) between presses of Queue Prompt, the previous Keyframes with the old key values would still be included until ComfyUI is restarted. I can make a quick PR for your nodes to put that change in and prevent the issue. Relevant lines in the Advanced-ControlNet code: https://github.com/Kosinkadink/ComfyUI-Advanced-ControlNet/blob/main/control/control.py#L125 https://github.com/Kosinkadink/ComfyUI-Advanced-ControlNet/blob/main/control/latent_keyframe_nodes.py#L39
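
For illustration, the clone-before-modify pattern described above looks roughly like this (hypothetical class and node-function names, not the actual Advanced-ControlNet code; see the linked lines for the real implementation):

```python
class LatentKeyframe:
    def __init__(self, batch_index, strength):
        self.batch_index = batch_index
        self.strength = strength

class LatentKeyframeGroup:
    def __init__(self):
        self.keyframes = []

    def add(self, keyframe):
        self.keyframes.append(keyframe)

    def clone(self):
        # Return a new group holding the same keyframes, so edits made between
        # Queue Prompt runs never mutate a group object that is still cached
        # from a previous execution.
        cloned = LatentKeyframeGroup()
        for kf in self.keyframes:
            cloned.add(kf)
        return cloned

# Inside a node's function: clone the incoming group before adding to it,
# instead of appending to the object that was passed in.
def apply_keyframe(prev_group, batch_index, strength):
    group = prev_group.clone() if prev_group is not None else LatentKeyframeGroup()
    group.add(LatentKeyframe(batch_index, strength))
    return (group,)
```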

I did not have the time to look into editing the sigmas for individual frame denoise control yet, so I appreciate you looking into it! Would you mind if at some point I use some of your code for inspiration? It might come in handy for certain sliding context stuff I've been exploring, and eventually for including that sort of denoise control in certain options in this repo. Integrating it (or something similar) into this repo would also prevent possible issues with the two repos interfering with one another. If I do use any inspired code, I'll add comments in the code to credit you properly. I noticed your repo doesn't have a license, so I just wanted to double check with you. Thanks for the comment!

Scholar01 commented 10 months ago

@Kosinkadink Ah, I apologize for taking so long to get back to you! I've been so busy lately.

I've been referencing your code a lot. So I'm honored that you're integrating this! Please feel free to use Keyframe's code!

Thanks again for your contribution!

umxprime commented 10 months ago

> And I will also add documentation for using tile and inpaint controlnets to basically do what img2img is supposed to be.

@Kosinkadink Hi, I tried to figure out that kind of workflow, but so far the best result I could get was to use IPAdapter to keep general tone consistency through the whole animation, plus an unmasked Inpaint controlnet for proper start/end inference. Using tile + inpaint didn't give the best output so far (blurrier and over-saturated colours). Any tips on combining CNs while keeping most of the details and colours from the image inputs?

Kosinkadink commented 2 months ago

There are a few tools now for img2img stuff (PIA, AnimateLCM-I2V), so closing this issue as it is very stale now.