continue-revolution / sd-webui-animatediff

AnimateDiff for AUTOMATIC1111 Stable Diffusion WebUI

[Bug]: Img2img is non-functional #96

Open Mozoloa opened 9 months ago

Mozoloa commented 9 months ago

Is there an existing issue for this?

Have you read the FAQ in the README?

What happened?

When trying img2img, it either outputs very weird, artifact-ridden results, or near-normal results where nothing moves and everything becomes deep-fried at the very end.

A few examples:

00015-1918510437: a wizard in a forest, portrait fantasy, Realistic Vision 5.1, 512x512, denoise 0.5, 8 frames (1s)

00016-1918510437: exact same thing but 16 frames (2s)

00017-1918510437: exact same thing but with full denoise; the image is now very different

00018-1918510437: exact same thing but with 0.1 denoise

Both padding & batch cond/uncond are checked in the settings. Here are the AnimateDiff settings:

Steps to reproduce the problem

  1. Go to img2img
  2. Input any image
  3. Use the settings above

What should have happened?

A normal animation, the likes of Pika Labs or Runway Gen-2, as it actually produces quite correctly in the txt2img tab.

Commit where the problem happens

webui: 1.6.0 extension: 3be7c2396193dd825a18034644871853f195714c

What browsers do you use to access the UI?

Brave

Command Line Arguments

--opt-sdp-attention --autolaunch

Console logs

Usually it just does this:

2023-09-17 18:33:44,876 - AnimateDiff - INFO - Restoring DDIM alpha.
2023-09-17 18:33:44,876 - AnimateDiff - INFO - Removing motion module from SD1.5 UNet input blocks.
2023-09-17 18:33:44,877 - AnimateDiff - INFO - Removing motion module from SD1.5 UNet output blocks.
2023-09-17 18:33:44,877 - AnimateDiff - INFO - Removing motion module from SD1.5 UNet middle block.
2023-09-17 18:33:44,877 - AnimateDiff - INFO - Removal finished.
2023-09-17 18:33:44,877 - AnimateDiff - INFO - Merging images into GIF.
2023-09-17 18:33:45,494 - AnimateDiff - INFO - AnimateDiff process end.
Total progress: 100%|██████████| 20/20 [00:21<00:00,  1.07s/it]

Additional information

No response

continue-revolution commented 9 months ago

I myself don't know the best config for i2v. You may want to adjust latent power and latent scale, retry, and find the best config. For example, with power==1 and scale==16 you may get something, but the last ~4 frames will be something unrelated; with power==0.5 and scale==5 you may get what you want, but I'm not sure.
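
For reference, a minimal sketch of what these two knobs might do, assuming the per-frame weight of the init latent decays roughly as `alpha(i) = max(0, 1 - i**power / scale)` (an illustrative form, not the extension's exact code):

```python
# Hypothetical illustration of "latent power" / "latent scale" -- the
# decay formula below is an assumption, not the extension's exact code.
def init_alpha(frame_idx: int, power: float, scale: float) -> float:
    """Assumed weight of the init-image latent for a given frame."""
    return max(0.0, 1.0 - frame_idx ** power / scale)

for power, scale in [(1.0, 16.0), (0.5, 5.0)]:
    weights = [round(init_alpha(i, power, scale), 2) for i in range(16)]
    print(f"power={power}, scale={scale}: {weights}")
```

Under this assumption, power==1 / scale==16 decays to ~0 over a 16-frame clip (which would explain the last few frames drifting off), while power==0.5 / scale==5 drops fast early but keeps a small nonzero weight to the end.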

Mozoloa commented 9 months ago

I've tried many combinations of settings, but nothing seems to help. I'm surprised the base settings don't give at least a semi-decent result, and fiddling doesn't seem to improve things: 00020-3406767663 00021-3406767663 00022-3406767663 00023-3406767663 00024-3406767663

Mozoloa commented 9 months ago

Actually, it looks like even txt2img is non-functional now, on commit e06e2edeece3582a7a82063a6caa66d83d4ba557. It does the same thing, plus the generation is very dense and weird in terms of details:

https://github.com/continue-revolution/sd-webui-animatediff/assets/20817233/21259642-0dc9-47be-b6e3-76af92ae3e2f

a beautiful woman
Steps: 20, Sampler: DPM++ 2S a Karras, CFG scale: 7, Seed: 1929438715, Size: 512x512, Model hash: 15012c538f, Model: realisticVisionV51v51VAE, Version: 1.6.0

oliverban commented 9 months ago

I can confirm I have the same strange results in img2img, and "Last Frame" does not work at all, so I don't know what is going on there.

But txt2img works fine; I just tried it and nothing has changed in this regard. But yeah, img2img doesn't seem to actually work in its current form!

continue-revolution commented 9 months ago
  1. If you cannot do txt2img, then unfortunately I cannot do much about it.
  2. If you can do txt2gif but cannot do img2gif, you may want to write better prompts and reconfigure the power and scale. The "default" power and scale are just something I posted at random. I don't have a "guaranteed" setting, and you will have to do some math and experiments.

This is something I get. No matter how many times I generate, the result is at least not as bad as yours. 00031-3123842855

Also, you should understand that this feature was originally developed in a forked repo, not by me and not by the official AnimateDiff team. So really no one can guarantee that this method works in every case. Try it on your own and figure out the best settings.

I personally do not generate anything involving real humans.

rkfg commented 9 months ago

The common thing in all the OP's generations is 8 frames. It's not the default, which is 16, so I'd suggest trying 16 frames. The models were trained on this number of frames; going below it can cause various artifacts.

Also, to compare the performance and see if it's something on your end, can you share the exact image you use as well as all the parameters you specify, including seed, CFG, steps, etc.? Just screenshot the entire page.

oliverban commented 9 months ago

> The common thing in all the OP's generations is 8 frames. It's not the default, which is 16, so I'd suggest trying 16 frames. The models were trained on this number of frames; going below it can cause various artifacts.
>
> Also, to compare the performance and see if it's something on your end, can you share the exact image you use as well as all the parameters you specify, including seed, CFG, steps, etc.? Just screenshot the entire page.

The second one says "exact same thing but 16 frames (2s)", so I guess the same thing happens to the OP there.

OT: Do you have token merging on? I have seen that if token merging is > 0 it can make things worse / cause artifacts, etc.

Mozoloa commented 9 months ago

>> The common thing in all the OP's generations is 8 frames. It's not the default, which is 16, so I'd suggest trying 16 frames. The models were trained on this number of frames; going below it can cause various artifacts. Also, to compare the performance and see if it's something on your end, can you share the exact image you use as well as all the parameters you specify, including seed, CFG, steps, etc.? Just screenshot the entire page.
>
> The second one says "exact same thing but 16 frames (2s)", so I guess the same thing happens to the OP there.
>
> OT: Do you have token merging on? I have seen that if token merging is > 0 it can make things worse / cause artifacts, etc.

No idea what that is, so I probably never touched it. Recently I had great results with 16 frames, but img2img results still show pretty low movement!

oliverban commented 9 months ago
>   1. If you cannot do txt2img, then unfortunately I cannot do much about it.
>   2. If you can do txt2gif but cannot do img2gif, you may want to write better prompts and reconfigure the power and scale. The "default" power and scale are just something I posted at random. I don't have a "guaranteed" setting, and you will have to do some math and experiments.
>
> This is something I get. No matter how many times I generate, the result is at least not as bad as yours. 00031-3123842855
>
> Also, you should understand that this feature was originally developed in a forked repo, not by me and not by the official AnimateDiff team. So really no one can guarantee that this method works in every case. Try it on your own and figure out the best settings.
>
> I personally do not generate anything involving real humans.

I'd love to do "some math" but what is the formula? ;)

>> The common thing in all the OP's generations is 8 frames. It's not the default, which is 16, so I'd suggest trying 16 frames. The models were trained on this number of frames; going below it can cause various artifacts. Also, to compare the performance and see if it's something on your end, can you share the exact image you use as well as all the parameters you specify, including seed, CFG, steps, etc.? Just screenshot the entire page.
>>
>> The second one says "exact same thing but 16 frames (2s)", so I guess the same thing happens to the OP there. OT: Do you have token merging on? I have seen that if token merging is > 0 it can make things worse / cause artifacts, etc.
>
> No idea what that is, so I probably never touched it. Recently I had great results with 16 frames, but img2img results still show pretty low movement!

It's in Settings, under Optimization. Yeah, it also seems img2img just changes the input too much if you want anything to move!

Mozoloa commented 9 months ago

I just opened the webui, generated a random image, then sent it straight to img2img and enabled AnimateDiff.

Input image

Output 00001-3503380599

Beautiful portrait of a beautiful young blonde girl in a park, photography analog <lora:epiCRealLife:1>, zoom out, dolly shot
Negative prompt: doll, uncanny valley, 3D, cgi, painting, drawing, asian
Steps: 20, Sampler: DPM++ 2M, CFG scale: 7, Seed: 3503380599, Face restoration: CodeFormer, Size: 512x640, Model hash: e48ca7f826, Model: epicphotogasm_v1, Denoising strength: 0.75, Lora hashes: "epiCRealLife: 1c049585eb7b", Version: 1.6.0

I tried adding some camera movement to the prompt, thinking it might help, but the image is still quite static.

oliverban commented 9 months ago

> I just opened the webui, generated a random image, then sent it straight to img2img and enabled AnimateDiff.
>
> Input image
>
> Output 00001-3503380599
>
> Beautiful portrait of a beautiful young blonde girl in a park, photography analog <lora:epiCRealLife:1>, zoom out, dolly shot
> Negative prompt: doll, uncanny valley, 3D, cgi, painting, drawing, asian
> Steps: 20, Sampler: DPM++ 2M, CFG scale: 7, Seed: 3503380599, Face restoration: CodeFormer, Size: 512x640, Model hash: e48ca7f826, Model: epicphotogasm_v1, Denoising strength: 0.75, Lora hashes: "epiCRealLife: 1c049585eb7b", Version: 1.6.0
>
> I tried adding some camera movement to the prompt, thinking it might help, but the image is still quite static.

Interesting, I tried the same and literally got nothing; it looks like crap, and this is true for every type of image I send there (tried with anime here just because that was the example).

brave_toOQ7F8KBR

continue-revolution commented 9 months ago

@oliverban https://github.com/continue-revolution/sd-webui-animatediff#img2gif

Mozoloa commented 9 months ago

I'm guessing you need to find a sweet spot of denoising.

From 0.8 to 1: 00006-3503380599 00005-3503380599 00003-3503380599

continue-revolution commented 9 months ago

You can give me your init image and prompts if you want to, but you may have to wait a couple of days before I try, because I'm updating something else and I have tons of real-world stuff.

oliverban commented 9 months ago

> @oliverban https://github.com/continue-revolution/sd-webui-animatediff#img2gif

Thanks, I know this already. I found out what was going on.

00028-4244312984

FIX: Disable "Always discard next-to-last sigma" and raise your "Noise multiplier for img2img" to 1.0!! This was causing my issues!!!
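
(Why that second setting matters, roughly: img2img starts sampling from the init latent plus noise scaled by the first sigma, and A1111 additionally scales that noise by this setting. A minimal sketch under that assumption; `initial_img2img_latent` is a made-up helper, not A1111's actual code:)

```python
import torch

# Sketch of the assumed effect of "Noise multiplier for img2img":
# a multiplier below 1.0 under-noises the starting latent, so the
# sampler (and the AnimateDiff motion module) sees less noise than
# the sigma schedule says it should denoise at the first step.
def initial_img2img_latent(init_latent: torch.Tensor,
                           first_sigma: float,
                           noise_multiplier: float = 1.0) -> torch.Tensor:
    noise = torch.randn_like(init_latent)
    return init_latent + noise * first_sigma * noise_multiplier
```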

xyzDist commented 9 months ago

> I'm guessing you need to find a sweet spot of denoising.
>
> From 0.8 to 1: 00006-3503380599 00005-3503380599 00003-3503380599

Hey, just want to reply to say thank you for your test! I had the same issue, with img2img AnimateDiff just not working; now I realize we need the denoise to be a high value. But so high that it's no longer the same as my init image... that's the issue. At least it is working now; I got it to move as expected!

LIQUIDMIND111 commented 9 months ago
>   1. If you cannot do txt2img, then unfortunately I cannot do much about it.
>   2. If you can do txt2gif but cannot do img2gif, you may want to write better prompts and reconfigure the power and scale. The "default" power and scale are just something I posted at random. I don't have a "guaranteed" setting, and you will have to do some math and experiments.
>
> This is something I get. No matter how many times I generate, the result is at least not as bad as yours. 00031-3123842855
>
> Also, you should understand that this feature was originally developed in a forked repo, not by me and not by the official AnimateDiff team. So really no one can guarantee that this method works in every case. Try it on your own and figure out the best settings.
>
> I personally do not generate anything involving real humans.

https://youtube.com/shorts/NLt6HqUgBFM

Works well for me with the 1.9.0 update (NOT updating to 1.9.1 yet), but I just wanted to know: how can we avoid these watermarks from Shutterstock? Even on the v2 model I get them. See the video: https://youtube.com/shorts/NLt6HqUgBFM

LIQUIDMIND111 commented 9 months ago

https://youtube.com/shorts/NLt6HqUgBFM

continue-revolution commented 9 months ago

@LIQUIDMIND111 This is because the dataset they used has this problem. It is the model's problem.

LIQUIDMIND111 commented 9 months ago

> @LIQUIDMIND111 This is because the dataset they used has this problem. It is the model's problem.

Besides that, did you like the video overall? It looks good with this model (epicphotogasm), but with CUSTOM models of mine it shows bad errors, SAME parameters and all...

LIQUIDMIND111 commented 9 months ago

> @LIQUIDMIND111 This is because the dataset they used has this problem. It is the model's problem.

Thanks for being attentive to us... we appreciate that. Where is your donation link?

continue-revolution commented 9 months ago

@LIQUIDMIND111 Since it is trained with some specific checkpoints, it might not work for all checkpoints. That said, Anything V5 Ink is not something they officially demoed, but it worked surprisingly well with reasonable prompt engineering. So I believe you probably need to do more prompt engineering before concluding that it is not working for a specific checkpoint.

You can donate via https://github.com/continue-revolution/sd-webui-animatediff#sponsor / https://www.patreon.com/conrevo / https://ko-fi.com/conrevo. I don't know if the PayPal QR code is working properly; if not, please let me know.

LIQUIDMIND111 commented 9 months ago

> @LIQUIDMIND111 Since it is trained with some specific checkpoints, it might not work for all checkpoints. That said, Anything V5 Ink is not something they officially demoed, but it worked surprisingly well with reasonable prompt engineering. So I believe you probably need to do more prompt engineering before concluding that it is not working for a specific checkpoint.
>
> You can donate via https://github.com/continue-revolution/sd-webui-animatediff#sponsor / https://www.patreon.com/conrevo / https://ko-fi.com/conrevo. I don't know if the PayPal QR code is working properly; if not, please let me know.

Thanks. Is there any other list of recommended working checkpoints?

continue-revolution commented 9 months ago

https://animatediff.github.io/ lists all the checkpoints they tested, but you can try more.

Stefanobeck commented 9 months ago

I have the same problem in img2img, do you have a solution?

genialgenteel commented 8 months ago

> Hey, just want to reply to say thank you for your test! I had the same issue, with img2img AnimateDiff just not working; now I realize we need the denoise to be a high value. But so high that it's no longer the same as my init image... that's the issue.

Is there any way this could be modified so that it works with lower denoising? This is kind of the problem with img2gif... It's not really img2gif if your first frame ends up totally different from your initial image because the denoising is 0.8 or 1... It may as well be a different image. Like in the realistic example above, the girl looks like a different person. There's always ReActor to fix that, of course, but... it still changes the original colors and composition of the initial image because the denoising blurs it too much.

If nothing can be done, oh well. But it would be nice if there were some way to make img2gif work with low denoising, so that the first frame and the output GIF look more like the image you uploaded in the img2img tab. 🙏🏽

muletmiles commented 8 months ago

Hmm, one thing you can do is change the denoising strength depending on the frame.

The following is the input image: monkGirl_big

And then I gradually raise the denoising strength to 1, then back to 0. The result:

00135-3134603515_orig

While this method allows the start and end frames to be exactly the input picture, there is a bit of temporal inconsistency (e.g., in the colors). From what I can tell, the model is not actually trying to make the middle frames look like the start and end frames. That would be nicer, but I don't know if it's possible without significant changes.

code change (VERY HACKY), in `mm_cfg_forward` in `animatediff_infv2v.py`:

```python
            [...]

            assert not is_edit_model or all(len(conds) == 1 for conds in conds_list), "AND is not supported for InstructPix2Pix checkpoint (unless using Image CFG scale = 1.0)"

            if self.mask_before_denoising and self.mask is not None:
                x = self.init_latent * self.mask + self.nmask * x

            # MILES CHANGE: per-frame denoising schedule. milesNumber[i] is a
            # threshold in [0, 1] for frame i: while the sampling progress
            # (current step / total steps) is below that threshold, frame i's
            # latent is reset to the init latent. A threshold of 1 pins the
            # frame to the input image for the whole run (effective denoise
            # ~0); a threshold of 0 leaves the frame fully free (~1).
            milesNumber = (1, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.25, 0.2,
                           0.15, 0.1, 0.09, 0.08, 0.07, 0.06, 0.05, 0.04,
                           0.03, 0.02, 0.01, 0, 0, 0, 0.1, 0.2, 0.3, 0.4,
                           0.5, 0.6, 0.7, 0.8, 0.9, 0, 0, 0, 0, 0, 0, 0)

            for i in range(x.shape[0]):
                if (self.step / state.sampling_steps) < milesNumber[i]:
                    x[i, :, :, :] = self.init_latent[i, :, :, :]
            # END MILES CHANGE

            batch_size = len(conds_list)
            repeats = [len(conds_list[i]) for i in range(batch_size)]

            [...]
```

continue-revolution commented 8 months ago

@muletmiles please submit a pull request along with your description. You are making a good point, and I need to pay attention to it. Your PR doesn't need to be working or neat or anything; I will make sure to investigate how I can improve. Thanks!

muletmiles commented 8 months ago

Thanks, will do tomorrow (1am local time here, going to sleep)!


1manfactory commented 2 months ago

> FIX: Disable "Always discard next-to-last sigma" and raise your "Noise multiplier for img2img" to 1.0!! This was causing my issues!!!

Had the same issue. Setting "Noise multiplier for img2img" to 1 solved it for me. Thanks.