Mozoloa opened this issue 9 months ago
I myself don't know the best config for i2v. You may want to adjust latent power and latent scale, retry, and find the best config. For example, with power==1 and scale==16 you may get something, but the last ~4 frames will be something unrelated; with power==0.5 and scale==5 you may get something you want, but I'm not sure.
I've tried many combinations of settings, but this doesn't seem to help. I'm surprised the base settings don't give at least a semi-decent result, and fiddling doesn't seem to improve things.
Actually, it looks like even txt2img is non-functional now, on commit e06e2edeece3582a7a82063a6caa66d83d4ba557. It does the same thing, plus the generation is very dense and weird in terms of details:
a beautiful woman
Steps: 20, Sampler: DPM++ 2S a Karras, CFG scale: 7, Seed: 1929438715, Size: 512x512, Model hash: 15012c538f, Model: realisticVisionV51v51VAE, Version: 1.6.0
I can confirm I have the same strange results in img2img, and the "Last Frame" does not work at all, so I don't know what is going on there.
But txt2img works fine; I just tried it and nothing has changed in this regard. But yeah, img2img doesn't seem to actually work in its current form!
This is what I get. No matter how many times I generate, the results are at least not as bad as yours.
Also, you should understand that this feature was originally developed by a forked repo, not by me and not by the official AnimateDiff. So really, no one can guarantee that this method will work in every case. Experiment on your own and figure out the best settings.
I personally do not run anything that generates real humans.
The common thing in all the OP's generations is 8 frames. That's not the default, which is 16, so I'd suggest trying 16 frames. The models were trained on this number of frames; going below it can cause various artifacts.
Also, to compare results and see if it's something on your end, can you share the exact image you use as well as all the parameters you specify (seed, CFG, steps, etc.)? Just screenshot the entire page.
The second one says "exact same thing but 16 frames (2s)", so I guess OP gets the same thing there too.
OT: Do you have token merging on? I have seen that if token merging is > 0 it will make it worse / cause artifacts, etc.
No idea what that is, so I probably never touched it. Recently I had great results with 16 frames, but img2img results still have pretty low movement!
- If you cannot do txt2img, then unfortunately I cannot do much about it.
- If you can do txt2gif but cannot do img2gif, you may want to write better prompts and re-configure the power and scale. The "default" power and scale is just something I posted at random. I don't have a "guaranteed" setting, and you will have to do some math and experiments.
I'd love to do "some math" but what is the formula? ;)
> No idea what that is, so I probably never touched it. Recently I had great results with 16 frames, but img2img results still have pretty low movement!
It's in Settings under Optimization. Yeah, it also seems img2img just changes the input too much if you want anything to move!
I just opened the webui, generated a random image, then sent it straight to img2img and enabled AnimateDiff.
Input
Output
Beautiful portrait of a beautiful young blonde girl in a park, photography analog <lora:epiCRealLife:1>, zoom out, dolly shot
Negative prompt: doll, uncanny valley, 3D, cgi, painting, drawing, asian
Steps: 20, Sampler: DPM++ 2M, CFG scale: 7, Seed: 3503380599, Face restoration: CodeFormer, Size: 512x640, Model hash: e48ca7f826, Model: epicphotogasm_v1, Denoising strength: 0.75, Lora hashes: "epiCRealLife: 1c049585eb7b", Version: 1.6.0
I tried to add some camera movements in the prompt, thinking that maybe it'd help, but the image is quite still.
Interesting, I tried the same and literally got nothing; it looks like crap, and this is true for every type of thing I send there (tried with ANIME here just because that was the example).
I'm guessing you need to find a sweet spot of denoising.
From 0.8 to 1.
You can give me your init image and prompts if you want, but you may have to wait a couple of days before I try them, because I'm updating something else and I have tons of real-world stuff.
@oliverban https://github.com/continue-revolution/sd-webui-animatediff#img2gif
Thanks, I know this already. I found out what was going on.
FIX: Disable "Always discard next-to-last sigma" and raise your "Noise multiplier for img2img" to 1.0!! This was causing my issues!!!
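For anyone who wants to apply this fix outside the UI: as far as I can tell, these two toggles live in the webui's `config.json`. A minimal sketch, assuming the option keys are `always_discard_next_to_last_sigma` and `initial_noise_multiplier` (written from memory, so verify against your own `config.json` before running):

```python
# Patch the two settings in stable-diffusion-webui's config.json.
# Key names are assumptions -- double-check them in your own config.json.
import json

with open("config.json") as f:
    cfg = json.load(f)

cfg["always_discard_next_to_last_sigma"] = False  # disable the sigma discard
cfg["initial_noise_multiplier"] = 1.0             # "Noise multiplier for img2img"

with open("config.json", "w") as f:
    json.dump(cfg, f, indent=4)
```

Run it with the webui stopped, since the webui overwrites `config.json` on shutdown.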
> I'm guessing you need to find a sweet spot of denoising.
Hey, just want to say thank you for your test! I had the same issue with img2img AnimateDiff just not working. Now I realize the denoise needs to be a high value, but then it's no longer the same as my init image... that's the issue. But at least it's working now; I got it to move as expected!
Works well for me with the 1.9.0 update (NOT updating to 1.9.1 yet), but I just wanted to know: how can we avoid these watermarks from Shutterstock? Even on the v2 model I get them, see the video: https://youtube.com/shorts/NLt6HqUgBFM
@LIQUIDMIND111 This is because the dataset they used has this problem. It is the model's problem.
Besides that, did you like the video overall? It looks good with this model (epicphotogasm), but with CUSTOM models of mine it shows bad errors, SAME parameters and all...
Thanks for being attentive to us... we appreciate that. Where is your donate link?
@LIQUIDMIND111 Since it is trained with some specific checkpoints, it might not work for all checkpoints. That said, Anything V5 Ink is not something they officially demoed, but it worked surprisingly well with reasonable prompt engineering. So I believe you probably need to do more prompt engineering before you conclude that it is not working for a specific checkpoint.
You can donate via https://github.com/continue-revolution/sd-webui-animatediff#sponsor / https://www.patreon.com/conrevo / https://ko-fi.com/conrevo. I don't know if that PayPal QR code is working properly. If not, please let me know.
Thanks, is there any other list of recommended working checkpoints?
https://animatediff.github.io/ — this page lists all the checkpoints they tested, but you can try more.
I have the same problem in img2img, do you have a solution?
> I had the same issue with img2img AnimateDiff just not working. Now I realize the denoise needs to be a high value, but then it's no longer the same as my init image... that's the issue.
Is there any way this could be modified so that it works with lower denoising? This is kind of the problem with img2GIF... It's not really img2GIF if your first frame ends up totally different from your initial image because the denoising is 0.8 or 1... It may as well be a different image. Like in the realistic example above, the girl looks like a different person. There's always ReActor to fix that, of course, but... it still changes the original colors and composition of the initial image because the denoising blurs it too much.
If nothing can be done, oh well. But it would be nice if there were some way to make img2GIF work with low denoising, so that the first frame and output GIF look more like the image you uploaded on the img2img tab. 🙏🏽
Hmm, one thing you can do is change the denoising strength depending on the frame.
The following is the input image:
And then I gradually raise the denoising strength to 1, then back to 0. The result:
While this method allows the start and end frames to be exactly the input picture, there is a bit of temporal inconsistency (e.g., in the colors). From what I can tell, the model is not actually trying to make the middle frames look like the start and end frames. That would be nicer, but I don't know if it's possible without significant changes.
Code change (VERY HACKY), in `mm_cfg_forward` in `animatediff_infv2v.py`:

```python
[...]
assert not is_edit_model or all(len(conds) == 1 for conds in conds_list), "AND is not supported for InstructPix2Pix checkpoint (unless using Image CFG scale = 1.0)"

if self.mask_before_denoising and self.mask is not None:
    x = self.init_latent * self.mask + self.nmask * x

# MILES CHANGE: per-frame denoise schedule. Each entry is the fraction of
# sampling steps during which that frame's latent stays pinned to the init
# latent: 1 keeps the frame identical to the input image, 0 lets it denoise
# freely from the first step.
milesNumber = (1, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.25, 0.2, 0.15, 0.1,
               0.09, 0.08, 0.07, 0.06, 0.05, 0.04, 0.03, 0.02, 0.01,
               0, 0, 0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9,
               0, 0, 0, 0, 0, 0, 0)
for i in range(x.shape[0]):  # one latent per frame
    # Until this frame's scheduled fraction of steps has passed,
    # overwrite its latent with the init latent from the input image.
    if (self.step / state.sampling_steps) < milesNumber[i]:
        x[i, :, :, :] = self.init_latent[i, :, :, :]
# END MILES CHANGE

batch_size = len(conds_list)
repeats = [len(conds_list[i]) for i in range(batch_size)]
[...]
```
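If you want to experiment with other frame counts without hand-writing the tuple, something like the sketch below could generate the schedule. The `endpoint_schedule` helper is hypothetical (not part of the extension or the change above); it just produces one possible shape, pinning the first and last frames the longest and leaving the middle frames free, similar in spirit to the hardcoded tuple:

```python
# Hypothetical helper (not in the extension): build a V-shaped per-frame
# schedule for any frame count >= 2. Endpoint frames stay pinned to the
# init latent the longest; middle frames denoise freely.
def endpoint_schedule(num_frames: int, hold: float = 1.0) -> list[float]:
    half = (num_frames - 1) / 2
    return [hold * abs(i - half) / half for i in range(num_frames)]

print(endpoint_schedule(16))
# ≈ [1.0, 0.87, 0.73, 0.6, 0.47, 0.33, 0.2, 0.07, 0.07, 0.2, ..., 0.87, 1.0]
```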
@muletmiles please submit a pull request along with your description. You are making a good point, and I need to pay attention to it. Your PR doesn't need to work or be neat or anything; I will make sure to investigate how I can improve. Thanks!
Thanks, will do tomorrow (1am local time here, going to sleep)!
> FIX: Disable "Always discard next-to-last sigma" and raise your "Noise multiplier for img2img" to 1.0!! This was causing my issues!!!
Had the same issue. Setting "Noise multiplier for img2img" to 1 solved this problem. Thanks!
Is there an existing issue for this?
Have you read FAQ on README?
What happened?
When trying img2img, it either outputs very weird, artifacty results, or kinda normal results where nothing moves and everything becomes deep-fried at the very end.
A few examples
a wizard in a forest, portrait fantasy, Realistic Vision 5.1, 512x512, denoise 0.5, 8 frames (1s)
Both padding & batch cond/uncond are checked in the settings; here are the AnimateDiff settings:
![image](https://github.com/continue-revolution/sd-webui-animatediff/assets/20817233/bcd9bb08-317c-49c4-ad42-b659e9e8f9ee)
Steps to reproduce the problem
What should have happened?
A normal animation, the likes of Pika Labs or Runway Gen-2, like it actually does quite correctly in the txt2img tab.
Commit where the problem happens
webui: 1.6.0, extension: 3be7c2396193dd825a18034644871853f195714c
What browsers do you use to access the UI?
Brave
Command Line Arguments
Console logs
Additional information
No response