ali-vilab / UniAnimate

Code for Paper "UniAnimate: Taming Unified Video Diffusion Models for Consistent Human Image Animation".
https://unianimate.github.io/

add noise scheduler #40

Open ak01user opened 6 days ago

ak01user commented 6 days ago

Hi, I am trying to finetune the model on my own dataset, but I am not very familiar with training diffusion models. Where can I find the method/scheduler that adds noise in the code? I ask because I noticed that the noise predicted by the denoising UNet goes through some complex calculations to obtain the predicted VAE features of the video frames.
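For context, I assume the calculation I mean is the standard DDPM relation that recovers the predicted clean latent from the predicted noise, roughly like this (tensor names and shapes are just my guess, not the repo's actual code):

```python
import torch

def pred_x0_from_eps(x_t, t, eps, alphas_cumprod):
    """Recover the predicted clean latent x0 from the predicted noise eps,
    using the DDPM relation x_t = sqrt(a_t) * x0 + sqrt(1 - a_t) * eps."""
    a_t = alphas_cumprod[t].view(-1, 1, 1, 1, 1)   # broadcast over [B, T, C, H, W]
    return (x_t - torch.sqrt(1.0 - a_t) * eps) / torch.sqrt(a_t)
```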

wangxiang1230 commented 6 days ago

Hi, thanks for your attention. You can refer to this loss function (https://github.com/ali-vilab/UniAnimate/blob/549ee5fad7618500790929b0ae73151d36649045/tools/modules/diffusions/diffusion_ddim.py#L381) for more details.
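In case it helps, the general pattern inside such a loss is the standard epsilon-prediction objective: noise the clean latents, run the UNet, and regress the injected noise. A minimal sketch of the idea (argument names are illustrative, not the exact signature in diffusion_ddim.py):

```python
import torch
import torch.nn.functional as F

def diffusion_mse_loss(x0, t, unet, alphas_cumprod, **model_kwargs):
    """Sketch of an epsilon-prediction MSE loss for video latents x0 [B, T, C, H, W]."""
    noise = torch.randn_like(x0)
    a_t = alphas_cumprod[t].view(-1, 1, 1, 1, 1)                  # [B, 1, 1, 1, 1]
    x_t = torch.sqrt(a_t) * x0 + torch.sqrt(1.0 - a_t) * noise    # forward (noising) process
    pred_noise = unet(x_t, t, **model_kwargs)                     # denoising UNet prediction
    return F.mse_loss(pred_noise, noise)
```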

ak01user commented 5 days ago

Hi, the model_kwargs look like the model_kwargs used at inference. Can you tell me what x0 is? Is it the VAE-encoded features of the ground-truth frames, e.g. with shape [1, seq, 4, 96, 64]?

wangxiang1230 commented 4 days ago

Hi, sorry for the late reply. x0 is the original clean video VAE latents, and t is the diffusion timestep.
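Roughly, building x0 and t for training looks like the sketch below. This is only an illustration: I am assuming a diffusers-style AutoencoderKL interface and the usual SD latent scaling here, the actual VAE code in the repo differs.

```python
import torch

def make_training_inputs(frames, vae, num_timesteps=1000):
    """Sketch: build x0 (clean video latents) and a random timestep t.
    `frames` is assumed to be [B, T, 3, H, W] in [-1, 1]."""
    b, f, c, h, w = frames.shape
    with torch.no_grad():
        latents = vae.encode(frames.flatten(0, 1)).latent_dist.sample() * 0.18215
    x0 = latents.view(b, f, 4, h // 8, w // 8)   # e.g. [1, seq, 4, 96, 64] for 768x512 frames
    t = torch.randint(0, num_timesteps, (b,), device=frames.device)
    return x0, t
```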

ak01user commented 4 days ago

It doesn't matter, you are very kind. The good news is that I did fine-tune on my own dataset: one person's dancing video with 340 frames. I am only training the parameters of the two blocks ['local_image_embedding', 'local_image_embedding_after']. Looking forward to good results, thanks very much.
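Concretely, what I do to restrict training is roughly the following (the variable names, module filter, and learning rate are just my own setup, not something from the repo):

```python
import torch

# Freeze everything, then re-enable gradients only for the two target blocks.
trainable_keys = ['local_image_embedding', 'local_image_embedding_after']

for name, param in unet.named_parameters():
    param.requires_grad = any(key in name for key in trainable_keys)

optimizer = torch.optim.AdamW(
    [p for p in unet.parameters() if p.requires_grad], lr=1e-5)
```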

ak01user commented 19 hours ago

Hi, I tried to fine-tune on a specific person's data, but the result was not very good. How should I modify the training? I notice that loss_type is 'mse' and var_type is 'fixed_small'; is that right?