zhanghongyong123456 closed this issue 10 months ago
It seems like a face detection problem. I'm guessing you also need to set the flag --is_voxceleb2=False.
Please make sure that the other flags are consistent with the inference script.
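One way to check that two invocations use consistent flags is to parse both command lines and list the flags that differ. The sketch below is only an illustration: `parse_flags` and `diff_flags` are hypothetical helper names, not part of diff2lip, and the contents of the inference script are not shown in this thread, so you would paste in its command yourself.

```python
import shlex


def parse_flags(command):
    """Collect '--flag=value' and '--flag value' pairs from a command line."""
    # posix=False keeps backslashes intact, which matters for Windows paths.
    tokens = shlex.split(command, posix=False)
    flags = {}
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        if tok.startswith("--"):
            if "=" in tok:
                name, value = tok[2:].split("=", 1)
            elif i + 1 < len(tokens) and not tokens[i + 1].startswith("--"):
                name, value = tok[2:], tokens[i + 1]
                i += 1  # skip the value token we just consumed
            else:
                name, value = tok[2:], ""  # bare flag with no value
            flags[name] = value
        i += 1
    return flags


def diff_flags(mine, reference):
    """Return {flag: (my_value, reference_value)} for every flag that differs."""
    a, b = parse_flags(mine), parse_flags(reference)
    return {
        name: (a.get(name), b.get(name))
        for name in sorted(set(a) | set(b))
        if a.get(name) != b.get(name)
    }
```

Calling `diff_flags(my_command, script_command)` returns an empty dict when the two commands agree; a value of `None` on one side means the flag is missing from that command.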
When I add the flag --is_voxceleb2=False, I get this output:
https://github.com/soumik-kanad/diff2lip/assets/48466610/40837195-049d-4290-85e2-4e0a4813010d
This seems better.
(If your input video (face or body) was moving, you can also try the flags --sampling_input_type=gt --sampling_ref_type=gt to get similar movement/expression. With the flag values you are currently using, only the first frame of the video is manipulated.)
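As a rough sketch, the suggestions in this thread can be folded into a small helper that assembles the argument list for generate.py. The function name `build_generate_args` and its `moving_input` parameter are hypothetical; every flag and value comes from the command posted in this issue plus the two suggested changes (--is_voxceleb2=False, and gt sampling for moving input). The helper only builds the list and does not run anything.

```python
def build_generate_args(video_path, audio_path, out_path, moving_input=False):
    """Assemble the generate.py arguments discussed in this thread.

    moving_input=True switches both sampling flags from first_frame to gt,
    as suggested for videos where the face or body moves.
    """
    sampling = "gt" if moving_input else "first_frame"
    return [
        "python", "generate.py",
        "--generate_from_filelist=0",
        f"--video_path={video_path}",
        f"--audio_path={audio_path}",
        f"--out_path={out_path}",
        "--attention_resolutions", "32,16,8",
        "--learn_sigma", "True",
        "--num_head_channels", "64",
        "--resblock_updown", "True",
        "--use_scale_shift_norm", "False",
        f"--sampling_input_type={sampling}",
        f"--sampling_ref_type={sampling}",
        "--timestep_respacing", "ddim25",
        "--use_ddim", "True",
        "--sample_path=output_dir",
        "--nframes", "5",
        "--nrefer", "1",
        "--use_ref=True",
        "--use_audio=True",
        "--audio_as_style=True",
        "--save_orig=False",
        "--image_size", "128",
        "--is_voxceleb2=False",  # suggested fix for the face detection problem
    ]
```

The resulting list can be passed to `subprocess.run(args, check=True)` from the repository root; passing a list rather than a single string avoids shell quoting problems with Windows paths.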
My command:
python generate.py --generate_from_filelist=0 --video_path=path\to*.mp4 --audio_path=path\to*.mp3 --out_path=path\to*.mp4 --attention_resolutions 32,16,8 --learn_sigma True --num_head_channels 64 --resblock_updown True --use_scale_shift_norm False --sampling_input_type=first_frame --sampling_ref_type=first_frame --timestep_respacing ddim25 --use_ddim True --sample_path=output_dir --nframes 5 --nrefer 1 --use_ref=True --use_audio=True --audio_as_style=True --save_orig=False --image_size 128
Input video:
Result video:
What is the reason for this?