tencent-ailab / V-Express

V-Express aims to generate a talking-head video under the control of a reference image, an audio clip, and a sequence of V-Kps images.

Is the video mouth shape the same as the reference? #29

Open guoyilin opened 5 months ago

guoyilin commented 5 months ago

Great job. I tested the mouth shape: when I input a reference image with a smiling mouth, the output video keeps the smile while speaking (the mouth shape should change according to the audio). What's the reason? Is it a problem with the MEAD data (each video always has only one expression)?

zhangjun001 commented 5 months ago

If you set retarget_strategy to "no_retarget", it is highly recommended to use reference_attention_weight > 2.
```shell
python inference.py \
    --reference_image_path "./test_samples/short_case/AOC/ref.jpg" \
    --audio_path "./test_samples/short_case/AOC/v_exprss_intro_chattts.mp3" \
    --kps_path "./test_samples/short_case/AOC/AOC_raw_kps.pth" \
    --output_path "./output/short_case/talk_AOC_raw_kps_chattts_no_retarget.mp4" \
    --retarget_strategy "fix_face" \
    --num_inference_steps 25 \
    --reference_attention_weight 1.0 \
    --audio_attention_weight 3.0 \
    --save_gpu_memory
```

Generally, if a front-view reference and kps are used, naive_retarget works better. Note: audio_attention_weight should be set to 1.0 in that case.
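As a sketch of the naive_retarget variant described above: this is the same invocation as the example, with only the retarget strategy and audio_attention_weight changed per the comment. The "naive_retarget" flag value and the unchanged remaining weights are assumptions on my part, inferred from the example rather than confirmed by the maintainer.

```shell
# Hypothetical variant of the example command for the front-view case.
# Only --retarget_strategy and --audio_attention_weight differ; the
# output filename is adjusted to match, and other flags are assumed unchanged.
python inference.py \
    --reference_image_path "./test_samples/short_case/AOC/ref.jpg" \
    --audio_path "./test_samples/short_case/AOC/v_exprss_intro_chattts.mp3" \
    --kps_path "./test_samples/short_case/AOC/AOC_raw_kps.pth" \
    --output_path "./output/short_case/talk_AOC_raw_kps_chattts_naive_retarget.mp4" \
    --retarget_strategy "naive_retarget" \
    --num_inference_steps 25 \
    --reference_attention_weight 1.0 \
    --audio_attention_weight 1.0 \
    --save_gpu_memory
```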