Getting different results

If the reference image and the target video show the same person, run the following script:

# Same person in the reference image and the target video: no retargeting needed.
# (Comments must sit outside the command; text after a trailing "\" breaks the line continuation.)
python inference.py \
    --reference_image_path "./test_samples/emo/talk_emotion/ref.jpg" \
    --audio_path "./test_samples/emo/talk_emotion/aud.mp3" \
    --kps_path "./test_samples/emo/talk_emotion/kps.pth" \
    --output_path "./output/test/talk_emotion_aud_result_0.95_2.0.mp4" \
    --retarget_strategy "no_retarget" \
    --reference_attention_weight 0.95 \
    --audio_attention_weight 2.0 \
    --num_inference_steps 25

https://github.com/tencent-ailab/V-Express/assets/19601425/be7d869f-08b1-46b8-b389-b5235ac7221e

If you want to use another person's video as the target, choose a video that is close to the reference image, for example talk_emotion's reference image with talk_hb's target video. The audio still comes from talk_emotion. Then run the following script:

# Different person in the target video: --kps_path points to talk_hb, and retargeting is needed.
# (Comments must sit outside the command; text after a trailing "\" breaks the line continuation.)
python inference.py \
    --reference_image_path "./test_samples/emo/talk_emotion/ref.jpg" \
    --audio_path "./test_samples/emo/talk_emotion/aud.mp3" \
    --kps_path "./test_samples/emo/talk_hb/kps.pth" \
    --output_path "./output/test/talk_emotion_aud_hb_result_0.95_2.0.mp4" \
    --retarget_strategy "naive_retarget" \
    --reference_attention_weight 0.95 \
    --audio_attention_weight 2.0 \
    --num_inference_steps 25

https://github.com/tencent-ailab/V-Express/assets/19601425/6377cd28-acc0-4eff-95ec-edc761702680
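
If you switch between these two cases often, a small wrapper can pick the retarget strategy for you. The sketch below is only an illustration: `build_command` and `SAMPLES_ROOT` are hypothetical names that are not part of V-Express, and it assumes the `test_samples/emo/<sample>/` layout and the flags used in the two commands above. It only builds the argument list; it does not run inference.

```python
import shlex

# Assumed sample layout, taken from the paths in the two commands above.
SAMPLES_ROOT = "./test_samples/emo"


def build_command(ref_sample, kps_sample, output_path,
                  reference_attention_weight=0.95,
                  audio_attention_weight=2.0,
                  num_inference_steps=25):
    """Build the inference.py argument list (hypothetical helper, not part of the repo)."""
    # Keypoints from the same sample as the reference image: no retargeting.
    # Keypoints from another sample: use naive retargeting, as in the second command.
    strategy = "no_retarget" if ref_sample == kps_sample else "naive_retarget"
    return [
        "python", "inference.py",
        "--reference_image_path", f"{SAMPLES_ROOT}/{ref_sample}/ref.jpg",
        "--audio_path", f"{SAMPLES_ROOT}/{ref_sample}/aud.mp3",
        "--kps_path", f"{SAMPLES_ROOT}/{kps_sample}/kps.pth",
        "--output_path", output_path,
        "--retarget_strategy", strategy,
        "--reference_attention_weight", str(reference_attention_weight),
        "--audio_attention_weight", str(audio_attention_weight),
        "--num_inference_steps", str(num_inference_steps),
    ]


if __name__ == "__main__":
    cmd = build_command("talk_emotion", "talk_hb",
                        "./output/test/talk_emotion_aud_hb_result_0.95_2.0.mp4")
    # Print a shell-safe version of the command for inspection.
    print(shlex.join(cmd))
```

To actually run it from the repository root, you could pass the list to `subprocess.run(cmd, check=True)`.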

