tencent-ailab / V-Express

V-Express aims to generate a talking head video under the control of a reference image, an audio, and a sequence of V-Kps images.
2.26k stars, 283 forks

Some problems and conclusions - Eye movements / inconsistencies / quality #31

Open A-2-H opened 5 months ago

A-2-H commented 5 months ago

First of all, great work!

Here is what I observed while testing the program:

oisilener1982 commented 5 months ago

I tried this tutorial, https://www.youtube.com/watch?v=ttEOIg9j2B4&t=327s, and everything works fine, but the quality of the output is really bad. SadTalker is way better than V-Express. I hope Tencent can create something better than this.

I'm here because I am looking for an alternative to SadTalker, since its devs might have abandoned it. It's just sad that this newer AI by Tencent has output quality issues.

johndpope commented 5 months ago

Regarding "eyeballs doesn't move": I think this logic from VASA solves the problem: https://github.com/johndpope/VASA-1-hack/blob/main/FaceHelper.py#L155

zhangjun001 commented 5 months ago

@oisilener1982 I suspect there are some errors in your settings. If you provide a front-facing photo and the face ratio meets the requirements, the results should not be low quality. In addition, if you provide a frontal video as the V-Kps reference sequence, the results will be more stable.

https://github.com/tencent-ailab/V-Express/assets/12435654/89437520-f422-47ac-b1bb-dd18e6297d34