Can Your Current Implementation Synthesize Lip Movement Using Audio As Input?

astorfi / lip-reading-deeplearning

:unlock: Lip Reading - Cross Audio-Visual Recognition using 3D Architectures

Apache License 2.0

1.84k stars, 323 forks

Can Your Current Implementation Synthesize Lip Movement Using Audio As Input? #6

Closed (MXGray closed this issue 6 years ago)

MXGray commented 7 years ago

Thanks for your awesome work! Can your current implementation produce lip-movement video from audio input, i.e. audio to lip-movement video? Or would the code need to be modified to do this? Please advise. Thanks again! :)

astorfi commented 6 years ago

@MXGray Thanks for the kind words ... In principle, if the error were zero (which it is not!), it should be able to do it. But basically, the design is for matching purposes, not for generation. Using GANs (generative adversarial networks) would possibly handle your task better.

Best
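
To illustrate the distinction between matching and generation drawn in the reply above, here is a minimal sketch. It is not code from this repository. It assumes the audio and the lip video have each already been mapped to a fixed-length embedding vector, and the names `embedding_distance`, `is_match`, `best_matching_clip`, and the `threshold` value are all hypothetical. A matcher of this kind can score a pair or pick the closest existing clip, but it never produces new video frames.

```python
import math


def embedding_distance(audio_emb, visual_emb):
    """Euclidean distance between two equal-length embedding vectors."""
    if len(audio_emb) != len(visual_emb):
        raise ValueError("embeddings must have the same length")
    return math.sqrt(sum((a - v) ** 2 for a, v in zip(audio_emb, visual_emb)))


def is_match(audio_emb, visual_emb, threshold=0.5):
    """Decide whether an audio/visual pair belongs together.

    The threshold is an illustrative assumption, not a tuned value.
    """
    return embedding_distance(audio_emb, visual_emb) <= threshold


def best_matching_clip(audio_emb, candidate_visual_embs):
    """Return the index of the closest existing clip.

    This is retrieval over clips that already exist; nothing is synthesized.
    """
    distances = [embedding_distance(audio_emb, v) for v in candidate_visual_embs]
    return min(range(len(distances)), key=distances.__getitem__)
```

Under these assumptions, even a matcher with zero error could only select among lip-movement clips that already exist, which is why a generative model such as a GAN is suggested for producing new video.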