Rudrabha / LipGAN

This repository contains the code for LipGAN. LipGAN was published as part of the paper titled "Towards Automatic Face-to-Face Translation".
http://cvit.iiit.ac.in/research/projects/cvit-projects/facetoface-translation
MIT License

CONTINUOUS PLAYING #33

Closed by leonvit 3 years ago

leonvit commented 3 years ago

I want to create a chatbot that uses text-to-speech and LipGAN for face animation. Is there a way to use LipGAN in real time to create something like a talking avatar driven by live text-to-speech? Any help would be valuable.

Rudrabha commented 3 years ago

I guess "real-time" is a subjective term, and what counts as real-time varies with the problem. For LipGAN to work, it needs at least a 350 ms audio window and an identity frame with the target head pose. So as long as you can continuously feed it a 350 ms audio window and a face with the target pose, the model is good to go in theory. Generating the output itself also takes some time. In your case, the main time costs can be avoided or reduced as follows:

  1. Skip face detection. You can provide an already-cropped face to the model.
  2. Load the weights only once, then continuously feed the audio chunks.
  3. Create some sort of data buffer into which you feed the audio and the face in batches, i.e. while one batch is being generated, you prepare the next one.

All these can reduce the time taken to generate the output significantly.
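
The three points above can be sketched roughly as follows. This is a minimal illustration and not code from the LipGAN repository: `load_model` is a hypothetical stand-in for loading the real weights, and the 16 kHz sample rate, the 25 fps hop, and the batch size are assumptions chosen only to make the example concrete. Only the 350 ms window comes from the discussion above.

```python
import queue
import threading

SAMPLE_RATE = 16000                        # assumption: 16 kHz mono audio
WINDOW_MS = 350                            # minimum audio context mentioned above
WINDOW = SAMPLE_RATE * WINDOW_MS // 1000   # 5600 samples per window
HOP = SAMPLE_RATE // 25                    # assumption: one window per frame at 25 fps


def audio_windows(samples, window=WINDOW, hop=HOP):
    """Yield overlapping fixed-length windows over a sequence of samples."""
    for start in range(0, len(samples) - window + 1, hop):
        yield samples[start:start + window]


def batches(items, batch_size):
    """Group an iterable into lists of at most batch_size items."""
    batch = []
    for item in items:
        batch.append(item)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch


def load_model():
    """Hypothetical stand-in for loading the LipGAN weights."""
    def predict(face, audio_batch):
        # Placeholder: a real model would return one generated frame per window.
        return [(face, len(window)) for window in audio_batch]
    return predict


def run_pipeline(samples, face, batch_size=8, max_pending=2):
    """Prepare batches in a background thread while the model consumes them."""
    model = load_model()                   # point 2: load once, outside the loop
    pending = queue.Queue(maxsize=max_pending)
    done = object()                        # sentinel marking the end of the stream

    def producer():
        # Point 3: the next batch is built while the current one is generated.
        for batch in batches(audio_windows(samples), batch_size):
            pending.put(batch)             # blocks while the buffer is full
        pending.put(done)

    threading.Thread(target=producer, daemon=True).start()

    frames = []
    while True:
        batch = pending.get()
        if batch is done:
            break
        # Point 1: `face` is already cropped, so no face detection runs here.
        frames.extend(model(face, batch))
    return frames
```

Calling `run_pipeline(samples, face)` with a list of audio samples and a pre-cropped face returns one placeholder frame per audio window. In a live setting the producer would read from a microphone or text-to-speech stream instead of a finished list, and the bounded queue keeps it from running far ahead of the model.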

leonvit commented 3 years ago

@Rudrabha thank you very much for your response in such a short time.

chikiuso commented 3 years ago

@Leonthegamer12 I built one, feel free to have a look: https://www.twitch.tv/aipictures

leonvit commented 3 years ago

@chikiuso Sorry, but is there a way you could share the code with me? I'm working on a very similar project and it would be very helpful.

Rudrabha commented 3 years ago

@chikiuso that is really cool :) Does it use LipGAN?

leonvit commented 3 years ago

@chikiuso of course, and your source code would be very helpful.

chikiuso commented 3 years ago

@Rudrabha Thanks, and thank you for your great work! I did use LipGAN for this Twitch channel. I will cite your project on my website, ai.pictures, as the Twitch panel is too small to show the paper name :)

@Leonthegamer12 The Twitch channel code combines several recent major advances in NLP, voice synthesis, LipGAN, etc. Right now the code is quite a mess. I may open it up after I tidy it and write some documentation, as installing and setting up the server is currently far too complicated.

Rudrabha commented 3 years ago

@chikiuso Thanks!

leonvit commented 3 years ago

I don't want the whole project. All I want is the part of the code that makes LipGAN work in real time, as I'm having some problems there. I'll figure out the rest by myself.