DinoMan / speech-driven-animation

947 stars, 289 forks

[RESULTS] Generalization to different faces and language #13

Open bimunlp opened 5 years ago

bimunlp commented 5 years ago

This is what I got when using the provided image and audio (test 1, with crema). With the same image but different audio in Chinese, I got test 2 (with crema). When I transferred it to other images, the results turned out to be very disappointing. After a lot of tests, I obtained a better result using timit, but it is still unnatural (test 8, with timit).

DinoMan commented 5 years ago

That seems about right. As I said in another issue, the models trained on timit, crema and grid do not generalize as well, since they have only seen 15 to 60 faces. You should also consider that most of those datasets don't have a single Asian face in their training sets, so it will be extra hard for Asian faces. You need the lrw model for this.

Also, since the datasets used for training are all in English, I do not expect it to work very well on other languages.

At the moment I am still on vacation. Once I'm back, I'll look into solving the issues with hosting the models (demand is high, so I have maxed out the free Git LFS quotas). After that, I also need to discuss the release of the lrw model with the rest of the team. Once I have an update on this, I will let you all know.

bimunlp commented 5 years ago

A great job has been done! Thank you so much for your inspiring work.

ustc-baize commented 4 years ago

@yiyouls
Hello, I have run into a problem: va = sda.VideoAnimator(gpu=0) # Instantiate the animator has been running for over an hour. It keeps showing "Downloading the face detection CNN. Please wait..." and nothing else happens. There is nothing wrong with my GPU. Could you tell me how to solve this problem?