DinoMan / speech-driven-animation


Strange results #14

Closed alexsannikoff closed 5 years ago

alexsannikoff commented 5 years ago

Hello! I'm trying to run this repo on the test data in examples. To do so, I run this code:

import sda
import scipy.io.wavfile as wav
from PIL import Image
import numpy as np

va = sda.VideoAnimator(gpu=0)  # Instantiate the animator
fs, audio_clip = wav.read("./example/audio.wav")
frame = Image.open("./example/image.bmp")
frame = np.array(frame)
vid, aud = va(frame, audio_clip, fs=fs)
va.save_video(vid, aud, "generated.mp4")

But I get this error:

Traceback (most recent call last):
  File "demo.py", line 5, in <module>
    va = sda.VideoAnimator(gpu=0, model_path="crema")# Instantiate the aminator
  File "/2tbsata/Research/sannikovalexey/speech-driven-animation/sda/sda.py", line 111, in __init__
    flip_input=False)
  File "/home/nsuser/Environments/speechdrivenanimation/lib/python3.5/site-packages/face_alignment/api.py", line 95, in __init__
    self.face_alignment_net.load_state_dict(fan_weights['state_dict'])
  File "/home/nsuser/Environments/speechdrivenanimation/lib/python3.5/site-packages/torch/nn/modules/module.py", line 777, in load_state_dict
    self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for FAN:
    Missing key(s) in state_dict: "conv1.weight", "conv1.bias", "bn1.running_mean", "bn1.running_var", "bn1.weight", "bn1.bias", "conv2.bn1.running_mean", "conv2.bn1.running_var", "conv2.bn1.weight", "conv2.bn1.bias", "conv2.conv1.weight", "conv2.bn2.running_mean", "conv2.bn2.running_var", "conv2.bn2.weight", "conv2.bn2.bias", "conv2.conv2.weight", "conv2.bn3.running_mean", "conv2.bn3.running_var", "conv2.bn3.weight", "conv2.bn3.bias", "conv2.conv3.weight", "conv2.downsample.0.running_mean", "conv2.downsample.0.running_var", "conv2.downsample.0.weight", "conv2.downsample.0.bias", "conv2.downsample.2.weight", "conv3.bn1.running_mean", "conv3.bn1.running_var", ...

After I changed the state_dict in api.py in face_alignment like this:

from collections import OrderedDict

new_state_dict = OrderedDict()
for k, v in fan_weights['state_dict'].items():
    name = k.replace('module.', '')  # strip the 'module.' prefix added by nn.DataParallel
    new_state_dict[name] = v
fan_weights['state_dict'] = new_state_dict

I can run the demo and get this result: generated_aligned.mp4. What am I doing wrong?

alexsannikoff commented 5 years ago

I tried setting the flag aligned=True in vid, aud = va(frame, audio_clip, fs=fs, aligned=True) and got a properly generated sample: generated_notaligned. Has anyone else run into trouble with face_alignment?

DinoMan commented 5 years ago

It seems like the alignment process is not working correctly (possibly due to not having the correct model). The blob you are seeing is what the bad alignment has produced. You could try installing the face alignment library from source (https://github.com/1adrianb/face-alignment); if that doesn't work, ask a question on the FAN GitHub repo to see what could be causing the issue.

The aligned=True flag will work if the face is already aligned (which is the case for the example), but it will fail on arbitrary images.
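For reference, a rough sketch of the two call patterns using the snippet from above (assuming the default call runs the alignment step, which is what the earlier traceback suggests):

# Input face is already aligned (e.g. the bundled example image): skip alignment.
vid, aud = va(frame, audio_clip, fs=fs, aligned=True)

# Arbitrary image: leave the default so the animator detects and aligns the face itself.
vid, aud = va(frame, audio_clip, fs=fs)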

berkeleymalagon commented 5 years ago

I've had similar difficulties. I tried 'aligned' set to both True and False, and I've tried different levels of zoom on multiple faces. I haven't been able to get a good result with anything other than the example image.

Here's an example of an image and the output with and without alignment: https://imgur.com/a/GZVBQXo

DinoMan commented 5 years ago

This is mentioned in issue #2. As I said there, the released models are from the grid, timit and crema datasets. These datasets will not generalize well to in-the-wild photos (they will work for unseen faces from the same datasets) because they have been trained on too few identities (15 for grid, 60 for crema).

For in-the-wild images you need the lrw pretrained model, which has not been released yet. I will post an update once it is available. That said, if you want to try one of the existing models, your best bet is the crema model, which has seen more identities during training.
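If you want to try the crema model, something along these lines should work (a sketch assembled from the demo code and the traceback above, where model_path="crema" is passed to the animator):

import sda
import scipy.io.wavfile as wav
from PIL import Image
import numpy as np

va = sda.VideoAnimator(gpu=0, model_path="crema")  # select the crema-trained model
fs, audio_clip = wav.read("./example/audio.wav")
frame = np.array(Image.open("./example/image.bmp"))
vid, aud = va(frame, audio_clip, fs=fs)
va.save_video(vid, aud, "generated.mp4")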

berkeleymalagon commented 5 years ago

Thanks for the context. Any estimate on when you'll release the lrw pretrained model? Thanks for sharing this great work!

alexsannikoff commented 5 years ago

@DinoMan Thanks for the answer, I will try installing the library from source.

alexsannikoff commented 5 years ago

Installing from source works correctly!