DinoMan / speech-driven-animation


Value Error: Type must be a sub-type of ndarray type #1

Closed zhumazik closed 5 years ago

zhumazik commented 5 years ago

Hi, guys! Thanks for sharing the code!

I'm playing with it right now, but I get an error in sda.py:

```
ValueError Traceback (most recent call last)
in ()
----> 1 vid, aud = va("image.bmp",audio_clip, fs=fs)

/content/sda.py in __call__(self, img, audio, fs, aligned)
    240             speech = speech.view(-1, 1)
    241         else:
--> 242             speech = audio.view(-1, 1)
    243
    244         frame = self.img_transform(frame).to(self.device)

ValueError: Type must be a sub-type of ndarray type
```

My variable `audio` is a numpy array converted from your sample file with scipy.io.wavfile (I don't use torchaudio since it's not supported in my environment). What could be wrong?

DinoMan commented 5 years ago

Yup, I had not tested the branch of the code that you seem to have hit. Pull the latest changes from master and let me know if it works now.

zhumazik commented 5 years ago

Thanks! Method call requires "audio" variable as a torch tensor object. So I took your new line and put it outside.

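```python
import numpy as np
import torch
import scipy.io.wavfile as wav
from PIL import Image

import sda

va = sda.VideoAnimator(gpu=-1)  # same constructor call as in the original snippet
fs, audio_clip = wav.read("audio.wav")  # sample rate and samples as a numpy array
still_frame = np.array(Image.open("image.bmp"))

# Convert the numpy samples to a float tensor before calling the animator
audio_clip = torch.from_numpy(audio_clip).float()
vid, aud = va(still_frame, audio_clip, fs=fs)
```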

DinoMan commented 5 years ago

> Thanks! Method call requires "audio" variable as a torch tensor object. So I took your new line and put it outside.

Good to know it works for you now. However, I'm pretty sure it should work with a numpy array, since I did not provide a tensor when I tested it yesterday. Can you check that you are up to date with master? I am asking because ideally I would want it to take numpy arrays as input, since more people are familiar with this data format.
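
For anyone landing here with the same error, here is a minimal sketch of accepting either input type at the call boundary. `to_speech_tensor` is a hypothetical helper written for illustration; it is not taken from sda.py, and the actual fix on master may look different. The traceback comes from calling `.view(-1, 1)` on a numpy array: `ndarray.view` takes a dtype and an array type rather than a shape, so the array has to become a tensor (or be reshaped) first.

```python
import numpy as np
import torch


def to_speech_tensor(audio):
    """Hypothetical helper: return audio as a float32 tensor of shape (N, 1)."""
    if isinstance(audio, np.ndarray):
        # ndarray.view(-1, 1) is read as view(dtype=-1, type=1) and raises,
        # so convert to a tensor before reshaping.
        audio = torch.from_numpy(audio)
    elif not torch.is_tensor(audio):
        raise TypeError(
            f"expected numpy array or torch tensor, got {type(audio).__name__}"
        )
    return audio.float().reshape(-1, 1)


# Stand-in for the int16 samples that scipy.io.wavfile.read returns
samples = np.array([0, 1200, -1200, 300], dtype=np.int16)
speech = to_speech_tensor(samples)
```

With a helper like this, the caller could pass the array from `wav.read` directly and would not need the explicit `torch.from_numpy(...)` line.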