Hangz-nju-cuhk / Talking-Face-Generation-DAVS

Code for Talking Face Generation by Adversarially Disentangled Audio-Visual Representation (AAAI 2019)
MIT License

using 'audio' generates static video #10

Open sarah-zhu opened 6 years ago

sarah-zhu commented 6 years ago

Hi~ @Hangz-nju-cuhk When I use 'audio' as the test_type, the generated video is always static: all the generated images are identical. However, when I use 'video' as the test_type, it works fine. Do you have any idea what causes this? Thank you!

Hangz-nju-cuhk commented 5 years ago

Hi, we recently found out that the audio part does not run on PyTorch versions >= 0.4.0. Please check that your environment is the same as ours (PyTorch 0.2.0 with Python 2.7).

Hangz-nju-cuhk commented 5 years ago

@sarah-zhu Sorry, there was a mistake in our test data loading code. We have now fixed it. Thank you for pointing it out.

sarah-zhu commented 5 years ago

@Hangz-nju-cuhk Thanks for fixing it! Is the audio part runnable on Pytorch version >=0.4.0 now?

Hangz-nju-cuhk commented 5 years ago

@sarah-zhu There could be a bug on versions >= 0.4.0 that needs fixing. We recommend using PyTorch < 0.4.0 for now. I plan to release a new version for PyTorch 0.4.1 within a month.

7aughing commented 5 years ago

Sorry, when I run the testing script to generate videos from audio using the modified code, I still get the following error:

    Traceback (most recent call last):
      File "test_all.py", line 34, in <module>
        test_folder = Test_VideoFolder(root=opt.test_root, A_path=A_path, config=opt)
      File "/home/dtt/workspace/Talking-Face-Generation-DAVS/Dataloader/Test_load_audio.py", line 92, in __init__
        self.vid = self.loader(self.root, self.A_path, config=self.config)
      File "/home/dtt/workspace/Talking-Face-Generation-DAVS/Dataloader/Test_load_audio.py", line 67, in Test_Outside_Loader
        raise ("mfccs = 0")
    TypeError: exceptions must be old-style classes or derived from BaseException, not str

7aughing commented 5 years ago

When I remove the following code from "Dataloader/Test_load_audio.py", the code works!

    if require_audio:
        ipdb.set_trace()
        k4 = 0
        for mfcc_num in pair:
            # for s in range(-1,2):
            mfcc_path = os.path.join(path, str(mfcc_num) + '.bin')
            if os.path.exists(mfcc_path):
                mfcc = np.fromfile(mfcc_path)
                mfcc = mfcc.reshape(20, 12)
                mfcc_block[k4, 0, :, :] = mfcc
                k4 += 1
            # else:
            #     raise ("mfccs = 0")

Hangz-nju-cuhk commented 5 years ago

@duantiantian I am a bit confused. I just checked the code and it runs normally on my computer. The error means no audio files can be found, and you removed the part that loads the audio files, so how can it work? ><

sarah-zhu commented 5 years ago

@Hangz-nju-cuhk Thanks. I tested it and it works on 0.4.0. Another question: does this model also work for sentence-level audio inputs?

Hangz-nju-cuhk commented 5 years ago

@sarah-zhu I am not sure what you mean by "sentence-level audio inputs". Maybe our kind of method would be called segment-level audio input? We crop the MFCC by time shifts so that it aligns with the images, so if this process can be done online in the dataloader, it should be possible to work with a sentence-level MFCC feature.
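
A minimal sketch of the cropping step described above, assuming the sentence-level MFCC is a sequence of frames with 12 coefficients each and that each image is paired with a 20-frame window (matching the `reshape(20, 12)` in the test loader). The function name `crop_segments` and the stride of 4 frames per image are assumptions for illustration (they would correspond to 100 Hz MFCC frames and 25 fps video), not values taken from this repository.

```python
def crop_segments(mfcc, window=20, stride=4):
    """Split a sentence-level MFCC (a sequence of frames) into overlapping windows.

    window=20 matches the 20 x 12 blocks read by the test loader.
    stride=4 is an assumption; adjust it to your audio/video frame rates.
    """
    if window <= 0 or stride <= 0:
        raise ValueError("window and stride must be positive")
    last_start = len(mfcc) - window
    # One segment per time shift; a trailing partial window is dropped.
    return [mfcc[s:s + window] for s in range(0, last_start + 1, stride)]


# Example: 100 frames of 12 coefficients each (dummy values).
sentence_mfcc = [[float(t)] * 12 for t in range(100)]
segments = crop_segments(sentence_mfcc)
print(len(segments), len(segments[0]), len(segments[0][0]))
```

Each returned segment could then be written out or fed to the model in place of one precomputed `.bin` file, provided the stride really matches how the images were sampled.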

7aughing commented 5 years ago

@Hangz-nju-cuhk Sorry, I only commented out "else: raise ("mfccs = 0")". I thought that, after loading all the audio files, the code would raise an error rather than break out of the for loop when the next audio file does not exist.

Thank you very much for your reply!!! ^-^
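
The behavior discussed above (stop at the first missing file, but fail if nothing was found) could be sketched as follows. This is a hypothetical variant, not the repository's loader: the helper name `existing_mfcc_paths` is made up for illustration, and only the `<index>.bin` naming is taken from the snippet quoted earlier.

```python
import os


def existing_mfcc_paths(path, pair):
    """Return paths of the MFCC files in `pair` that exist, in order.

    Stops at the first missing index instead of raising, and raises only
    when no file was found at all.
    """
    found = []
    for mfcc_num in pair:
        mfcc_path = os.path.join(path, str(mfcc_num) + '.bin')
        if not os.path.exists(mfcc_path):
            break  # ran past the last available MFCC file
        found.append(mfcc_path)
    if not found:
        # Raise a real exception class; raising a bare string is a TypeError.
        raise IOError("no MFCC .bin files found in %s" % path)
    return found
```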

sarah-zhu commented 5 years ago

@Hangz-nju-cuhk Hi, I tested your model on both the LRW and Voxceleb datasets using the provided MFCC extraction and face alignment code. However, I found that the results are more accurate on Voxceleb than on LRW. Was the provided model trained on the Voxceleb dataset?

Hangz-nju-cuhk commented 5 years ago

@sarah-zhu I am glad that you did further experiments!

If I remember correctly, this model was trained on the training set of LRW only, and so were the results provided in our paper and videos. However, we ran further experiments training our model on a small sample of Voxceleb (the first 100 classes), and we did not find any difference in our testing results on Voxceleb or LRW compared with before.

I think the main reason is that Voxceleb generally provides cleaner audio clips than LRW, and humans are less sensitive to the longer results in Voxceleb than to the shorter ones in LRW.

mohitsshah commented 5 years ago

I'm facing a similar issue. Using pytorch (0.2.0) and python 2.7. The only difference is that I'm not using CUDA. Can this work without CUDA?

ak9250 commented 5 years ago

@Hangz-nju-cuhk I am getting this error when using the audio .bin files from Google Drive:

    Traceback (most recent call last):
      File "test_all.py", line 34, in <module>
        test_folder = Test_VideoFolder(root=opt.test_root, A_path=A_path, config=opt)
      File "/content/Talking-Face-Generation-DAVS/Dataloader/Test_load_audio.py", line 92, in __init__
        self.vid = self.loader(self.root, self.A_path, config=self.config)
      File "/content/Talking-Face-Generation-DAVS/Dataloader/Test_load_audio.py", line 67, in Test_Outside_Loader
        raise ("mfccs = 0")
    TypeError: exceptions must be old-style classes or derived from BaseException, not str

ZhengMengbin commented 4 years ago

@ak9250 @duantiantian Changing `raise ("mfccs = 0")` to `raise Exception("mfccs = 0")` fixes the bug.
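
For context, a short sketch of why the original line fails: `raise` needs an exception class or instance, so raising a plain string produces the `TypeError` seen in the tracebacks instead of the intended message. The function names are illustrative only.

```python
def raise_string():
    # Original form: the parentheses just wrap a str, which is not an exception.
    raise ("mfccs = 0")


def raise_exception():
    # Fixed form: an Exception instance carrying the same message.
    raise Exception("mfccs = 0")


for fn in (raise_string, raise_exception):
    try:
        fn()
    except TypeError as err:
        print("TypeError:", err)
    except Exception as err:
        print("Exception:", err)
```

Note that this change only makes the message readable. The underlying condition (a requested MFCC file is missing) still has to be resolved.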

zzzzhuque commented 4 years ago

> @duantiantian I am a bit confused for I just checked the code, it runs normally on my computer. The error means no audio files can be found, and you just removed the part for loading audio files, then how can it work? ><

Hello, the problem is not the code but the command line. Change

    python test_all.py --test_root ./0572_0019_0003/audio --test_type audio --test_audio_video_length 99 --test_resume_path ./checkpoints/101_DAVS_checkpoint.pth.tar

to

    python test_all.py --test_root ./0572_0019_0003/audio --test_type audio --test_audio_video_length 98 --test_resume_path ./checkpoints/101_DAVS_checkpoint.pth.tar

because there are only 98 MFCC .bin files.
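
To avoid guessing the length, one could count the `.bin` files before running the test. A minimal sketch, assuming every `.bin` file in the folder is one MFCC segment; the helper name `count_mfcc_bins` is hypothetical and not part of the repository.

```python
import os


def count_mfcc_bins(audio_dir):
    """Count the .bin files directly inside audio_dir."""
    return sum(
        1 for name in os.listdir(audio_dir)
        if name.endswith('.bin') and os.path.isfile(os.path.join(audio_dir, name))
    )
```

The returned number would then be the value to pass as `--test_audio_video_length`, for example `count_mfcc_bins('./0572_0019_0003/audio')` for the folder used above.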