zhangchenxu528 / FACIAL

FACIAL: Synthesizing Dynamic Talking Face With Implicit Attribute Learning. ICCV, 2021.
GNU Affero General Public License v3.0

Error after replacing the audio file #5

Closed h310558606 closed 3 years ago

h310558606 commented 3 years ago

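I replaced obama2.wav with my own audio file, test.wav, processed it, and generated test.pkl, but a later step fails with an error. Does the current demo code support swapping in your own audio file? If so, what needs to change? Thanks!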

zhangchenxu528 commented 3 years ago

Our demo supports different audio files, and other users have run it successfully. Could you tell me where the error occurs and what it says?

h310558606 commented 3 years ago

When I run "! python test_video.py --test_id_name obama2 --name train3 --model pose2vid --dataroot ./datasets/train3/ --which_epoch latest --netG local --ngf 32 --label_nc 0 --n_local_enhancers 1 --no_instance --resize_or_crop resize"

I got the error shown below:

1 2 ... 288 289 290
Traceback (most recent call last):
  File "test_video.py", line 40, in <module>
    for i, data in enumerate(dataset):
  File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/dataloader.py", line 521, in __next__
    data = self._next_data()
  File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/dataloader.py", line 1203, in _next_data
    return self._process_data(data)
  File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/dataloader.py", line 1229, in _process_data
    data.reraise()
  File "/usr/local/lib/python3.7/dist-packages/torch/_utils.py", line 425, in reraise
    raise self.exc_type(msg)
IndexError: Caught IndexError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/_utils/worker.py", line 287, in _worker_loop
    data = fetcher.fetch(index)
  File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/_utils/fetch.py", line 44, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/content/FACIAL/face2vid/data/aligned_pair_dataset.py", line 80, in __getitem__
    a = torch.FloatTensor([blink[i]])
IndexError: index 9 is out of bounds for axis 0 with size 9

zhangchenxu528 commented 3 years ago

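The blink file path is probably wrong. Please change --test_id_name obama2 so that it matches the name of the npz file you generated earlier (the path is examples/test-result/), for example obama2.npz.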
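
To make this check concrete, here is a minimal sketch that confirms the .npz file for a given --test_id_name exists and lists the arrays it holds. The helper name inspect_npz is hypothetical and not part of FACIAL, and the layout examples/test-result/<test_id_name>.npz is assumed from the reply above. The array names stored in the file are not stated in this thread, so the sketch simply reports whatever it finds.

```python
import os

import numpy as np


def inspect_npz(test_id_name, result_dir="examples/test-result"):
    """Return {array_name: shape} for <result_dir>/<test_id_name>.npz.

    Hypothetical helper, not part of FACIAL. Raises FileNotFoundError when
    the file is missing, which is the name mismatch described above.
    """
    path = os.path.join(result_dir, test_id_name + ".npz")
    if not os.path.isfile(path):
        raise FileNotFoundError(
            f"{path} not found; --test_id_name must match the generated .npz name"
        )
    with np.load(path) as data:
        # data.files lists the names of the arrays stored in the archive
        return {name: data[name].shape for name in data.files}


if __name__ == "__main__":
    # "test" follows the original poster's audio file name (test.wav)
    try:
        print(inspect_npz("test"))
    except FileNotFoundError as err:
        print(err)
```

If the file is found, comparing the first dimension of each array with the number of frames being rendered may help narrow down an out-of-bounds index.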

h310558606 commented 3 years ago

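Thanks, that solved the problem. I have two more questions: 1. With Chinese speech, the mouth shapes do not match well, but I doubt this is because the training set is English: in another project I also used a model trained on an English dataset, and the mouth shapes there looked fairly natural. I also feel that Obama's mouth looks odd when he speaks normally. Could you switch to a different person? I don't enjoy watching him, hahaha. 2. Could you provide the processing code for swapping in a new reference video? I think many people care about this. Thank you very much for open-sourcing such good code.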

zhangchenxu528 commented 3 years ago

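I have noticed the mouth-shape problem with Chinese speech too. I see two possible causes: 1. the features extracted by DeepSpeech are essentially letters, so they may not suit Chinese well; 2. my training set is entirely English, so the training data may also be the cause.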

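I will provide a new demo for new videos at the end of this month or the beginning of next month, covering data processing, fine-tuning, and so on. Thanks for your interest.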

h310558606 commented 3 years ago

OK, thanks. Looking forward to the update.

Xiaocong6 commented 2 years ago

How did you solve the "IndexError: index 9 is out of bounds for axis 0 with size 9" problem? I checked, and --test_id_name matches the npz file. @h310558606
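
For readers who hit this IndexError even though the names match: the message means an array with 9 entries was indexed at position 9, so the loader asked for more frames than the array holds. The sketch below reproduces the message with NumPy and shows a simple length check. The helper check_frame_coverage and the variable names are hypothetical, and whether a frame-count mismatch is the actual cause in a given setup is an assumption that this thread does not confirm.

```python
import numpy as np


def check_frame_coverage(per_frame_values, n_frames):
    """Raise a descriptive error if an array has fewer entries than frames.

    Hypothetical helper, not part of FACIAL.
    """
    n_values = len(per_frame_values)
    if n_values < n_frames:
        raise ValueError(
            f"array has {n_values} entries but {n_frames} frames were requested"
        )


def reproduce_index_error():
    """Index a 9-element array at position 9 and return the error text."""
    blink = np.zeros(9)  # stand-in for the blink array in the traceback
    try:
        blink[9]
    except IndexError as err:
        return str(err)
    return None


if __name__ == "__main__":
    print(reproduce_index_error())
    check_frame_coverage(np.zeros(9), 9)  # passes: indices 0..8 are valid
```

Running a check like this before iterating over the dataset turns the worker-process traceback into a single message that names both lengths.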