Error after replacing the audio file

h310558606 commented 3 years ago

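I replaced obama2.wav with my own audio file, test.wav, ran the file processing, and generated test.pkl, but I get an error when running the subsequent steps. Does the current demo code support replacing the audio file with my own? If so, what needs to be changed? Thanks!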

zhangchenxu528 commented 3 years ago

Our demo supports different audio files, and others have run it successfully. Could you tell me where the error occurred and what the message was?

h310558606 commented 3 years ago

When I run "! python test_video.py --test_id_name obama2 --name train3 --model pose2vid --dataroot ./datasets/train3/ --which_epoch latest --netG local --ngf 32 --label_nc 0 --n_local_enhancers 1 --no_instance --resize_or_crop resize"

I got the error shown below:

1 2 ... 288 289 290
Traceback (most recent call last):
  File "test_video.py", line 40, in <module>
    for i, data in enumerate(dataset):
  File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/dataloader.py", line 521, in __next__
    data = self._next_data()
  File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/dataloader.py", line 1203, in _next_data
    return self._process_data(data)
  File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/dataloader.py", line 1229, in _process_data
    data.reraise()
  File "/usr/local/lib/python3.7/dist-packages/torch/_utils.py", line 425, in reraise
    raise self.exc_type(msg)
IndexError: Caught IndexError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/_utils/worker.py", line 287, in _worker_loop
    data = fetcher.fetch(index)
  File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/_utils/fetch.py", line 44, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/content/FACIAL/face2vid/data/aligned_pair_dataset.py", line 80, in __getitem__
    a = torch.FloatTensor([blink[i]])
IndexError: index 9 is out of bounds for axis 0 with size 9

zhangchenxu528 commented 3 years ago

The blink file path is probably wrong. Please change `--test_id_name obama2` so that it matches the name of the npz file generated earlier (the path is examples/test-result/), for example obama2.npz.
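
The name check described in the reply above can be sketched as a small pre-flight helper. This is only an illustration and not part of the FACIAL code: the function name `find_result_npz` is hypothetical, and the default directory is simply the path quoted in the reply.

```python
from pathlib import Path

# Assumption: the result directory quoted in the reply above.
RESULT_DIR = Path("examples/test-result")


def find_result_npz(test_id_name, result_dir=RESULT_DIR):
    """Return <result_dir>/<test_id_name>.npz, or raise with a hint.

    Hypothetical helper: it only checks that a file with the expected
    name exists; it does not inspect the file's contents.
    """
    result_dir = Path(result_dir)
    candidate = result_dir / f"{test_id_name}.npz"
    if candidate.is_file():
        return candidate
    # List the npz files that are actually present so that a name
    # mismatch with --test_id_name is easy to spot.
    available = sorted(p.stem for p in result_dir.glob("*.npz"))
    raise FileNotFoundError(
        f"{candidate} not found; --test_id_name should be one of: {available}"
    )
```

Calling `find_result_npz("test")` before running test_video.py would raise a `FileNotFoundError` that lists the available names if no matching npz file is present.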

h310558606 commented 3 years ago

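Thanks, the problem is solved. I have two more questions:
1. With Chinese speech, the mouth shapes do not match very well. I doubt this is because the training set is in English, though: in another project I also used a model trained on an English dataset, and the mouth shapes there looked fairly natural. I also feel that Obama's mouth looks odd when he speaks normally. Could you switch to a different person? I don't like watching him, hahaha.
2. Could you provide the processing code for replacing the reference video? I think this is something many people care about. Thank you very much for open-sourcing such good code.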

zhangchenxu528 commented 3 years ago

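I have noticed the mouth-shape problem with Chinese speech too. I think there are two possible causes: 1. the features extracted by DeepSpeech are essentially letters, so they may not suit Chinese well; 2. my training set is entirely in English, so the training data may also be the cause.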

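I will provide a new demo for new videos at the end of this month or the beginning of next month, including data processing, fine-tuning, and so on. Thank you for your interest.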

h310558606 commented 3 years ago

OK, thanks. Looking forward to the update.

Xiaocong6 commented 2 years ago

How did you solve the "IndexError: index 9 is out of bounds for axis 0 with size 9" problem? I checked, and --test_id_name matches the npz file. @h310558606

zhangchenxu528 / FACIAL

Error after replacing the audio file #5