Open Darcy0218 opened 3 months ago
Hello, the spectrogram_torch function in mel_processing.py is used to compute the linear spectrogram, returning a tensor with the shape (1, C, T). In our configuration, C equals n_fft/2 + 1, which is 513. The tensor is then converted to the shape (C, T) by invoking the squeeze function. You can execute this part of the code independently to verify this. The error you’re encountering is likely due to other reasons. It would be helpful if you could provide additional details or a screenshot of the code.
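For anyone who wants to run that check in isolation, here is a minimal sketch of the shape arithmetic. It calls torch.stft directly rather than the repository's spectrogram_torch. The hop length, window length, and random input are assumptions chosen for illustration; only n_fft = 1024 is implied by C = 513.

```python
import torch

# Assumed illustration values; only n_fft = 1024 follows from C = 513.
n_fft = 1024
hop_length = 256
win_length = 1024

# Stand-in for one loaded audio clip, shaped (1, num_samples).
wav = torch.randn(1, 16000)

window = torch.hann_window(win_length)
stft = torch.stft(
    wav,
    n_fft,
    hop_length=hop_length,
    win_length=win_length,
    window=window,
    return_complex=True,
)

# Magnitude spectrogram: (1, C, T) with C = n_fft // 2 + 1.
spec = stft.abs()
print(spec.shape)

# Dropping the leading batch dimension leaves a 2-D tensor (C, T).
squeezed = spec.squeeze(0)
print(squeezed.shape)
```

If the squeezed tensor is 2-D here but 1-D in your run, the difference is probably in the environment or in the data being loaded, not in the squeeze call itself.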
It was a version issue and it has been resolved. Thank you!
The following error occurred:

IndexError: Caught IndexError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/data4/wuyikai/anaconda3/envs/controlspeech/lib/python3.9/site-packages/torch/utils/data/_utils/worker.py", line 308, in _worker_loop
    data = fetcher.fetch(index)
  File "/data4/wuyikai/anaconda3/envs/controlspeech/lib/python3.9/site-packages/torch/utils/data/_utils/fetch.py", line 54, in fetch
    return self.collate_fn(data)
  File "/data4/wuyikai/ControlSpeech-main/baseline/promptStyle/data_utils.py", line 380, in __call__
    torch.LongTensor([x[1].size(1) for x in batch]),
  File "/data4/wuyikai/ControlSpeech-main/baseline/promptStyle/data_utils.py", line 380, in <listcomp>
    torch.LongTensor([x[1].size(1) for x in batch]),
IndexError: Dimension out of range (expected to be in range of [-1, 0], but got 1)
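The cause is that the get_audio_text_speaker_pair function of the TextAudioSpeakerLoader class returns (text, spec, wav, sid, mel, style_embed),
but spec is one-dimensional,
while the __call__ method of the TextAudioSpeakerCollate class runs _, ids_sorted_decreasing = torch.sort( torch.LongTensor([x[1].size(1) for x in batch]), dim=0, descending=True), which reads the second dimension of spec. That is where the dimension mismatch error comes from.
My understanding is that the spectrogram_torch function in mel_processing.py computes the linear spectrogram with shape (1,153), and after squeeze the shape of spec becomes (153), which causes the dimension mismatch. How can this be resolved?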
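The mismatch described above can be reproduced without the dataset. The sketch below is not code from the repository: frame_lengths copies only the list expression from the collate call, the tuples are placeholder samples, and check_spec is a hypothetical helper that fails early with a clearer message when spec is not 2-D.

```python
import torch


def frame_lengths(batch):
    # Same expression the collate function evaluates: x[1] is spec,
    # and size(1) is its time dimension T.
    return torch.LongTensor([x[1].size(1) for x in batch])


# Placeholder samples: only index 1 (spec) matters for this check.
good_sample = ("text", torch.zeros(513, 40))  # 2-D spec, shape (C, T)
bad_sample = ("text", torch.zeros(513))       # 1-D spec, as in the report

lengths = frame_lengths([good_sample])

try:
    frame_lengths([bad_sample])
    message = ""
except IndexError as err:
    # A 1-D tensor has no dimension 1, so size(1) raises IndexError.
    message = str(err)
print(message)


def check_spec(spec, n_bins=513):
    # Hypothetical guard, not part of the repository.
    if spec.dim() != 2 or spec.size(0) != n_bins:
        raise ValueError(
            f"expected spec of shape ({n_bins}, T), got {tuple(spec.shape)}"
        )
    return spec
```

Calling a guard like check_spec on spec before it is returned from the dataset would show which sample, and which shape, reaches the collate function.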