Closed jiseong1209 closed 1 year ago
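Try converting the target to a 16-bit mono WAV file. During inference, an input that is not WAV is first converted to WAV and then read. Since the file was not read correctly, either the WAV conversion failed (for example because of a permissions problem) or the WAV file you supplied could not be recognized.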
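As a rough way to confirm the suggestion above before running inference, the sketch below uses only Python's standard `wave` module to report whether a file is a readable 16-bit mono PCM WAV. The function names `describe_wav` and `is_16bit_mono` are made up for this illustration and are not part of diff-svc.

```python
import wave


def describe_wav(path):
    """Return (channels, bits_per_sample, sample_rate) for a PCM WAV file."""
    with wave.open(path, "rb") as wf:
        return wf.getnchannels(), wf.getsampwidth() * 8, wf.getframerate()


def is_16bit_mono(path):
    """True if the file is a readable 16-bit, single-channel PCM WAV."""
    try:
        channels, bits, _rate = describe_wav(path)
    except (wave.Error, EOFError, OSError):
        # Unreadable, non-PCM, or not a WAV file at all.
        return False
    return channels == 1 and bits == 16
```

If `is_16bit_mono("target.wav")` returns `False` (the file name here is only a placeholder), re-exporting the audio as 16-bit mono WAV in an audio editor before inference is probably worth trying.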
I checked the diff-svc Discord, and this appears to be caused by the torch version on Colab. Adding the command pip install torch==1.13.1 torchvision torchaudio==0.13.1 --extra-index-url https://download.pytorch.org/whl/cu117 to change the version fixed it!
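After running the install command, it may help to confirm that the runtime actually picked up the pinned version (on Colab a runtime restart is sometimes needed). The following is a minimal sketch using the standard `importlib.metadata` module; the helper names are hypothetical and the default pin simply mirrors the version quoted above.

```python
from importlib import metadata


def base_version(version):
    """Drop a local build suffix such as '+cu117' from a version string."""
    return version.split("+", 1)[0]


def matches_pin(installed, pinned="1.13.1"):
    """True if the installed version equals the pinned one, ignoring the suffix."""
    return base_version(installed) == pinned


def installed_torch_version():
    """Return the installed torch version string, or None if torch is absent."""
    try:
        return metadata.version("torch")
    except metadata.PackageNotFoundError:
        return None
```

For example, `matches_pin(installed_torch_version() or "")` should be `True` once the pinned install has taken effect.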
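I'm training on Colab and running synthesis there as well. Until yesterday, running infer.py synthesized without problems, but now it suddenly produces the following output:
load chunks from temp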
=====segment start, 18.451s======
jump empty segment
=====segment start, 8.493s======
Traceback (most recent call last):
  File "/content/drive/.shortcut-targets-by-id/1chxpJIvkRLuABKDPnoSxcqkPmK8PHcfX/diff-svc/infer.py", line 102, in <module>
    run_clip(model, key=tran, acc=accelerate, use_crepe=True, thre=0.05, use_pe=True, use_gt_mel=False,
  File "/content/drive/.shortcut-targets-by-id/1chxpJIvkRLuABKDPnoSxcqkPmK8PHcfX/diff-svc/infer.py", line 59, in run_clip
    _f0_tst, _f0_pred, _audio = svc_model.infer(raw_path, key=key, acc=acc, use_pe=use_pe, use_crepe=use_crepe,
  File "/content/drive/.shortcut-targets-by-id/1chxpJIvkRLuABKDPnoSxcqkPmK8PHcfX/diff-svc/infer_tools/infer_tool.py", line 143, in infer
    batch = self.pre(in_path, acc, use_crepe, thre)
  File "/content/drive/.shortcut-targets-by-id/1chxpJIvkRLuABKDPnoSxcqkPmK8PHcfX/diff-svc/infer_tools/infer_tool.py", line 275, in pre
    temp_dict = self.temporary_dict2processed_input(item_name, temp_dict, use_crepe, thre)
  File "/content/drive/.shortcut-targets-by-id/1chxpJIvkRLuABKDPnoSxcqkPmK8PHcfX/diff-svc/infer_tools/infer_tool.py", line 247, in temporary_dict2processed_input
    wav, mel = VOCODERS[hparams['vocoder'].split('.')[-1]].wav2spec(temp_dict['wav_fn'])
  File "/content/drive/.shortcut-targets-by-id/1chxpJIvkRLuABKDPnoSxcqkPmK8PHcfX/diff-svc/network/vocoders/nsf_hifigan.py", line 89, in wav2spec
    mel_torch = stft.get_mel(wav_torch.unsqueeze(0).to(device)).squeeze(0).T
  File "/content/drive/.shortcut-targets-by-id/1chxpJIvkRLuABKDPnoSxcqkPmK8PHcfX/diff-svc/modules/nsf_hifigan/nvSTFT.py", line 95, in get_mel
    spec = torch.stft(y, n_fft, hop_length=hop_length, win_length=win_size, window=self.hann_window[str(y.device)],
  File "/usr/local/lib/python3.9/dist-packages/torch/functional.py", line 641, in stft
    return _VF.stft(input, n_fft, hop_length, win_length, window, # type: ignore[attr-defined]
RuntimeError: stft requires the return_complex parameter be given for real inputs, and will further require that return_complex=True in a future PyTorch release.
This is the error I am now getting. Does anyone know a solution?
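The RuntimeError in the traceback says that `torch.stft` was called on a real-valued input without `return_complex`. As an illustration of what the message is asking for, and not the project's actual code or fix, the sketch below passes `return_complex=True` and then uses `torch.view_as_real` to get back the older real-valued layout with a trailing dimension of size 2 before taking the magnitude. The function name and the `n_fft`, `hop_length`, and `win_length` defaults are assumptions chosen for the example, not values from diff-svc's configuration.

```python
import torch


def magnitude_spectrogram(y, n_fft=2048, hop_length=512, win_length=2048):
    """Magnitude STFT of a real signal of shape (batch, samples)."""
    window = torch.hann_window(win_length, device=y.device)
    # return_complex=True is what the error message asks for on real inputs.
    spec = torch.stft(y, n_fft, hop_length=hop_length, win_length=win_length,
                      window=window, center=True, return_complex=True)
    # Convert the complex tensor to the legacy (..., freq, frames, 2) layout.
    spec = torch.view_as_real(spec)
    # Magnitude from the real and imaginary parts; epsilon avoids sqrt(0).
    return torch.sqrt(spec.pow(2).sum(-1) + 1e-9)
```

Whether a change like this fits the `get_mel` function in nvSTFT.py depends on how the rest of that function consumes `spec`, which is not shown in the traceback.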