jiseong1209 commented 1 year ago

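I'm training and running synthesis on Colab. Until yesterday, running infer.py synthesized audio without any problems, but now it suddenly fails with the following output:

load chunks from temp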

=====segment start, 18.451s======

jump empty segment

=====segment start, 8.493s======

Traceback (most recent call last):
  File "/content/drive/.shortcut-targets-by-id/1chxpJIvkRLuABKDPnoSxcqkPmK8PHcfX/diff-svc/infer.py", line 102, in <module>
    run_clip(model, key=tran, acc=accelerate, use_crepe=True, thre=0.05, use_pe=True, use_gt_mel=False,
  File "/content/drive/.shortcut-targets-by-id/1chxpJIvkRLuABKDPnoSxcqkPmK8PHcfX/diff-svc/infer.py", line 59, in run_clip
    _f0_tst, _f0_pred, _audio = svc_model.infer(raw_path, key=key, acc=acc, use_pe=use_pe, use_crepe=use_crepe,
  File "/content/drive/.shortcut-targets-by-id/1chxpJIvkRLuABKDPnoSxcqkPmK8PHcfX/diff-svc/infer_tools/infer_tool.py", line 143, in infer
    batch = self.pre(in_path, acc, use_crepe, thre)
  File "/content/drive/.shortcut-targets-by-id/1chxpJIvkRLuABKDPnoSxcqkPmK8PHcfX/diff-svc/infer_tools/infer_tool.py", line 275, in pre
    temp_dict = self.temporary_dict2processed_input(item_name, temp_dict, use_crepe, thre)
  File "/content/drive/.shortcut-targets-by-id/1chxpJIvkRLuABKDPnoSxcqkPmK8PHcfX/diff-svc/infer_tools/infer_tool.py", line 247, in temporary_dict2processed_input
    wav, mel = VOCODERS[hparams['vocoder'].split('.')[-1]].wav2spec(temp_dict['wav_fn'])
  File "/content/drive/.shortcut-targets-by-id/1chxpJIvkRLuABKDPnoSxcqkPmK8PHcfX/diff-svc/network/vocoders/nsf_hifigan.py", line 89, in wav2spec
    mel_torch = stft.get_mel(wav_torch.unsqueeze(0).to(device)).squeeze(0).T
  File "/content/drive/.shortcut-targets-by-id/1chxpJIvkRLuABKDPnoSxcqkPmK8PHcfX/diff-svc/modules/nsf_hifigan/nvSTFT.py", line 95, in get_mel
    spec = torch.stft(y, n_fft, hop_length=hop_length, win_length=win_size, window=self.hann_window[str(y.device)],
  File "/usr/local/lib/python3.9/dist-packages/torch/functional.py", line 641, in stft
    return _VF.stft(input, n_fft, hop_length, win_length, window,  # type: ignore[attr-defined]
RuntimeError: stft requires the return_complex parameter be given for real inputs, and will further require that return_complex=True in a future PyTorch release.

That is the error I now get. Does anyone know how to fix it?

MoorDev commented 1 year ago

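Try converting the target to a 16-bit mono WAV file. During inference, an input that is not WAV is converted to WAV before it is read. Since the file apparently was not read correctly, either the WAV conversion failed (for example because of a permissions problem) or the input is a WAV file that cannot be recognized.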
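One way to test this suggestion before re-running inference is to inspect the file header. The sketch below uses only Python's standard `wave` module; the helper name `is_16bit_mono_wav` is hypothetical and is not part of diff-svc.

```python
import wave


def is_16bit_mono_wav(path):
    """Return True if `path` is a PCM WAV file with one channel and 16-bit samples."""
    try:
        with wave.open(path, "rb") as wav_file:
            channels = wav_file.getnchannels()
            sample_width = wav_file.getsampwidth()  # bytes per sample
    except (wave.Error, EOFError, OSError):
        # Not a readable PCM WAV file: wrong format, truncated, or inaccessible.
        return False
    return channels == 1 and sample_width == 2
```

A `False` result only says the file is not a readable 16-bit mono PCM WAV; it does not say why the conversion step failed.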

jiseong1209 commented 1 year ago

I checked the diff-svc Discord, and it looks like the problem is caused by the torch version on Colab. Adding the following command to change the version fixed it!

pip install torch==1.13.1 torchvision torchaudio==0.13.1 --extra-index-url https://download.pytorch.org/whl/cu117
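To confirm that the pin took effect, the installed version can be compared against the pinned one. This is a minimal sketch using the standard `importlib.metadata` module; the helper names are hypothetical. It assumes the Colab runtime is restarted after the install if torch had already been imported. Wheels from the cu117 index may carry a local suffix such as `+cu117`, so the suffix is stripped before comparing.

```python
from importlib import metadata


def base_version(version):
    """Strip a local build suffix, e.g. '1.13.1+cu117' -> '1.13.1'."""
    return version.split("+", 1)[0]


def matches_pin(installed, pinned="1.13.1"):
    """Return True if the installed version equals the pinned release."""
    return base_version(installed) == pinned


def installed_torch_version():
    """Return the installed torch version string, or None if torch is absent."""
    try:
        return metadata.version("torch")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_torch_version()
    if version is None:
        print("torch is not installed")
    elif matches_pin(version):
        print(f"torch {version} matches the pinned release")
    else:
        print(f"torch {version} differs from the pinned 1.13.1")
```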

wlsdml1114 / diff-svc

Error while running infer.py #29
