With the mix.wav included in the project, I can generate enh.wav without any error.
But when I switch to my own voice file, I get a RuntimeError:
Traceback (most recent call last):
  File "/Volumes/Coding/gtcrn/infer.py", line 18, in <module>
    input = torch.stft(torch.from_numpy(mix), 512, 256, 512, torch.hann_window(512).pow(0.5), return_complex=False)
  File "/opt/homebrew/Caskroom/miniconda/base/envs/denoise/lib/python3.10/site-packages/torch/functional.py", line 693, in stft
    input = F.pad(input.view(extended_shape), [pad, pad], pad_mode)
  File "/opt/homebrew/Caskroom/miniconda/base/envs/denoise/lib/python3.10/site-packages/torch/nn/functional.py", line 4369, in _pad
    return torch._C._nn.reflection_pad1d(input, pad)
RuntimeError: Argument #4: Padding size should be less than the corresponding input dimension, but got: padding (256, 256) at dimension 2 of input [1, 230400, 2]
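The shape in the error message, [1, 230400, 2], suggests that my array has 230400 frames and 2 channels, so the voice file may be stereo while mix.wav is mono. torch.stft would then treat the array as 230400 separate signals of length 2, and a reflection padding of 256 cannot fit a length of 2. Below is a minimal sketch of a possible workaround, assuming that is the cause and that the audio is loaded as a NumPy array of shape (frames, channels). The helper name `to_mono` is hypothetical and not part of the project.

```python
import numpy as np


def to_mono(audio: np.ndarray) -> np.ndarray:
    """Downmix (frames, channels) audio to a 1-D mono signal.

    Hypothetical helper, not part of the project. It assumes the channel
    axis is the last one, i.e. shape (frames, channels).
    """
    if audio.ndim == 1:
        # Already mono: nothing to do.
        return audio
    if audio.ndim == 2:
        # Average the channels to get one sample per frame.
        return audio.mean(axis=1)
    raise ValueError(f"expected 1-D or 2-D audio, got shape {audio.shape}")


# Stand-in for the array loaded from the voice file (230400 frames, 2 channels).
stereo = np.zeros((230400, 2), dtype=np.float32)
mono = to_mono(stereo)
print(stereo.shape, "->", mono.shape)
```

If this guess is right, passing `to_mono(mix)` instead of `mix` to `torch.from_numpy` on line 18 of infer.py should give torch.stft a 1-D signal that is long enough to pad. I have not confirmed that this is the intended way to handle stereo input.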
ffprobe mix.wav
ffprobe my voice file (elevenlabs16k.wav)
Here is my voice file, elevenlabs16k.wav.zip. I hope it helps with debugging the issue. Unzip it first (GitHub does not allow uploading WAV files directly).