JusperLee / Dual-Path-RNN-Pytorch

Dual-path RNN: efficient long sequence modeling for time-domain single-channel speech separation, implemented in PyTorch
Apache License 2.0
417 stars, 65 forks

Input Normalization #16

Open JunzheJosephZhu opened 4 years ago

JunzheJosephZhu commented 4 years ago

I'm not sure if my mixing is exactly the same as yours, but does your torchaudio read the wav files as int (values typically around a couple hundred) or as float values in [-1, 1]?

I started with scipy, which loads to int, and that caused the loss to go to NaN at some point. So I switched to librosa, which loads to float.

JusperLee commented 4 years ago

Float values in [-1, 1]. If you start with scipy, which loads to int, you need to normalize the audio to [-1, 1] yourself.
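
For anyone who wants to keep an integer-loading reader such as scipy, here is a minimal sketch of the normalization described above. It is not code from this repo: the helper name `pcm_to_float` is hypothetical, and it assumes the samples arrive as a NumPy array of integer PCM (which is what `scipy.io.wavfile.read` returns for standard PCM wav files).

```python
import numpy as np


def pcm_to_float(samples: np.ndarray) -> np.ndarray:
    """Scale integer PCM samples to float32 in [-1, 1].

    Hypothetical helper for illustration; float input is passed through.
    """
    if np.issubdtype(samples.dtype, np.floating):
        # Already float (e.g. loaded with librosa): only unify the dtype.
        return samples.astype(np.float32)
    if samples.dtype == np.uint8:
        # 8-bit wav data is unsigned and centred on 128.
        return (samples.astype(np.float32) - 128.0) / 128.0
    # Signed PCM: divide by the magnitude of the most negative value,
    # e.g. 32768 for int16.
    full_scale = float(np.iinfo(samples.dtype).max) + 1.0
    return samples.astype(np.float32) / full_scale


# Stand-in for the array returned by an integer-loading wav reader.
raw = np.array([-32768, -16384, 0, 16384, 32767], dtype=np.int16)
audio = pcm_to_float(raw)
```

Dividing by a fixed full-scale value (rather than by each file's own peak) keeps relative levels between files unchanged, which is probably what you want if the mixtures were created at specific SNRs.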