I'm not sure whether my mixing is exactly the same as yours, but does your torchaudio read the wav files as int (values typically around a couple hundred) or as float values in [-1, 1]?
I started with scipy, which loads to int, and that caused the loss to go to NaN at some point, so I switched to librosa, which loads to float.
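
In case it helps, here is a minimal sketch of rescaling integer samples to floats before training, assuming signed integer PCM such as the int16 arrays scipy returns for 16-bit wav files. The helper name `pcm_to_float` is made up for illustration, and this is not necessarily how librosa or torchaudio scale internally.

```python
import numpy as np


def pcm_to_float(samples: np.ndarray) -> np.ndarray:
    """Scale signed integer PCM samples to float32 in [-1, 1).

    Hypothetical helper for illustration only. Float input is passed
    through unchanged apart from a cast to float32.
    """
    if np.issubdtype(samples.dtype, np.floating):
        return samples.astype(np.float32)
    if np.issubdtype(samples.dtype, np.signedinteger):
        # Magnitude of the most negative value of the dtype,
        # e.g. 32768 for int16.
        scale = float(-np.iinfo(samples.dtype).min)
        return samples.astype(np.float32) / scale
    # Unsigned data (e.g. 8-bit wav) needs an offset as well as a scale,
    # which this sketch does not handle.
    raise TypeError(f"unsupported sample dtype: {samples.dtype}")


# Stand-in for the data array returned by scipy.io.wavfile.read
pcm = np.array([-32768, 0, 16384, 32767], dtype=np.int16)
audio = pcm_to_float(pcm)
```

With this, the scipy-loaded arrays end up in the same [-1, 1] range as the librosa ones, so the two loaders can be compared on equal footing.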