sp-uhh / sgmse

Score-based Generative Models (Diffusion Models) for Speech Enhancement and Dereverberation
MIT License
454 stars, 69 forks

Changing the sample_rate of loaded audio file to 16000 #21

Closed: Aaryan369 closed this pull request 2 months ago

Aaryan369 commented 1 year ago

Hi, this pull request adds a line of code that resamples the audio right after loading it, which prevents the audio from being stretched. The original audio is loaded at a sample rate of 48000 Hz, while the rest of the codebase works at 16000 Hz, so the loaded waveform is now resampled to 16000 Hz using the torchaudio.functional.resample function. Without this step, the 48000 Hz samples are processed as if they had been recorded at 16000 Hz, which stretches the audio to up to 3 times its original length (48000 / 16000 = 3). Resampling keeps the audio at its original length, which improves the accuracy of any downstream processing.
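For illustration, here is a minimal sketch of the idea, not the exact line from this pull request. The helper names (`duration_seconds`, `stretch_factor`, `load_resampled`) and the `TARGET_SR` constant are hypothetical. Only `torchaudio.load` and `torchaudio.functional.resample` are real library calls. The first two helpers show where the factor of 3 comes from, and the third shows one way the resampling step could look.

```python
TARGET_SR = 16000  # assumed target rate, taken from the description above


def duration_seconds(num_samples: int, sample_rate: int) -> float:
    """Playback duration of num_samples when interpreted at sample_rate."""
    return num_samples / sample_rate


def stretch_factor(file_sr: int, assumed_sr: int) -> float:
    """How much longer audio appears when file_sr samples are treated as assumed_sr."""
    return file_sr / assumed_sr


def load_resampled(path: str, target_sr: int = TARGET_SR):
    """Hypothetical helper: load an audio file and resample it to target_sr if needed."""
    import torchaudio  # imported lazily so the arithmetic helpers work without it

    waveform, sr = torchaudio.load(path)  # returns (tensor, sample rate of the file)
    if sr != target_sr:
        waveform = torchaudio.functional.resample(
            waveform, orig_freq=sr, new_freq=target_sr
        )
    return waveform, target_sr


if __name__ == "__main__":
    n = 2 * 48000  # two seconds of audio recorded at 48000 Hz
    print(duration_seconds(n, 48000))    # true duration in seconds
    print(duration_seconds(n, 16000))    # duration if misread as 16000 Hz
    print(stretch_factor(48000, 16000))  # ratio between the two
```

Files that are already at 16000 Hz pass through `load_resampled` unchanged, so the check costs nothing for correctly sampled inputs.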

julius-richter commented 2 months ago

Fixed in the latest commit.