pytorch / audio

Data manipulation and transformation for audio signal processing, powered by PyTorch
https://pytorch.org/audio
BSD 2-Clause "Simplified" License

Add Free Universal Sound Separation (FUSS) Dataset #534

Open cHemingway opened 4 years ago

cHemingway commented 4 years ago

🚀 Feature

Add the FUSS dataset to torchaudio.datasets.

Motivation

While PyTorch Audio contains a variety of clean speech datasets, it does not contain any noise samples or room impulse responses. This makes it harder to set up speech or audio separation problems, or simply to train recognition in the presence of noise or reverb. FUSS is likely to be well supported and widely used, as it is part of DCASE 2020.

Pitch

Implement this under torchaudio.datasets, exposing a very similar API to existing datasets. Both ssdata (dry) and ssdata_reverb (reverberated) should be implemented, potentially as different functions.
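To make the pitch concrete, here is a rough sketch of what such a dataset class could look like. It is not existing torchaudio API: the class name `FUSSIndex`, the `variant` and `subset` parameters, and the assumed `<subset>_example_list.txt` layout (one tab-separated line per example, mixture path first, then its source paths) are all assumptions for illustration. It uses a single class with a `variant` argument rather than separate functions, and it only indexes file paths, leaving the audio loading out.

```python
import os
from typing import List, Tuple


class FUSSIndex:
    """Hypothetical sketch of a FUSS-style dataset index (not torchaudio API)."""

    # Folder names taken from the pitch: dry and reverberated data.
    _VARIANTS = ("ssdata", "ssdata_reverb")

    def __init__(self, root: str, variant: str = "ssdata", subset: str = "train") -> None:
        if variant not in self._VARIANTS:
            raise ValueError(f"variant must be one of {self._VARIANTS}")
        self._base = os.path.join(root, variant)
        # Assumed layout: one tab-separated line per example,
        # mixture path first, followed by the paths of its sources.
        list_path = os.path.join(self._base, f"{subset}_example_list.txt")
        with open(list_path) as f:
            self._examples = [line.rstrip("\n").split("\t") for line in f if line.strip()]

    def __getitem__(self, n: int) -> Tuple[str, List[str]]:
        mixture, *sources = self._examples[n]
        # A full implementation would load these files (for example with
        # torchaudio.load) and return tensors instead of paths.
        mixture_path = os.path.join(self._base, mixture)
        source_paths = [os.path.join(self._base, s) for s in sources]
        return mixture_path, source_paths

    def __len__(self) -> int:
        return len(self._examples)
```

Under these assumptions, `FUSSIndex(root, variant="ssdata_reverb", subset="validation")` would index the reverberated validation examples in the same way.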

Alternatives

An alternative would be to add separate datasets for noise samples (e.g. NOISEX-92, if licensing permits) and room impulse responses, in a format that lets the user easily mix them into train/validation/test sets together with the speech datasets already available in the API.
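As an illustration of the kind of mixing this alternative would leave to the user, here is a minimal sketch that adds noise to speech at a chosen signal-to-noise ratio. The function name `mix_at_snr` and its signature are hypothetical, and it works on plain Python sequences of samples to stay self-contained; the same arithmetic would apply to tensors.

```python
import math
from typing import List, Sequence


def mix_at_snr(speech: Sequence[float], noise: Sequence[float], snr_db: float) -> List[float]:
    """Mix noise into speech so the result has the requested SNR in dB."""
    if not speech or not noise:
        raise ValueError("speech and noise must be non-empty")

    # Repeat (or truncate) the noise so it covers the whole speech signal.
    noise = [noise[i % len(noise)] for i in range(len(speech))]

    # Average power of each signal.
    speech_power = sum(x * x for x in speech) / len(speech)
    noise_power = sum(x * x for x in noise) / len(noise)
    if noise_power == 0.0:
        # Silent noise cannot be scaled to any SNR; return the speech as-is.
        return list(speech)

    # Scale the noise so that speech_power / scaled_noise_power == 10^(snr_db / 10).
    gain = math.sqrt(speech_power / (noise_power * 10 ** (snr_db / 10)))
    return [s + gain * n for s, n in zip(speech, noise)]
```

Room impulse responses would be applied by convolution rather than addition, which is not shown here.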

vincentqb commented 4 years ago

Thanks for suggesting this! Let's take a little time to discuss which datasets would be good to add: #550. Thoughts?

cHemingway commented 4 years ago

Agreed, this should be decided in general. I'll close this.

vincentqb commented 4 years ago

Since we do not yet have too many datasets, I'll say that, if you are willing to open a pull request implementing this dataset following our current templates, I'll be happy to review it. :)

I'll re-open this issue in case you would like to work on this. Would you be interested in doing so?