pytorch / audio

Data manipulation and transformation for audio signal processing, powered by PyTorch
https://pytorch.org/audio
BSD 2-Clause "Simplified" License

Data set discussion #116

Open hagenw opened 5 years ago

hagenw commented 5 years ago

Recently, we released audtorch, an audio package for PyTorch that we started some time ago. It contains a few audio data sets that might be worth integrating here: Mozilla Common Voice, AudioSet, VoxCeleb1, LibriSpeech.

But before doing a pull request, there are a few things that I would like to discuss as I'm not completely happy with our current implementation:

Inherit from an Audio Base class or not?

What is the best way to handle the sampling rate?

Should we handle failures during data loading?

Note that our data sets currently return the data as NumPy arrays, since we use a lot of NumPy transforms, but this can easily be changed.
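To make the three questions above concrete, here is a minimal sketch of one possible answer. It is not the audtorch or torchaudio implementation. The class name `AudioDatasetSketch` and the `skip_errors` flag are hypothetical and chosen only for illustration. It uses no base class and relies only on `__len__` and `__getitem__`, which is the protocol a map-style PyTorch data set follows. It assumes 16-bit PCM WAV files and a little-endian host, and returns the raw samples together with the sampling rate.

```python
import array
import wave


class AudioDatasetSketch:
    """Hypothetical map-style data set; not audtorch or torchaudio API."""

    def __init__(self, files, sampling_rate, skip_errors=False):
        self.files = list(files)
        # Sampling rate that every file is expected to have.
        self.sampling_rate = sampling_rate
        # If True, a file that cannot be read yields None instead of raising.
        self.skip_errors = skip_errors

    def __len__(self):
        return len(self.files)

    def _load(self, path):
        # Read 16-bit PCM samples using only the standard library.
        with wave.open(path, "rb") as f:
            if f.getsampwidth() != 2:
                raise ValueError(f"{path}: only 16-bit PCM is handled here")
            rate = f.getframerate()
            samples = array.array("h", f.readframes(f.getnframes()))
        return samples, rate

    def __getitem__(self, index):
        path = self.files[index]
        try:
            signal, rate = self._load(path)
        except (OSError, EOFError, wave.Error) as error:
            # Failure handling: either skip the item or report which file broke.
            if self.skip_errors:
                return None
            raise RuntimeError(f"Could not load {path}") from error
        # Sampling rate handling: refuse silently mixed rates.
        if rate != self.sampling_rate:
            raise ValueError(
                f"{path}: expected {self.sampling_rate} Hz, got {rate} Hz"
            )
        return signal, rate
```

With this design, `AudioDatasetSketch(files, sampling_rate=16000)[0]` would return a `(signal, 16000)` pair. Returning `None` for unreadable files pushes the filtering to the caller (for example a custom collate function), which is one of the trade-offs the failure-handling question is about.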

cpuhrsch commented 5 years ago

Hello @hagenw,

Thank you for opening this issue. I can't quite give you a good reply to this just yet, because dataset abstractions are a topic we might need to revisit at a grand scale. I'll read through this in detail soon, but I want to signal you're heard via this reply.

Thanks, Christian

hagenw commented 5 years ago

Thanks for replying. There is no need to rush. The topic is indeed not trivial and it might be a good idea to make the right decisions at the beginning.

vincentqb commented 5 years ago

For reference: pytorch/pytorch#24915