pytorch / audio

Data manipulation and transformation for audio signal processing, powered by PyTorch
https://pytorch.org/audio
BSD 2-Clause "Simplified" License

TorchAudio Dispatcher Migration #2950

Open hwangjeff opened 1 year ago

hwangjeff commented 1 year ago

Overview

We propose the following end state for TorchAudio’s I/O functions info, load, save:

Through the years, we’ve encountered several issues with SoX:

Separately, our work around streaming I/O introduced FFmpeg as a dependency. FFmpeg's advantages over SoX include the following:

End state

To address the issues above, we propose the following end state:

We anticipate this end state bringing greater cross-platform consistency, simplifying our codebase, and delivering an improved user experience.

Plan

Release 2.0

Release 2.1

Release 2.2

vadimkantorov commented 1 year ago

Related discussion in core: https://github.com/pytorch/pytorch/issues/81102. FFmpeg integration is currently duplicated between torchvision and torchaudio. It would be cool if it moved to a single implementation (in a new, separate package?).

I also support eliminating global backend state and requiring users to maintain this selection themselves if they want to use a non-default backend.
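To make the "no global backend state" idea concrete, here is a minimal sketch of a dispatcher that takes the backend as a per-call argument instead of reading a module-level setting. Everything in it is hypothetical and for illustration only: the `_BACKENDS` registry, the `_DEFAULT_ORDER` preference tuple, and the stand-in loader functions are assumptions, not TorchAudio's actual implementation.

```python
from typing import Callable, Dict, Optional

# Hypothetical registry mapping a backend name to a loader function.
# The loaders are stand-ins that return a string instead of decoding audio.
_BACKENDS: Dict[str, Callable[[str], str]] = {
    "ffmpeg": lambda path: f"ffmpeg decoded {path}",
    "sox": lambda path: f"sox decoded {path}",
}

# Assumed preference order used when the caller does not pick a backend.
_DEFAULT_ORDER = ("ffmpeg", "sox")


def load(path: str, backend: Optional[str] = None) -> str:
    """Dispatch to a backend chosen per call; no global state is consulted."""
    if backend is None:
        # Fall back to the first available backend in the preference order.
        backend = next(name for name in _DEFAULT_ORDER if name in _BACKENDS)
    elif backend not in _BACKENDS:
        raise ValueError(f"Unknown backend: {backend!r}")
    return _BACKENDS[backend](path)


if __name__ == "__main__":
    print(load("clip.wav"))                 # uses the default backend
    print(load("clip.wav", backend="sox"))  # caller opts into a non-default backend
```

With this shape, a user who wants a non-default backend passes it explicitly at each call site (or wraps `load` with `functools.partial`), so two libraries in the same process cannot change each other's behavior through a shared setting.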

hwangjeff commented 1 year ago

Hi @vadimkantorov — thanks for flagging. Somewhat independently of this particular issue, we are indeed considering consolidating media I/O in a separate package. We'll post updates on the outcomes of our discussions to https://github.com/pytorch/pytorch/issues/81102.