Open hwangjeff opened 1 year ago
Related discussion in core: https://github.com/pytorch/pytorch/issues/81102. FFmpeg integration is currently duplicated between torchvision and torchaudio. It would be cool if it moved to a single implementation (perhaps in a new, separate package?).
I also support eliminating the global backend state and requiring users to manage this selection themselves if they want to use a non-default backend.
Hi @vadimkantorov — thanks for flagging. Somewhat independently of this particular issue, we are indeed considering consolidating media I/O in a separate package. We'll post updates on the outcomes of our discussions to https://github.com/pytorch/pytorch/issues/81102.
Overview
We propose the following end state for TorchAudio’s I/O functions info, load, and save:
Context
TorchAudio’s functions info, load, and save currently rely on two third-party libraries: SoX and soundfile. Whereas SoX is used in the Linux and Mac distributions, soundfile is used in the Windows distribution.
Through the years, we’ve encountered several issues with SoX:
Separately, our work around streaming I/O introduced FFmpeg as a dependency. FFmpeg's advantages over SoX include the following:
End state
To address the issues above, we propose the following end state:
We anticipate this end state bringing greater cross-platform consistency, simplifying our codebase, and delivering an improved user experience.
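To make the idea of per-call backend selection without global state (raised in the comments above) more concrete, here is a minimal sketch. It is not TorchAudio's actual implementation: the registry, the stub loader functions, the `backend` keyword, and the choice of `"ffmpeg"` as the default are all assumptions made for illustration.

```python
from typing import Callable, Dict, Optional


# Hypothetical stub loaders; a real backend would decode audio here.
def _load_ffmpeg(path: str) -> str:
    return f"ffmpeg:{path}"


def _load_sox(path: str) -> str:
    return f"sox:{path}"


def _load_soundfile(path: str) -> str:
    return f"soundfile:{path}"


# Read-only registry of available backends (illustrative names only).
_BACKENDS: Dict[str, Callable[[str], str]] = {
    "ffmpeg": _load_ffmpeg,
    "sox": _load_sox,
    "soundfile": _load_soundfile,
}

# Assumed default for this sketch; the proposal's actual default may differ.
DEFAULT_BACKEND = "ffmpeg"


def load(path: str, backend: Optional[str] = None) -> str:
    """Dispatch to a backend chosen per call, with no mutable global setting."""
    name = DEFAULT_BACKEND if backend is None else backend
    try:
        loader = _BACKENDS[name]
    except KeyError:
        raise ValueError(
            f"unknown backend {name!r}; expected one of {sorted(_BACKENDS)}"
        ) from None
    return loader(path)
```

With this shape, a caller who wants a non-default backend passes it explicitly, for example `load("a.wav", backend="sox")`, and no other code path is affected by that choice.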
Plan
Release 2.0
Release 2.1
torchaudio.sox_effects. #3497
Release 2.2