Bump torchaudio from 0.12.1 to 0.13.1

Bumps torchaudio from 0.12.1 to 0.13.1.

Release notes

TorchAudio 0.13.1 Release Note

This is a minor release that is compatible with PyTorch 1.13.1 and includes bug fixes, improvements, and documentation updates. No new features are added.
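For reviewers who want to confirm that an environment matches the pairing stated above, here is a minimal sketch. The helper names `strip_local` and `is_expected_pair` and the `EXPECTED_TORCH` table are hypothetical and not part of torchaudio; the table contains only the one pairing this release note names.

```python
# Hypothetical helper (not a torchaudio API): checks a version pair against
# the pairing stated in the release note (torchaudio 0.13.1 with PyTorch 1.13.1).
EXPECTED_TORCH = {"0.13.1": "1.13.1"}  # only the pairing named in the note


def strip_local(version: str) -> str:
    """Drop a local build suffix such as '+cu117' from a version string."""
    return version.split("+", 1)[0]


def is_expected_pair(torchaudio_version: str, torch_version: str) -> bool:
    """Return True if the pair matches the pairing listed in EXPECTED_TORCH."""
    expected = EXPECTED_TORCH.get(strip_local(torchaudio_version))
    return expected is not None and expected == strip_local(torch_version)
```

In an environment with both packages installed, the check could be called as `is_expected_pair(torchaudio.__version__, torch.__version__)`.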

Bug Fix

IO

Make buffer size configurable in ffmpeg file object operations and set size in backend (#2810)

Fix issue with the missing video frame in StreamWriter (#2789)

Fix decimal FPS handling in StreamWriter (#2831)

Fix wrong frame allocation in StreamWriter (#2905)

Fix duplicated memory allocation in StreamWriter (#2906)

Model

Fix HuBERT model initialization (#2846, #2886)

Recipe

Fix issues in HuBERT fine-tuning recipe (#2851)

Fix automatic mixed precision in HuBERT pre-training recipe (#2854)

torchaudio 0.13.0 Release Note

Highlights

TorchAudio 0.13.0 release includes:

Source separation models and pre-trained bundles (Hybrid Demucs, ConvTasNet)

New datasets and metadata mode for the SUPERB benchmark

Custom language model support for CTC beam search decoding

StreamWriter for audio and video encoding

[Beta] Source Separation Models and Bundles

Hybrid Demucs is a music source separation model that uses both spectrogram and time domain features. It has demonstrated state-of-the-art performance in the Sony Music DeMixing Challenge. (citation: https://arxiv.org/abs/2111.03600)

The TorchAudio v0.13 release includes the following features:

MUSDB_HQ Dataset, which is used in Hybrid Demucs training (docs)

Hybrid Demucs model architecture (docs)

Three factory functions suitable for different sample rate ranges

Pre-trained pipelines (docs) and tutorial

SDR results (dB) of pre-trained pipelines on the MUSDB-HQ test set

| Pipeline | All | Drums | Bass | Other | Vocals |
|---|---|---|---|---|---|
| HDEMUCS_HIGH_MUSDB* | 6.42 | 7.76 | 6.51 | 4.47 | 6.93 |
| HDEMUCS_HIGH_MUSDB_PLUS** | 9.37 | 11.38 | 10.53 | 7.24 | 8.32 |

\* Trained on the training data of MUSDB-HQ dataset.

\*\* Trained on both training and test sets of MUSDB-HQ and 150 extra songs from an internal database that were specifically produced for Meta.

Special thanks to @adefossez for the guidance.

The ConvTasNet model architecture was added in TorchAudio 0.7.0. It is the first source separation model that outperforms the oracle ideal ratio mask. This release adds a pre-trained pipeline that was trained within TorchAudio on the Libri2Mix dataset. The pipeline achieves a 15.6 dB SDR improvement and a 15.3 dB Si-SNR improvement on the Libri2Mix test set.
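To make the Si-SNR figure easier to interpret, the following is a minimal pure-Python sketch of the scale-invariant signal-to-noise ratio and of an "improvement" computed as the metric on the separated output minus the metric on the unprocessed mixture. The function names are hypothetical, the zero-mean step is an assumption that matches common practice, and this is not necessarily how the TorchAudio recipe computes its reported numbers.

```python
import math
from typing import Sequence


def si_snr_db(estimate: Sequence[float], reference: Sequence[float]) -> float:
    """Scale-invariant SNR in dB of `estimate` with respect to `reference`."""
    if len(estimate) != len(reference) or len(reference) == 0:
        raise ValueError("signals must be non-empty and of equal length")
    n = len(reference)
    # Assumption: remove the mean of each signal before projecting.
    est_mean = sum(estimate) / n
    ref_mean = sum(reference) / n
    est = [x - est_mean for x in estimate]
    ref = [x - ref_mean for x in reference]
    ref_energy = sum(r * r for r in ref)
    if ref_energy == 0.0:
        raise ValueError("reference has zero energy")
    # Project the estimate onto the reference to get the target component.
    scale = sum(e * r for e, r in zip(est, ref)) / ref_energy
    target = [scale * r for r in ref]
    noise = [e - t for e, t in zip(est, target)]
    target_energy = sum(t * t for t in target)
    noise_energy = sum(v * v for v in noise)
    if noise_energy == 0.0:
        return math.inf
    if target_energy == 0.0:
        return -math.inf
    return 10.0 * math.log10(target_energy / noise_energy)


def si_snr_improvement_db(
    estimate: Sequence[float],
    mixture: Sequence[float],
    reference: Sequence[float],
) -> float:
    """Gain of the separated estimate over the unprocessed mixture, in dB."""
    return si_snr_db(estimate, reference) - si_snr_db(mixture, reference)
```

Because the estimate is projected onto the reference, rescaling the estimate by a constant factor leaves the result unchanged, which is what "scale-invariant" refers to.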

[Beta] Datasets and Metadata Mode for SUPERB Benchmarks

With the addition of four new audio-related datasets, there is now support for all downstream tasks in version 1 of the SUPERB benchmark. Furthermore, these datasets support metadata mode through a get_metadata function, which enables faster dataset iteration or preprocessing without the need to load or store waveforms.
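The idea behind metadata mode can be illustrated with a toy dataset. This is a sketch under stated assumptions: `ToyAudioDataset`, `Entry`, and the injected `load_waveform` callable are hypothetical and are not torchaudio classes; the only detail taken from the note above is that a `get_metadata` method returns per-sample information without loading the waveform.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Entry:
    """Hypothetical per-sample record kept in memory by the dataset."""
    relative_path: str
    sample_rate: int
    label: str


class ToyAudioDataset:
    """Hypothetical dataset illustrating metadata mode; not torchaudio code."""

    def __init__(
        self,
        entries: List[Entry],
        load_waveform: Callable[[str], List[float]],
    ) -> None:
        self._entries = entries
        self._load_waveform = load_waveform  # the only step that reads audio

    def __len__(self) -> int:
        return len(self._entries)

    def get_metadata(self, n: int) -> Tuple[str, int, str]:
        """Return (path, sample_rate, label) without touching the audio."""
        entry = self._entries[n]
        return entry.relative_path, entry.sample_rate, entry.label

    def __getitem__(self, n: int) -> Tuple[List[float], int, str]:
        """Return (waveform, sample_rate, label); this triggers the load."""
        path, sample_rate, label = self.get_metadata(n)
        return self._load_waveform(path), sample_rate, label
```

Iterating with `get_metadata` is enough for tasks such as building a label index or filtering by sample rate, so the cost of decoding audio is paid only when `dataset[n]` is actually accessed.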

Datasets with metadata functionality:


... (truncated)

Commits

b90d798 Update author and maintainer info (#2911)
4adbd54 Fix duplicated memory allocation in StreamWriter (#2906)
30a1070 Fix wrong frame allocation in StreamWriter (#2905)
fbf968c [Rlease only change] Advance version for nightly (#2903)
6b13e26 Fix _init_hubert_pretrain_model (#2886)
1533b2c Update decoder doc (#2865)
235add9 packaging: Specify otool / install_name_tool (#2828)
0c1e6f2 Fix decimal FPS handling StreamWriter (#2831)
199a6ee Fix issue with the missing video frame in StreamWriter (#2789)
030646c Enable mixed precision training for hubert_pretrain_model (#2854)
Additional commits viewable in compare view

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.

Dependabot commands and options

You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
- `@dependabot ignore this major version` will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)

shirayu / whispering

Bump torchaudio from 0.12.1 to 0.13.1 #70
