pytorch / audio

Data manipulation and transformation for audio signal processing, powered by PyTorch
https://pytorch.org/audio
BSD 2-Clause "Simplified" License
2.5k stars 644 forks

Torchaudio 2.1 Fails to Load and Save Audios with More than 16 Channels #3659

Closed mravanelli closed 10 months ago

mravanelli commented 11 months ago

🐛 Describe the bug

Hi, I noticed this odd behavior and would like to report it:

import torch
import torchaudio

# This works: 16 channels
signal = torch.rand(16, 16000)
torchaudio.save('signal_multichannel.wav', signal, 16000)

# This does not work: 17 channels
signal = torch.rand(17, 16000)
torchaudio.save('signal_multichannel.wav', signal, 16000)

Versions

Versions of relevant libraries:
[pip3] flake8==3.7.9
[pip3] mypy-extensions==0.4.3
[pip3] numpy==1.21.5
[pip3] numpydoc==1.4.0
[pip3] torch==2.1.0
[pip3] torchaudio==2.1.0
[pip3] torchinfo==1.8.0
[pip3] torchvision==0.16.0
[pip3] triton==2.1.0
[conda] blas 1.0 mkl
[conda] mkl 2021.4.0 h06a4308_640
[conda] mkl-service 2.4.0 py39h7f8727e_0
[conda] mkl_fft 1.3.1 py39hd3c417c_0
[conda] mkl_random 1.2.2 py39h51133e4_0
[conda] numpy 1.21.5 py39h6c91a56_3
[conda] numpy-base 1.21.5 py39ha15fc14_3
[conda] numpydoc 1.4.0 py39h06a4308_0
[conda] torch 2.1.0 pypi_0 pypi
[conda] torchaudio 2.1.0 pypi_0 pypi
[conda] torchinfo 1.8.0 pypi_0 pypi
[conda] torchvision 0.16.0 pypi_0 pypi
[conda] triton 2.1.0 pypi_0 pypi

mthrok commented 11 months ago

Hi @mravanelli

Thanks for the report. This is related to the backend migration. It seems that FFmpeg does not handle certain channel counts well: FFmpeg treats channel_layout as a first-class citizen and the number of channels as a property derived from the channel layout, and it has no suitable channel_layout defined for 17 channels.

Can you pass backend="sox" to these calls? That will fall back to the old sox backend, and it should work.
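For reference, a minimal sketch of that workaround, assuming torchaudio 2.1 with the sox backend available on the system (the helper name save_and_reload_with_sox is made up for illustration and is not part of torchaudio):

```python
def save_and_reload_with_sox(path="signal_multichannel.wav", num_channels=17, sample_rate=16000):
    """Save a random multichannel signal through the sox backend and load it back."""
    # Imported inside the function so the helper can be defined even where
    # torch/torchaudio are not installed.
    import torch
    import torchaudio

    # Shape is (channels, frames): one second of noise per channel.
    signal = torch.rand(num_channels, sample_rate)

    # Passing backend="sox" explicitly bypasses the FFmpeg-based default.
    torchaudio.save(path, signal, sample_rate, backend="sox")
    loaded, loaded_rate = torchaudio.load(path, backend="sox")
    return loaded.shape, loaded_rate


if __name__ == "__main__":
    print(save_and_reload_with_sox())
```

Whether this succeeds depends on the local sox installation, so treat it as something to try rather than a guaranteed fix.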

mravanelli commented 11 months ago

Yes, it works with sox, so the problem is related to FFmpeg. It is not a big issue, though. I think it would be great to raise an error that is easier to interpret when people use the FFmpeg backend with more than 16 channels.
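To illustrate the kind of check I have in mind, here is a rough user-side sketch. The names check_channel_count, save_checked, and MAX_FFMPEG_CHANNELS are hypothetical and not part of torchaudio, and the limit of 16 is only what I observed in this report, not a documented FFmpeg constraint:

```python
MAX_FFMPEG_CHANNELS = 16  # assumption taken from this report: 16 works, 17 fails


def check_channel_count(num_channels, backend):
    """Raise a readable error before an FFmpeg save that is expected to fail."""
    if backend == "ffmpeg" and num_channels > MAX_FFMPEG_CHANNELS:
        raise ValueError(
            f"Got {num_channels} channels, but the FFmpeg backend appears to support "
            f"at most {MAX_FFMPEG_CHANNELS}. Try backend='sox' instead."
        )


def save_checked(path, signal, sample_rate, backend="ffmpeg"):
    """Hypothetical wrapper: validate the channel count, then call torchaudio.save."""
    # Assumes a channels-first tensor of shape (channels, frames).
    check_channel_count(signal.shape[0], backend)

    import torchaudio  # deferred so check_channel_count works without torchaudio

    torchaudio.save(path, signal, sample_rate, backend=backend)
```

With this, save_checked("out.wav", torch.rand(17, 16000), 16000) would fail early with a message that points to the sox backend, instead of surfacing a lower-level FFmpeg error.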
