pytorch / audio

Data manipulation and transformation for audio signal processing, powered by PyTorch
https://pytorch.org/audio
BSD 2-Clause "Simplified" License

Unsupported subtype: PCM_24 #3806

Open nicobrb opened 3 months ago

nicobrb commented 3 months ago

🐛 Describe the bug

According to the documentation of torchaudio.load(), the expected behaviour for 24-bit WAV files is the following:

Since torch does not support int24 dtype, 24-bit signed PCM are converted to int32 tensors.

When calling torchaudio.load() on Windows, using PySoundFile, on a 24-bit WAV file, lines 222-228:


    with soundfile.SoundFile(filepath, "r") as file_:
        if file_.format != "WAV" or normalize:
            dtype = "float32"
        elif file_.subtype not in _SUBTYPE2DTYPE:
            raise ValueError(f"Unsupported subtype: {file_.subtype}")
        else:
            dtype = _SUBTYPE2DTYPE[file_.subtype]

and _SUBTYPE2DTYPE is:

    _SUBTYPE2DTYPE = {
        "PCM_S8": "int8",
        "PCM_U8": "uint8",
        "PCM_16": "int16",
        "PCM_32": "int32",
        "FLOAT": "float32",
        "DOUBLE": "float64",
    }

raise the following error:

    ValueError: Unsupported subtype: PCM_24

Adding a simple "PCM_24": "int32" entry to _SUBTYPE2DTYPE solves the problem.
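
To illustrate the proposed change, here is a minimal, self-contained sketch of the dtype selection with the extra entry in place. It does not import torchaudio or soundfile. The function name select_dtype and its parameters are hypothetical and only mirror the branch quoted above; the mapping is copied from this report with the "PCM_24" entry added. Note that, per the quoted condition, the lookup is only reached for WAV input when normalize is false.

```python
# Mapping copied from the report, plus the proposed "PCM_24" entry.
_SUBTYPE2DTYPE = {
    "PCM_S8": "int8",
    "PCM_U8": "uint8",
    "PCM_16": "int16",
    "PCM_24": "int32",  # proposed addition: torch has no int24 dtype
    "PCM_32": "int32",
    "FLOAT": "float32",
    "DOUBLE": "float64",
}


def select_dtype(file_format, subtype, normalize):
    """Hypothetical helper mirroring the dtype selection quoted above."""
    # Non-WAV input or normalize=True is always read as float32.
    if file_format != "WAV" or normalize:
        return "float32"
    # Otherwise the subtype must be present in the mapping.
    if subtype not in _SUBTYPE2DTYPE:
        raise ValueError(f"Unsupported subtype: {subtype}")
    return _SUBTYPE2DTYPE[subtype]


# With the extra entry, a 24-bit WAV read with normalize=False maps to int32.
print(select_dtype("WAV", "PCM_24", normalize=False))
```

With the original mapping (without the "PCM_24" line), the same call would take the ValueError branch, which matches the error reported here.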

Versions

PyTorch version: 2.3.1+cu121
Is debug build: False
CUDA used to build PyTorch: 12.1
ROCM used to build PyTorch: N/A

OS: Microsoft Windows 10 Pro
GCC version: (MinGW.org GCC-6.3.0-1) 6.3.0
Clang version: Could not collect
CMake version: Could not collect
Libc version: N/A

Python version: 3.12.0 (tags/v3.12.0:0fb18b0, Oct 2 2023, 13:03:39) [MSC v.1935 64 bit (AMD64)] (64-bit runtime)
Python platform: Windows-10-10.0.19045-SP0
Is CUDA available: True
CUDA runtime version: 12.1.105
CUDA_MODULE_LOADING set to: LAZY
GPU models and configuration: GPU 0: NVIDIA GeForce GTX 1050
Nvidia driver version: 555.99
cuDNN version: Could not collect
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True

CPU:
Architecture=9
CurrentClockSpeed=2801
DeviceID=CPU0
Family=198
L2CacheSize=1024
L2CacheSpeed=
Manufacturer=GenuineIntel
MaxClockSpeed=2801
Name=Intel(R) Core(TM) i7-7700HQ CPU @ 2.80GHz
ProcessorType=3
Revision=

Versions of relevant libraries:
[pip3] numpy==1.26.4
[pip3] torch==2.3.1+cu121
[pip3] torchaudio==2.3.1+cu121
[pip3] torchvision==0.18.1+cu121
[conda] Could not collect