🐛 Describe the bug
According to the documentation of torchaudio.load(), the expected behaviour for 24-bit WAV files is the following:
Since torch does not support int24 dtype, 24-bit signed PCM are converted to int32 tensors.
When calling torchaudio.load() on Windows, using PySoundFile, on a 24-bit WAV file, lines 222-228:
with soundfile.SoundFile(filepath, "r") as file_:
    if file_.format != "WAV" or normalize:
        dtype = "float32"
    elif file_.subtype not in _SUBTYPE2DTYPE:
        raise ValueError(f"Unsupported subtype: {file_.subtype}")
    else:
        dtype = _SUBTYPE2DTYPE[file_.subtype]
and _SUBTYPE2DTYPE is:
_SUBTYPE2DTYPE = {
    "PCM_S8": "int8",
    "PCM_U8": "uint8",
    "PCM_16": "int16",
    "PCM_32": "int32",
    "FLOAT": "float32",
    "DOUBLE": "float64",
}
raise the following error:
ValueError: Unsupported subtype: PCM_24
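To reproduce this without hunting for a suitable file, a 24-bit WAV can be generated with the standard library alone. The following is a minimal sketch: the helper name write_pcm24_wav and the file name pcm24_test.wav are made up for illustration, and the torchaudio call at the bottom assumes a torchaudio 2.x install with the soundfile backend available. Note that, going by the snippet above, the lookup in _SUBTYPE2DTYPE is only reached when normalize is False.

```python
import math
import struct
import wave


def write_pcm24_wav(path, n_frames=800, sample_rate=8000, freq=440.0):
    """Write a mono 24-bit signed PCM sine wave (hypothetical helper)."""
    frames = bytearray()
    for n in range(n_frames):
        # Half-scale sine, well inside the signed 24-bit range.
        sample = int(0.5 * (2**23 - 1) * math.sin(2 * math.pi * freq * n / sample_rate))
        # Pack as little-endian int32 and keep the low 3 bytes (signed 24-bit).
        frames += struct.pack("<i", sample)[:3]
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(3)  # 3 bytes per sample = 24-bit
        wav.setframerate(sample_rate)
        wav.writeframes(bytes(frames))
    return n_frames


if __name__ == "__main__":
    write_pcm24_wav("pcm24_test.wav")
    try:
        import torchaudio
    except ImportError:
        torchaudio = None
    if torchaudio is not None:
        # Assumption: normalize=False is what routes through _SUBTYPE2DTYPE.
        waveform, sr = torchaudio.load(
            "pcm24_test.wav", normalize=False, backend="soundfile"
        )
        print(waveform.dtype, sr)
```

On an affected install, the load call is expected to fail with the ValueError shown above rather than print a dtype.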
Adding a simple "PCM_24": "int32" entry to _SUBTYPE2DTYPE solves the problem.
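The effect of that one-line change can be illustrated in isolation. The sketch below is not the torchaudio source: resolve_dtype, ORIGINAL_SUBTYPE2DTYPE and PATCHED_SUBTYPE2DTYPE are hypothetical names that only mirror the branch logic quoted above, so the behaviour before and after the change can be compared without touching the installed package.

```python
# Mapping as quoted in this report.
ORIGINAL_SUBTYPE2DTYPE = {
    "PCM_S8": "int8",
    "PCM_U8": "uint8",
    "PCM_16": "int16",
    "PCM_32": "int32",
    "FLOAT": "float32",
    "DOUBLE": "float64",
}

# Same mapping plus the proposed entry.
PATCHED_SUBTYPE2DTYPE = {**ORIGINAL_SUBTYPE2DTYPE, "PCM_24": "int32"}


def resolve_dtype(file_format, subtype, normalize, table):
    """Mirror of the dtype selection in the quoted snippet (illustrative only)."""
    if file_format != "WAV" or normalize:
        return "float32"
    if subtype not in table:
        raise ValueError(f"Unsupported subtype: {subtype}")
    return table[subtype]


if __name__ == "__main__":
    try:
        resolve_dtype("WAV", "PCM_24", False, ORIGINAL_SUBTYPE2DTYPE)
    except ValueError as err:
        print("before:", err)
    print("after:", resolve_dtype("WAV", "PCM_24", False, PATCHED_SUBTYPE2DTYPE))
```

The same logic also suggests a possible workaround until the mapping is fixed: with normalize=True the first branch is taken and the lookup is skipped, at the cost of getting float32 samples instead of int32.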
Versions
PyTorch version: 2.3.1+cu121
Is debug build: False
CUDA used to build PyTorch: 12.1
ROCM used to build PyTorch: N/A
OS: Microsoft Windows 10 Pro
GCC version: (MinGW.org GCC-6.3.0-1) 6.3.0
Clang version: Could not collect
CMake version: Could not collect
Libc version: N/A
Python version: 3.12.0 (tags/v3.12.0:0fb18b0, Oct 2 2023, 13:03:39) [MSC v.1935 64 bit (AMD64)] (64-bit runtime)
Python platform: Windows-10-10.0.19045-SP0
Is CUDA available: True
CUDA runtime version: 12.1.105
CUDA_MODULE_LOADING set to: LAZY
GPU models and configuration: GPU 0: NVIDIA GeForce GTX 1050
Nvidia driver version: 555.99
cuDNN version: Could not collect
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True
CPU:
Architecture=9
CurrentClockSpeed=2801
DeviceID=CPU0
Family=198
L2CacheSize=1024
L2CacheSpeed=
Manufacturer=GenuineIntel
MaxClockSpeed=2801
Name=Intel(R) Core(TM) i7-7700HQ CPU @ 2.80GHz
ProcessorType=3
Revision=
Versions of relevant libraries:
[pip3] numpy==1.26.4
[pip3] torch==2.3.1+cu121
[pip3] torchaudio==2.3.1+cu121
[pip3] torchvision==0.18.1+cu121
[conda] Could not collect