spatialaudio / python-sounddevice

Play and Record Sound with Python
https://python-sounddevice.readthedocs.io/
MIT License

I hear crackling sound when using 32bit integer numpy arrays to play audio #347

Open physxP opened 3 years ago

physxP commented 3 years ago

Code to reproduce the issue:

import numpy as np
import sounddevice as sd

frequency = 1000
sample_rate = 44100
duration = 2  # seconds
num_frames = int(duration * sample_rate)

t = np.arange(num_frames) / sample_rate
max_amp = 2**31 - 1  # maximum value of signed int32
signal = max_amp * np.sin(2 * np.pi * frequency * t)
signal = signal.astype(np.int32)
sd.play(signal, samplerate=sample_rate)

Current OS: Windows 10

Things I have tried:

HaHeho commented 3 years ago

Why do you say that you need to work with int32 (does your original data or processing generate it)?

In any case, you should be able to use .astype(np.float32) (and potentially scale the amplitude) to make it work, right?

HaHeho commented 3 years ago

I have tested a bit on my system (Win10) with int32:

import numpy as np
import sounddevice as sd

frequency = 1000
sample_rate = 44100
duration = 1  # seconds
amplitude = 0.1  # -20 dB, clean
amplitude = 1  # 0 dB, results in crackles

num_frames = int(duration * sample_rate)
t = np.arange(num_frames) / sample_rate
signal = amplitude * np.sin(2 * np.pi * frequency * t)

print(f'play dtype: {signal.dtype}, max: {np.max(signal):.2f}')
sd.play(signal, samplerate=sample_rate, blocking=True)

signal *= 2 ** 31 - 1  # maximum value of signed int32
signal = signal.astype(np.int32)

print(f'play dtype: {signal.dtype}, max: {np.max(signal):.2f}')
sd.play(signal, samplerate=sample_rate, blocking=True)

What is suspicious is that amplitude = 0.1 gives a clean output with identical volume for float and int, so the amplitude scaling is correct.

However, amplitude = 1 gives the crackling output, as if the signal were clipping, although the value range should be fine, right?
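
The value-range question can be checked without any audio hardware. The following is a minimal sketch, assuming only NumPy; `to_int32` is a hypothetical helper name and not part of sounddevice. It checks that a full-scale sine scaled by 2**31 - 1 stays inside the int32 range, which would suggest that any clipping happens after the array has been handed to the audio backend.

```python
import numpy as np

INT32_MAX = 2**31 - 1  # largest value of a signed 32-bit integer


def to_int32(float_signal):
    """Scale a float signal in [-1.0, 1.0] to int32 (hypothetical helper)."""
    scaled = np.asarray(float_signal, dtype=np.float64) * INT32_MAX
    return scaled.astype(np.int32)  # truncates toward zero


sample_rate = 44100
t = np.arange(sample_rate) / sample_rate  # one second of sample times
full_scale = np.sin(2 * np.pi * 1000 * t)  # amplitude = 1 (0 dB)

as_int = to_int32(full_scale)

# Check that the converted samples stay inside the int32 range.
in_range = as_int.min() >= -INT32_MAX and as_int.max() <= INT32_MAX
print("within int32 range:", in_range)

# Converting back should reproduce the float signal to within one int32 step.
round_trip_error = np.max(np.abs(as_int / INT32_MAX - full_scale))
print("round trip within one step:", round_trip_error <= 1 / INT32_MAX)
```

The scaling is done in float64 on purpose, since float64 can represent every int32 value exactly.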

physxP commented 3 years ago

> Why do you say that you need to work with int32 (does your original data or processing generate it)?

> In any case, you should be able to use .astype(np.float32) (and potentially scale the amplitude) to make it work, right?

I am working under a tight precision constraint, so the loss of precision when converting from int32 to float32 is not desirable. I also need to work with 24-bit audio, and from what I have read, I think sounddevice only works with int24. I have the same issue with int24.
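
The precision concern can be illustrated directly: float32 has a 24-bit significand, so not every int32 value survives the conversion, whereas float64 represents every int32 value exactly. A small sketch, assuming only NumPy, with sample values picked purely for illustration:

```python
import numpy as np

# Illustrative values: 2**24 (up to which float32 represents every integer
# exactly), its successor, and the int32 maximum.
samples = np.array([2**24, 2**24 + 1, 2**31 - 1], dtype=np.int32)

as_float32 = samples.astype(np.float32)
as_float64 = samples.astype(np.float64)

# Compare in int64 so that a float32 value rounded up to 2**31 does not
# overflow when it is converted back to an integer.
changed_by_float32 = as_float32.astype(np.int64) != samples
changed_by_float64 = as_float64.astype(np.int64) != samples

print(changed_by_float32)  # which samples were altered by float32
print(changed_by_float64)  # which samples were altered by float64
```

Values that fit in the signed 24-bit range are below 2**24 in magnitude, so they pass through float32 unchanged; the loss only affects data that uses more than 24 significant bits.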

physxP commented 3 years ago

> What is suspicious is that amplitude = 0.1 gives a clean output with identical volume for float and int, so the amplitude scaling is correct.

> However, amplitude = 1 gives the crackling output, as if the signal were clipping, although the value range should be fine, right?

Yes, it should work, but something wonky is going on. With an amplitude of just 0.99, the crackling sound is already gone. I am going to check the frequency spectrum of the integer signal for noise.

Update: I ran the code on Ubuntu 20.04 and it plays without the crackling noise, so the problem might be on the Windows side.

mgeier commented 3 years ago

Did you try different host APIs on Windows?

physxP commented 3 years ago

Can you please tell me how I can do that? I couldn't find anything related to changing host APIs in the documentation.

mgeier commented 3 years ago

If there are multiple host APIs, your physical devices will appear multiple times in the device list. Each entry contains the host API name after the device name.

So by selecting a specific device from the list, you automatically select a host API at the same time.
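
The selection described above could be sketched as follows. This is a rough illustration, not an official recipe: `output_devices_by_hostapi` and `play_with_hostapi` are hypothetical helper names, and the code assumes that `sd.query_devices()` and `sd.query_hostapis()` return dict-like entries with the keys `hostapi`, `max_output_channels` and `name`.

```python
def output_devices_by_hostapi(devices, hostapis):
    """Group output-capable device indices by host API name.

    `devices` and `hostapis` are sequences of dicts shaped like the results
    of sd.query_devices() and sd.query_hostapis() (an assumption of this
    sketch).
    """
    grouped = {api["name"]: [] for api in hostapis}
    for index, device in enumerate(devices):
        if device["max_output_channels"] > 0:
            api_name = hostapis[device["hostapi"]]["name"]
            grouped[api_name].append(index)
    return grouped


def play_with_hostapi(signal, samplerate, hostapi_name):
    """Play `signal` on the first output device of the named host API."""
    # Imported lazily so the grouping helper can be used without PortAudio.
    import sounddevice as sd

    grouped = output_devices_by_hostapi(sd.query_devices(), sd.query_hostapis())
    candidates = grouped.get(hostapi_name, [])
    if not candidates:
        raise ValueError(f"no output device found for host API {hostapi_name!r}")
    # Passing a device index selects the host API at the same time.
    sd.play(signal, samplerate=samplerate, device=candidates[0], blocking=True)
```

On a Windows system, a call such as `play_with_hostapi(signal, 44100, "Windows WDM-KS")` would then play through that host API, provided it is listed on the machine.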

> I couldn't find anything related to changing host APIs in the documentation

Yeah, I think that's not mentioned explicitly.

Do you have an idea how and where we could document this? Would you like to make a documentation PR for this?

physxP commented 3 years ago

Thanks for the quick response. Changing the host API fixed the issue! I tried all the listed APIs, and switching to the "Windows WDM-KS" API got rid of the crackling. List of APIs and results for future reference:

> Do you have an idea how and where we could document this? Would you like to make a documentation PR for this?

Sure, I would love to help. I think we can mention this in Device Selection. We can modify the heading to something like 'Device and Host API Selection'. We can also add a troubleshooting section in which we can mention common issues and their fixes starting with this one. What do you think?

mgeier commented 3 years ago

Sounds good, I'm looking forward to a PR!

About the crackling issue: It's great to hear that it works with WDM-KS. I guess the issues with the other host APIs will have to be fixed in https://github.com/PortAudio/portaudio. Would you like to open an issue there?