WangRui-debug opened this issue 2 years ago:

I'm trying to create a room with several speakers and a stable background noise. In this case, I would like to add diffuse noise to the room rather than a point noise source at a single position. Could you please tell me how to implement this? Thank you very much!
Hello, there is no function for that in pyroomacoustics yet. There are two possibilities.
You could try to use the following Python code, based on the paper by Habets et al. I would like to add it to the package, but I am not sure yet how.
```python
import numpy as np
import pyroomacoustics as pra


def generate_diffuse_noise(
    mic_array,
    n_samples=None,
    noise_sample=None,
    n_fft=2048,
    hop=512,
    fs=16000,
    c=343,
):
    """
    Generates spherical diffuse noise according to the method described in
    Habets et al., "Generating nonstationary multisensor signals under a
    spatial coherence constraint," JASA, 2008.

    Parameters
    ----------
    mic_array: (n_mics, n_dim)
        The locations of the sensors (in meters)
    n_samples: int
        The length of the target signal
    noise_sample: numpy.ndarray
        A template noise signal
    n_fft: int
        The length of the FFT in the STFT
    hop: int
        The shift of the STFT
    fs: int
        The sampling frequency (in hertz)
    c: float
        The speed of sound (in meters/second)

    Returns
    -------
    The diffuse noise signal with shape (n_samples, n_mics)
    """
    if noise_sample is None and n_samples is None:
        raise ValueError(
            "One of the desired length `n_samples` or a noise sample "
            "`noise_sample` must be provided"
        )
    elif n_samples is None:
        n_samples = noise_sample.shape[0]

    # keep only the first channel of a multichannel template
    if noise_sample is not None and noise_sample.ndim > 1:
        noise_sample = noise_sample[:, 0]

    n_mics = mic_array.shape[0]

    # STFT parameters
    win_a = pra.hamming(n_fft)
    win_s = pra.transform.compute_synthesis_window(win_a, hop)

    # compute the theoretical spatial coherence matrix, one per frequency bin
    coh = compute_coherence_theory(mic_array, fs, c, n_fft)

    # the mixing (shaping) matrix is a square root of the coherence matrix,
    # obtained from its eigendecomposition
    eigval, eigvec = np.linalg.eigh(coh)
    eigval = np.maximum(eigval, 0.0)
    shaping_matrix = eigvec * np.sqrt(eigval[:, None, :])

    # generate independent sensor noise in the STFT domain (n_frames, n_freq, n_chan)
    shape = ((n_samples + n_fft) // hop, n_fft // 2 + 1, n_mics)
    N = np.random.randn(*shape) + 1j * np.random.randn(*shape)

    # optionally impose the spectral envelope of the template noise
    if noise_sample is not None:
        A = pra.transform.stft.analysis(noise_sample, n_fft, hop, win=win_a)
        l_max = min(N.shape[0], A.shape[0])
        N[:l_max, :, :] *= abs(A[:l_max, :, None])

    # mix the channels to impose the target spatial coherence, then go back to the time domain
    diffuse_fd = np.einsum("fcd,nfd->nfc", shaping_matrix, N)
    diffuse = pra.transform.stft.synthesis(diffuse_fd, n_fft, hop, win=win_s)

    if noise_sample is not None:
        # restore the power of the original noise sample
        std_orig = np.sqrt(np.mean(noise_sample ** 2))
        std_gen = np.sqrt(np.mean(diffuse ** 2))
        diffuse *= std_orig / std_gen

    return diffuse[:n_samples, :]
```
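For example, calling it with a small array might look like this (just a sketch: the array geometry is arbitrary, and `compute_coherence_theory` is a helper discussed further down in this thread):

```python
import numpy as np

# a 4-microphone linear array with 5 cm spacing; the function expects (n_mics, n_dim)
mics = np.array(
    [[0.00, 0.0, 0.0], [0.05, 0.0, 0.0], [0.10, 0.0, 0.0], [0.15, 0.0, 0.0]]
)

# three seconds of spherically diffuse noise at 16 kHz
noise = generate_diffuse_noise(mics, n_samples=3 * 16000, fs=16000)
print(noise.shape)  # (48000, 4)
```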
Thank you very much for your quick response! I will try these two ways. @fakufaku By the way, could you please tell me how to control the SNR in these two cases? I want to add diffuse noise at a desired SNR. Thank you very much!
Sorry, my response is pretty late. I hope it is still relevant.
Controlling the SNR is the same as usual. Generate your signal and noise separately, then adjust the weight of one of them (usually the noise) to achieve the desired SNR.
```python
import numpy as np

# we already prepared `signal` and `noise` arrays of the right size
target_snr = 11.5  # dB, for the sake of the example

sigma_signal = np.sqrt(np.mean(signal ** 2))
sigma_noise = np.sqrt(np.mean(noise ** 2))

# scale the noise so that the mixture has the target SNR
noise_multiplier = (sigma_signal / sigma_noise) * 10 ** (-target_snr / 20.0)
mix = signal + noise_multiplier * noise

# we can check
pwr_signal = np.mean(signal ** 2)
pwr_noise = np.mean((mix - signal) ** 2)
assert np.isclose(10.0 * np.log10(pwr_signal / pwr_noise), target_snr)
```
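To answer the original question about several speakers plus a stable background: one way is to simulate the room as usual and then add diffuse noise generated with the helper above, scaled to the target SNR. A rough sketch follows; the room geometry, source positions, and random placeholder source signals are made up for illustration, and `generate_diffuse_noise` is the helper posted earlier in this thread.

```python
import numpy as np
import pyroomacoustics as pra

fs = 16000
room = pra.ShoeBox([6, 5, 3], fs=fs, max_order=10)

# two speakers with placeholder signals (use real speech in practice)
rng = np.random.default_rng(0)
room.add_source([2.0, 3.0, 1.5], signal=rng.standard_normal(2 * fs))
room.add_source([4.0, 1.5, 1.5], signal=rng.standard_normal(2 * fs))

# a small linear array, one column per microphone (shape (3, 4))
mic_locs = np.c_[
    [3.00, 2.5, 1.2], [3.05, 2.5, 1.2], [3.10, 2.5, 1.2], [3.15, 2.5, 1.2]
]
room.add_microphone_array(pra.MicrophoneArray(mic_locs, room.fs))

room.simulate()
signal = room.mic_array.signals.T  # (n_samples, n_mics)

# diffuse noise of matching length; generate_diffuse_noise expects (n_mics, n_dim)
noise = generate_diffuse_noise(mic_locs.T, n_samples=signal.shape[0], fs=fs)

# scale the noise to reach the target SNR, exactly as in the snippet above
target_snr = 11.5  # dB
sigma_signal = np.sqrt(np.mean(signal ** 2))
sigma_noise = np.sqrt(np.mean(noise ** 2))
mix = signal + (sigma_signal / sigma_noise) * 10 ** (-target_snr / 20.0) * noise
```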
@fakufaku Thanks for sharing a Python implementation to generate diffuse noise. I would like to try this functionality out, but I'm unable to locate the function `compute_coherence_theory` in the shared code snippet. Any chance you still have the definition or a Python implementation reference?
@DanTremonti I have a candidate implementation in a separate branch: https://github.com/LCAV/pyroomacoustics/blob/robin/new/noise/pyroomacoustics/noise.py
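In case it is useful in the meantime, here is a minimal sketch of what such a helper computes, assuming the standard spatial coherence model of a spherically isotropic (diffuse) field used in the Habets et al. paper, i.e. a sinc of the inter-microphone distance; the implementation in the linked branch may differ in its details.

```python
import numpy as np


def compute_coherence_theory(mic_array, fs, c, n_fft):
    """
    Theoretical spatial coherence of a spherically isotropic noise field.

    Returns an array of shape (n_fft // 2 + 1, n_mics, n_mics) whose entry
    (k, i, j) is sinc(2 * f_k * d_ij / c), with d_ij the distance between
    microphones i and j and f_k the frequency of bin k.
    """
    # pairwise distances between the microphones; mic_array is (n_mics, n_dim)
    dist = np.linalg.norm(mic_array[:, None, :] - mic_array[None, :, :], axis=-1)

    # center frequencies of the positive half of the spectrum
    freqs = np.arange(n_fft // 2 + 1) * fs / n_fft

    # np.sinc(x) = sin(pi * x) / (pi * x), so this is sin(2 pi f d / c) / (2 pi f d / c)
    return np.sinc(2.0 * freqs[:, None, None] * dist[None, :, :] / c)
```

`generate_diffuse_noise` then takes a square root of this matrix in each frequency bin through its eigendecomposition and uses it to mix independent per-channel noise, which imposes the desired coherence on the output.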
Thanks a bunch, @fakufaku!
@fakufaku Due to broadcasting, I am not able to run the einsum; the shapes of `shaping_matrix` and `N` are not compatible. Could you please guide me through it?
```python
diffuse_fd = np.einsum("fcd,nfd->nfc", shaping_matrix, N)
```

```
File "/home/ubuntu/.local/lib/python3.10/site-packages/numpy/core/einsumfunc.py", line 1371, in einsum
    return c_einsum(*operands, **kwargs)
ValueError: operands could not be broadcast together with remapped shapes [original->remapped]: (257,3,3)->(257,3,3) (627,257,4)->(627,257,newaxis,4)
```