DavidDiazGuerra / gpuRIR

Python library for Room Impulse Response (RIR) simulation with GPU acceleration

Two source audios from different directions? #26

Closed: YaguangGong closed this issue 2 years ago

YaguangGong commented 2 years ago

Here is my confusion. I am testing my beamforming algorithm on a dual-mic array. I intend to generate wavs in which two source audios play simultaneously from different directions. I set pos_src to the coordinates of the two audio sources and got an RIR array of shape (2, 2, rir_len). But how should I use simulateTrajectory to feed the two different audio signals into it? Or did I misunderstand the meaning of pos_src?
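For reference, a minimal setup along these lines might look as follows; the room geometry, source and mic positions, and acoustic parameters are all placeholder values:

```python
import numpy as np
import gpuRIR

# Placeholder geometry: a 3 x 4 x 2.5 m room, two static sources, two mics.
room_sz = [3.0, 4.0, 2.5]
pos_src = np.array([[0.5, 1.0, 1.5],    # first source
                    [2.5, 3.0, 1.5]])   # second source
pos_rcv = np.array([[1.45, 2.0, 1.5],   # dual-mic array, 10 cm spacing
                    [1.55, 2.0, 1.5]])

T60 = 0.4        # assumed reverberation time in seconds
fs = 16000.0
att_diff = 15.0  # switch to the diffuse model after 15 dB of attenuation
att_max = 60.0   # simulate the RIRs up to 60 dB of attenuation

beta = gpuRIR.beta_SabineEstimation(room_sz, T60)
Tdiff = gpuRIR.att2t_SabineEstimator(att_diff, T60)
Tmax = gpuRIR.att2t_SabineEstimator(att_max, T60)
nb_img = gpuRIR.t2n(Tdiff, room_sz)

RIRs = gpuRIR.simulateRIR(room_sz, beta, pos_src, pos_rcv,
                          nb_img, Tmax, fs, Tdiff=Tdiff)
print(RIRs.shape)  # (2 sources, 2 mics, rir_len)
```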

DavidDiazGuerra commented 2 years ago

simulateTrajectory is designed to work with moving sources, so each point in pos_src represents a point in the trajectory. In your case, you could call simulateTrajectory first with RIR[0,:,:] and the audio signal of the first source, call it again with RIR[1,:,:] and the signal of the second source, and then sum the results. However, it might be easier to just use a convolution operation from your preferred CPU library; I don't think gpuRIR will speed this operation up much in this scenario.
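A minimal sketch of both options, assuming signal_1 and signal_2 are hypothetical equal-length mono sources and RIRs is the (2, 2, rir_len) array described above; slicing with RIRs[0:1] keeps the 3D (trajectory points x mics x rir_len) shape that simulateTrajectory expects:

```python
import numpy as np
from scipy.signal import fftconvolve
import gpuRIR

rng = np.random.default_rng(0)
signal_1 = rng.standard_normal(16000)  # placeholder 1 s mono sources
signal_2 = rng.standard_normal(16000)
# RIRs: the (2 sources, 2 mics, rir_len) array from simulateRIR above.

# Option A: one simulateTrajectory call per static source, then sum.
# Each static source is treated as a one-point "trajectory".
mics_1 = gpuRIR.simulateTrajectory(signal_1, RIRs[0:1, :, :])
mics_2 = gpuRIR.simulateTrajectory(signal_2, RIRs[1:2, :, :])
mix = mics_1 + mics_2  # (n_samples, 2 mics) for equal-length sources

# Option B: plain CPU convolution per (source, mic) pair, then sum.
mix_cpu = np.stack(
    [fftconvolve(signal_1, RIRs[0, m]) + fftconvolve(signal_2, RIRs[1, m])
     for m in range(RIRs.shape[1])],
    axis=-1,
)
```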

YaguangGong commented 2 years ago

> simulateTrajectory is designed to work with moving sources, so each point in pos_src represents a point in the trajectory. In your case, you could call simulateTrajectory first with RIR[0,:,:] and the audio signal of the first source, call it again with RIR[1,:,:] and the signal of the second source, and then sum the results. However, it might be easier to just use a convolution operation from your preferred CPU library; I don't think gpuRIR will speed this operation up much in this scenario.

Understood. Another issue: I have a batch of mono audios to convolve with a batch of RIRs, one RIR per audio. Is it efficient to use simulateTrajectory for this? I suppose simulateTrajectory should be faster than a CPU convolution. Or do you know of any alternatives that work batch-wise on the GPU? It seems that simulateTrajectory can only process one audio at a time.

DavidDiazGuerra commented 2 years ago

> It seems that simulateTrajectory can only process one audio at a time.

Yes, simulateTrajectory is designed to work with a single audio signal. If you carefully concatenate all your mono audios so that all of them have the same duration (including some zero padding to keep their reverberation tails from overlapping), maybe you could make this work, but I've never tried it and it's not what that function was designed for.
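A sketch of that concatenation idea, equally untested; it assumes simulateTrajectory applies each RIR to the signal slice between consecutive timestamps, and rirs is a hypothetical (B, n_mics, rir_len) array with one RIR per audio:

```python
import numpy as np
import gpuRIR

fs = 16000.0
B, n_mics, rir_len = 8, 2, 4096
rng = np.random.default_rng(0)
signals = [rng.standard_normal(rng.integers(8000, 16000)) for _ in range(B)]
rirs = rng.standard_normal((B, n_mics, rir_len))  # stand-in; use simulateRIR output in practice

# Pad every audio to a common slot length, with at least rir_len trailing
# zeros so each reverberation tail dies out inside its own slot.
seg_len = max(len(s) for s in signals) + rir_len
batch_sig = np.concatenate([np.pad(s, (0, seg_len - len(s))) for s in signals])

# One "trajectory point" per audio, switching RIRs at each slot boundary.
timestamps = np.arange(B) * seg_len / fs
wet = gpuRIR.simulateTrajectory(batch_sig, rirs, timestamps=timestamps, fs=fs)

# Cut the long output back into one reverberant signal per input audio.
outs = [wet[i * seg_len:(i + 1) * seg_len] for i in range(B)]
```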

> Do you know of any alternatives that work batch-wise on the GPU?

I don't know of any, to be honest. You might be able to do it with the functions designed for 1D convolutional layers in machine learning libraries (like torch.nn.functional.conv1d in PyTorch), but those may be optimized only for much shorter filters. You could also use a library with CUDA FFT support (PyTorch has it) and implement your own convolution on top of it. I guess there must be libraries that offer a more direct approach, though I don't know of any.
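As an illustration of the FFT route, a minimal sketch in PyTorch; batch sizes and lengths are placeholders, and note that conv1d actually computes cross-correlation, so the RIRs would need time-flipping if you took the conv1d path instead:

```python
import torch

def batch_fft_convolve(signals: torch.Tensor, rirs: torch.Tensor) -> torch.Tensor:
    """Linear convolution of each signal with its own RIR via the FFT.

    signals: (batch, sig_len); rirs: (batch, rir_len); same device.
    Returns: (batch, sig_len + rir_len - 1).
    """
    n = signals.shape[-1] + rirs.shape[-1] - 1  # full convolution length
    # Zero-padding both to n makes the circular convolution linear.
    spec = torch.fft.rfft(signals, n=n) * torch.fft.rfft(rirs, n=n)
    return torch.fft.irfft(spec, n=n)

device = "cuda" if torch.cuda.is_available() else "cpu"
signals = torch.randn(32, 16000, device=device)  # placeholder batch
rirs = torch.randn(32, 4096, device=device)
wet = batch_fft_convolve(signals, rirs)  # (32, 20095)
```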

YaguangGong commented 2 years ago

> > It seems that simulateTrajectory can only process one audio at a time.
>
> Yes, simulateTrajectory is designed to work with a single audio signal. If you carefully concatenate all your mono audios so that all of them have the same duration (including some zero padding to keep their reverberation tails from overlapping), maybe you could make this work, but I've never tried it and it's not what that function was designed for.
>
> > Do you know of any alternatives that work batch-wise on the GPU?
>
> I don't know of any, to be honest. You might be able to do it with the functions designed for 1D convolutional layers in machine learning libraries (like torch.nn.functional.conv1d in PyTorch), but those may be optimized only for much shorter filters. You could also use a library with CUDA FFT support (PyTorch has it) and implement your own convolution on top of it. I guess there must be libraries that offer a more direct approach, though I don't know of any.

Got it. Thanks a lot!