Closed: mirbagheri closed this issue 4 years ago.
I am not sure, but it could be due to the maximum number of threads that CUDA can launch at once. The low memory usage and utilization rate are due to the fact that you are simulating really short RIRs. You said that you wanted to simulate "fully reflective walls", but with beta = [0.] * 6 and Tmax = 0.0066 you are simulating an anechoic room.
Thanks for the reply. I understand the thread limit, but is there a better way to exploit GPU parallelization in scenarios like mine?
You are right! I meant the fully absorptive case.
gpuRIR is designed to have optimum performance when you have long reverberations, since these are usually the bottleneck of most simulations. In order to better exploit the GPU in scenarios like yours, it might be good to run several simulations in parallel. CUDA allows you to launch several kernels without waiting for them to finish, but modifying gpuRIR to do so would mean major changes in its architecture. Therefore, I guess it will be easier to use CPU parallelization in your Python script to run several gpuRIR.simulateRIR calls at once. I have never done that, but I think gpuRIR should be thread-safe as long as all the threads use the same LUT and MixedPrecision modes.
Just out of curiosity, how long does it take to run 2000 anechoic simulations?
Thanks for the tip! I tested the CPU parallelization as below and it's working.

```python
import os
from multiprocessing import Pool

import numpy as np

def func(cnt):
    import gpuRIR  # must be imported inside the worker process
    gpuRIR.activateMixedPrecision(False)
    gpuRIR.activateLUT(True)
    return gpuRIR.simulateRIR([4, 4, 4], [0.] * 6, np.array([[3, 2, 2]]),
                              2 * np.ones((2000, 3)),
                              [2] * 3, 0.0066, 44100, mic_pattern='omni')

if __name__ == '__main__':
    p = Pool(os.cpu_count())
    results = p.map(func, range(1000))
```
Note that gpuRIR must be imported inside func; otherwise you get an "Incorrect Initialization" error. This took 3.49 s ± 14.3 ms to finish with 28 Intel Xeon CPU cores.
What exactly causes the GPUassert: initialization error /tmp/pip-req-build-i673lsff/src/gpuRIR_cuda.cu 764?
I am trying to use gpuRIR in a gym.vector.AsyncVectorEnv (which relies on Python's native multiprocessing library). gpuRIR should be instantiated inside each process, right?
I'm afraid I've never worked with gpuRIR (or anything using CUDA) in multiprocessing scenarios, so I cannot offer too much help with this. From a quick Google search, it seems you can only initialize the CUDA context once in your program, and many answers in PyTorch issues and questions point here as the solution: https://pytorch.org/docs/master/notes/multiprocessing.html#cuda-in-multiprocessing
Maybe you could use the context argument of the gym.vector.AsyncVectorEnv constructor to start your subprocesses in spawn or forkserver mode?
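As a minimal stdlib-only sketch of that idea (no gpuRIR or gym involved; the worker here is a toy function, not anything from either library), passing a spawn context to a process pool looks like this:

```python
import multiprocessing as mp

def worker(i):
    # Each spawned worker starts a fresh Python interpreter, so a
    # library that sets up a CUDA context can initialize it cleanly
    # instead of inheriting a forked (and broken) one.
    return i * i

if __name__ == '__main__':
    ctx = mp.get_context('spawn')  # 'forkserver' is the other fork-free option
    with ctx.Pool(2) as pool:
        print(pool.map(worker, range(4)))
```

gym's AsyncVectorEnv accepts the same kind of context string, which is what makes the spawn/forkserver suggestion above applicable.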
Thank you so much for the quick answer.
I had to change the code a bit for gym's AsyncVectorEnv, but it indeed works with context=torch.multiprocessing.get_context('spawn').
Have a great day!
I'm trying to simulate RIRs in a 4 x 4 x 4 room with fully reflective walls, with a single receiver at the center of the room and 2600 source positions randomly chosen on the unit sphere centered at [2, 2, 2].
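A minimal sketch of how such source positions can be generated (illustrative only, with an assumed seed; not the actual script):

```python
import numpy as np

rng = np.random.default_rng(0)
n_src = 2600
center = np.array([2.0, 2.0, 2.0])

# Directions drawn uniformly on the unit sphere: normalize
# standard Gaussian vectors, then shift to the room center.
v = rng.standard_normal((n_src, 3))
pos_src = center + v / np.linalg.norm(v, axis=1, keepdims=True)
```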
gpuRIR was installed using pip on a machine with 128 GB of CPU memory and an Nvidia Titan Xp GPU with 12 GB of memory, running Ubuntu 18.
These are the parameters I use to run gpuRIR.simulateRIR: beta = [0.] * 6, nb_img = [2] * 3, Tmax = 0.0066, Tdiff = None, mic_pattern = 'omni'.
The gpuRIR.simulateRIR function sometimes fails with the error: GPUassert: invalid argument /tmp/pip-req-build-93zmrk8r/src/gpuRIR_cuda.cu 795
The error goes away if I reduce the pos_src chunk size to 2000 x 3, even though GPU memory usage never exceeds 200 MB and the utilization rate stays below 4-5%. Any idea what might be causing this issue?
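Since reducing the chunk size works around the error, here is a sketch of calling the simulator on slices of the positions and stacking the results. The `run_in_chunks` wrapper and the `simulate` callable are hypothetical names (in practice `simulate` would wrap gpuRIR.simulateRIR with the other arguments fixed); a dummy stand-in is used so the sketch runs without a GPU.

```python
import numpy as np

def run_in_chunks(simulate, positions, chunk_size=2000):
    # Split `positions` into slices of at most `chunk_size` rows,
    # run `simulate` on each slice, and stack the results.
    n_chunks = -(-len(positions) // chunk_size)  # ceiling division
    chunks = np.array_split(positions, n_chunks)
    return np.concatenate([simulate(c) for c in chunks])

# Toy stand-in so the sketch runs without a GPU:
dummy = lambda pos: pos.sum(axis=1, keepdims=True)
out = run_in_chunks(dummy, np.ones((2600, 3)))
print(out.shape)  # (2600, 1)
```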