Impulse responses of gpuRIR very different compared with pyroomacoustics with similar settings

fra1993 commented 2 years ago

Hi,

While trying to obtain similar RIRs with gpuRIR and pyroomacoustics, I always get very different filters for similar acoustic settings: the filters generated with gpuRIR look about one order of magnitude smaller. Can you give me any hints?

Thank you very much

DavidDiazGuerra commented 2 years ago

Hi Fra,

Could you provide a minimal example where this behavior can be observed and, if possible, a plot of the RIRs obtained with it?

Best regards, David

fra1993 commented 2 years ago

Here is the filter generated with gpuRIR: [image: gpuRIR]

Here is the filter from pyroomacoustics: [image: pyroom]

The room dimensions were set to: [4.585, 6.903, 3.144]

To generate the absorption coefficients and the maximum number of reflections, I used the Sabine formula (available in both packages) with T60=0.6 in both cases. Here I could already see something weird: from pyroom I get absorption_coeff=0.19714386610054616, whereas the gpuRIR implementation of the Sabine formula gives a value of 0.8961001. On the other hand, for the max_reflection_order I get 79 from pyroom and [23, 15, 33] from gpuRIR (71 in total).

DavidDiazGuerra commented 2 years ago

Thanks for reporting this. There might be some differences between gpuRIR and pyroomacoustics, since gpuRIR uses negative reflection coefficients and (optionally) a random late reverberation model, but this does indeed seem to be too large a difference. I'll take a look at this during the week.

Best regards, David

fra1993 commented 2 years ago

Yes, I noticed that in the paper. In fact, I removed the late reverberation part by not setting the Tdiff argument in the simulateRIR function.

DavidDiazGuerra commented 2 years ago

The implementations of the Sabine formula in both libraries are equivalent; the difference is due to gpuRIR returning the reflection coefficients (beta) while pyroomacoustics returns the energy absorption coefficients (alpha). You can see that alpha = 1 - beta². Be aware that pyroomacoustics used to work with amplitude coefficients instead of energy coefficients and, if you're using energy coefficients, you must use the parameter materials instead of absorption (which is deprecated).
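As a rough check of the relation above, here is a minimal sketch that evaluates the textbook Sabine formula for the room reported earlier in the thread and converts between the two conventions. The function name is hypothetical, and the exact constants used inside gpuRIR and pyroomacoustics are not confirmed here, so only approximate agreement with the reported values should be expected.

```python
import math

def sabine_energy_absorption(room_dim, t60, c=343.0):
    """Textbook Sabine estimate of a uniform energy absorption coefficient.

    Hypothetical helper for illustration; not the code of either library.
    """
    lx, ly, lz = room_dim
    volume = lx * ly * lz
    surface = 2.0 * (lx * ly + lx * lz + ly * lz)
    return 24.0 * math.log(10.0) * volume / (c * surface * t60)

room_dim = [4.585, 6.903, 3.144]  # room size reported in the thread [m]
alpha = sabine_energy_absorption(room_dim, t60=0.6)
beta = math.sqrt(1.0 - alpha)  # reflection coefficient implied by alpha

alpha_reported = 0.19714386610054616  # value reported from pyroomacoustics
beta_reported = 0.8961001             # value reported from gpuRIR
alpha_from_beta = 1.0 - beta_reported ** 2

print("alpha (Sabine sketch):", alpha)
print("alpha from reported beta:", alpha_from_beta)
print("beta (Sabine sketch):", beta)
```

Both reported numbers land within about 1e-3 of each other once they are expressed in the same convention, which is consistent with the two libraries describing the same walls.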

About the RIRs from gpuRIR having less amplitude than the ones from pyroomacoustics: the amplitude of the direct path is expected to be A_dp=1/(4*pi*d). I don't know the distance between the source and the receiver in your simulation, but in order to get an amplitude of about A_dp=0.5 you would need a distance of just d=0.16 meters, which is quite short and, judging from the shape of your RIRs, doesn't seem to be your case. I guess the higher amplitude in pyroomacoustics could be due to the low-frequency artifact generated by using positive reflection coefficients, but I can't tell for sure.
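The arithmetic above can be reproduced with a few lines. This is only a sketch of the free-field 1/(4*pi*d) law quoted in the comment; the function names are made up for illustration.

```python
import math

def direct_path_amplitude(d):
    """Free-field amplitude of the direct path at distance d [m]."""
    return 1.0 / (4.0 * math.pi * d)

def distance_for_amplitude(a):
    """Distance [m] at which the direct path has amplitude a."""
    return 1.0 / (4.0 * math.pi * a)

# Distance needed for a direct-path amplitude of 0.5 (about 0.16 m).
print(distance_for_amplitude(0.5))

# Amplitudes for a few more typical source-receiver distances.
for d in (0.5, 1.0, 2.0):
    print(d, direct_path_amplitude(d))
```

At distances of a metre or more, the direct path is already below 0.1, which matches the order of magnitude seen in the gpuRIR output.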

fra1993 commented 2 years ago

A_dp=1/(4*pi*d) should hold if you use delayed impulses. However, I guess that if you use the sinc + Hanning window, other images also contribute to the direct path, don't they?

However, below you can see a plot comparing RIRs from pyroom, gpuRIR and field-measured RIRs with similar acoustic parameters from the BUTReverb dataset. At the top of each plot I added the distance between the source and the recording microphone.

[image: RIRs]

PS: about pyroom, I see your point and I know about the deprecation, but I wanted to compare the parameters calculated with the two tools, and the coefficients are in reasonable agreement.

DavidDiazGuerra commented 2 years ago

> A_dp=1/(4*pi*d) should hold if you use delayed impulses. However, I guess that if you use the sinc + Hanning window, other images also contribute to the direct path, don't they?

Yeah, that's right. That formula is employed for all the image sources (just using the distance of the image source instead of that of the original source) and, as you can see, it would generate amplitudes of the same order of magnitude as the ones obtained by gpuRIR. As you say, the use of a windowed sinc function can affect the amplitudes of the different peaks (though it would be even worse if, instead of using sinc functions, we just rounded the fractional delays), since the side lobes of each peak overlap with the rest of the peaks. Using positive reflection coefficients can make this effect stronger, since all the peaks are positive, but I'm not sure whether this alone can explain such a big difference.
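To illustrate the side-lobe effect described above, here is a minimal sketch that places one image-source contribution at a fractional delay using a Hann-windowed sinc. The window length and the helper names are assumptions chosen for illustration; this is not gpuRIR's actual kernel.

```python
import math

def sinc(x):
    """Normalized sinc, sin(pi*x)/(pi*x), with sinc(0) = 1."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

def add_windowed_sinc(rir, delay, amplitude, half_width=16):
    """Add one contribution at a fractional `delay` (in samples).

    `half_width` is an assumed window half-length, not a library default.
    """
    center = int(round(delay))
    start = max(center - half_width, 0)
    stop = min(center + half_width + 1, len(rir))
    for n in range(start, stop):
        t = n - delay  # offset from the true arrival time
        if abs(t) > half_width:
            continue
        window = 0.5 * (1.0 + math.cos(math.pi * t / half_width))  # Hann window
        rir[n] += amplitude * window * sinc(t)
    return rir

# Integer delay: the sampled peak keeps the full amplitude.
rir_integer = add_windowed_sinc([0.0] * 101, delay=50.0, amplitude=1.0)
# Half-sample delay: energy spreads over neighbours and side lobes appear.
rir_fractional = add_windowed_sinc([0.0] * 101, delay=50.5, amplitude=1.0)

print(max(rir_integer), max(rir_fractional), min(rir_fractional))
```

With a half-sample delay the largest sample is noticeably smaller than the nominal amplitude and the neighbouring samples include negative side lobes, so overlapping contributions from several image sources can raise or lower the visible peaks.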

> However, below you can see a plot comparing RIRs from pyroom, gpuRIR and field-measured RIRs with similar acoustic parameters from the BUTReverb dataset. At the top of each plot I added the distance between the source and the recording microphone.

I have to admit I have never compared the results of gpuRIR with field-measured RIRs. To validate gpuRIR, I compared its results with those from Lehmann's Matlab library. I've run some Matlab simulations again to check whether any of the latest gpuRIR updates might have introduced a bug, but everything looks fine.

I don't know how the amplitudes of the BUTReverb dataset have been normalized, and I guess that's not a trivial task. Usually you only care about the position and the relative amplitudes of each peak, while the absolute amplitude doesn't matter (you usually have to normalize the results after applying the RIRs anyway), so maybe they just ensured that the RIRs were never saturated and that they were coherent across the dataset.

> PS: about pyroom, I see your point and I know about the deprecation, but I wanted to compare the parameters calculated with the two tools, and the coefficients are in reasonable agreement.

If you're using energy absorption coefficients, like the ones provided by pyroomacoustics.inverse_sabine or 1-beta**2, where beta is the result from gpuRIR.beta_SabineEstimation, you must pass them to pyroomacoustics using materials=pra.Material(e_absorption) when creating the room object. However, if you want to use amplitude absorption coefficients, like 1-beta, you must pass them using absorption=a_absorption. Mixing them up will lead to RIRs with a T60 different from the expected one and may explain why, in your last figures, the RIRs from pyroomacoustics seem to have a much longer T60.
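The effect of mixing the two conventions can be estimated with the textbook Sabine formula. The sketch below is an illustration under stated assumptions: it uses the room and the beta value reported earlier in the thread, the helper name is hypothetical, and it assumes that an amplitude absorption value a corresponds to a reflection coefficient 1 - a, as described above. It does not call pyroomacoustics.

```python
import math

def sabine_t60(room_dim, e_absorption, c=343.0):
    """Textbook Sabine T60 for a uniform energy absorption coefficient."""
    lx, ly, lz = room_dim
    volume = lx * ly * lz
    surface = 2.0 * (lx * ly + lx * lz + ly * lz)
    return 24.0 * math.log(10.0) * volume / (c * surface * e_absorption)

room_dim = [4.585, 6.903, 3.144]  # room size reported in the thread [m]
beta = 0.8961001                  # reflection coefficient reported from gpuRIR

e_absorption = 1.0 - beta ** 2    # energy absorption coefficient
a_absorption = 1.0 - beta         # amplitude absorption coefficient

# Correct usage: the energy coefficient is treated as an energy coefficient.
t60_correct = sabine_t60(room_dim, e_absorption)

# Mix-up 1: the amplitude coefficient is treated as an energy coefficient.
t60_amp_as_energy = sabine_t60(room_dim, a_absorption)

# Mix-up 2: the energy coefficient is treated as an amplitude coefficient,
# so the effective reflection coefficient becomes 1 - e_absorption.
effective_e_absorption = 1.0 - (1.0 - e_absorption) ** 2
t60_energy_as_amp = sabine_t60(room_dim, effective_e_absorption)

print(t60_correct, t60_amp_as_energy, t60_energy_as_amp)
```

Under these assumptions, one mix-up lengthens the predicted T60 and the other shortens it, so which way a simulation drifts depends on which coefficient was passed to which parameter.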

To sum up: I think that the results obtained with gpuRIR are what can be expected from the Image Source Method (at least when using negative reflection coefficients) and there's nothing wrong with them. As for pyroomacoustics, I don't know whether its higher amplitudes can be explained just by the use of positive reflection coefficients or whether it does some extra processing that explains them but, in any case, I think it would be better to ask its developers.

Best regards, David

ahmadikalkhorani commented 5 months ago

Hi,

I am also observing a difference between gpuRIR and the other two libraries (rirgen and rir_generator). Please see below and let me know if I am missing anything:

import rirgen 
import gpuRIR 
import rir_generator 

import numpy as np
import matplotlib.pyplot as plt

T60 = 0.413  # seconds
room_dim = [7.875, 5.839, 3.088]  # meters

fs = 8000 # Hz

room_sz = room_dim  # Size of the room [m]
pos_src = np.array([
    [3.810, 1.919, 1.423],
    ]) # Positions of the sources [m]
pos_rcv = np.array([
    [3.974, 2.979, 1.418],
    ])   # Position of the receivers [m]

att_diff = 15.0 # Attenuation when start using the diffuse reverberation model [dB]
att_max = 60.0 # Attenuation at the end of the simulation [dB]

beta = gpuRIR.beta_SabineEstimation(room_sz, T60) # Reflection coefficients
Tdiff= gpuRIR.att2t_SabineEstimator(att_diff, T60) # Time to start the diffuse reverberation model [s]
Tmax = gpuRIR.att2t_SabineEstimator(att_max, T60)    # Time to stop the simulation [s]
nb_img = gpuRIR.t2n( Tdiff, room_sz )   # Number of image sources in each dimension

gpuRIR_RIRs = gpuRIR.simulateRIR(room_sz, beta, pos_src, pos_rcv, nb_img, Tmax, fs, Tdiff = Tdiff, c = 343)

rir_generator_RIRs = rir_generator.generate(
            c=343,
            fs=fs,
            r=np.ascontiguousarray(pos_rcv),
            s=np.ascontiguousarray(pos_src[0]),
            L=np.ascontiguousarray(room_dim),
            reverberation_time=T60,
            mtype=rir_generator.mtype.omnidirectional,
        )

rirgen_RIR = rirgen.generate_rir(
            room_measures=room_dim,
            source_position=pos_src[0],
            receiver_positions=np.ascontiguousarray(pos_rcv),
            reverb_time=T60,
            sound_velocity=343,
            fs=fs,
        )

plt.plot(gpuRIR_RIRs[0, 0, :], label = "gpuRIR", color = "k")
plt.plot(rir_generator_RIRs[:, 0], label = "rir_generator")
plt.plot(rirgen_RIR[0], label = "rirgen")

plt.xlim(0, 200)
plt.xlabel("Sample index")
plt.legend()
plt.show()

DavidDiazGuerra commented 5 months ago

Hello,

As explained in the paper, gpuRIR uses negative reflection coefficients, so you can expect about half of the peaks to be negative. Apart from that, you can see how the timing and the absolute value of the peaks are the same for all the libraries.
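The sign pattern can be sketched with a simplified model in which every wall has the same reflection coefficient, so the amplitude of an image source depends only on its number of reflections and its distance. The function name and the distances below are hypothetical, and this is not the code of any of the libraries compared here.

```python
import math

def image_amplitude(beta, n_reflections, distance):
    """Amplitude of an image source after n_reflections wall reflections.

    Simplification: all walls share the same reflection coefficient beta.
    """
    return beta ** n_reflections / (4.0 * math.pi * distance)

# Hypothetical image-source distances [m], one per reflection order 0..4.
distances = [1.0, 2.5, 3.2, 4.8, 6.1]

negative = [image_amplitude(-0.9, n, d) for n, d in enumerate(distances)]
positive = [image_amplitude(0.9, n, d) for n, d in enumerate(distances)]

for n, (neg, pos) in enumerate(zip(negative, positive)):
    print(n, neg, pos)
```

With a negative coefficient, the odd reflection orders flip sign while the absolute values stay identical to the positive-coefficient case, which is why the peaks line up in time and magnitude but differ in polarity.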

In the picture, you can also observe how rirgen has some artifacts after some peaks (e.g. from 25 to 50 ms after the first peak) that are not generated by gpuRIR.

I think those are the only differences between libraries in the picture.

Best, David

DavidDiazGuerra / gpuRIR

Impulse responses of gpuRIR very different compared with pyroomacoustics with similar settings #25