Hi @suhasbn , could you please provide the code that you are trying to debug here ?
Hi Robin, here's the snippet for your reference.
import pandas as pd
from wham_room import WhamRoom  # from the WHAMR! scripts (wham_room.py)

reverb_param_df = pd.read_csv('trial.csv', engine='python')
scaling_npz = pd.read_csv('trial.csv', engine='python')
utt_ids = scaling_npz['mixture_ID']
for i_utt, output_name in enumerate(utt_ids):
    # Select the row holding the room geometry for this mixture
    utt_row = reverb_param_df[reverb_param_df['mixture_ID'] == output_name]
    room = WhamRoom([utt_row['room_x'].iloc[0], utt_row['room_y'].iloc[0], utt_row['room_z'].iloc[0]],
                    [[utt_row['micL_x'].iloc[0], utt_row['micL_y'].iloc[0], utt_row['mic_z'].iloc[0]],
                     [utt_row['micR_x'].iloc[0], utt_row['micR_y'].iloc[0], utt_row['mic_z'].iloc[0]]],
                    [utt_row['s1_x'].iloc[0], utt_row['s1_y'].iloc[0], utt_row['s1_z'].iloc[0]],
                    [utt_row['s2_x'].iloc[0], utt_row['s2_y'].iloc[0], utt_row['s2_z'].iloc[0]],
                    utt_row['T60'].iloc[0])
    room.generate_rirs()
Hi @suhasbn, the original error seems to be triggered by line 53 here. This line checks that the minimum time of arrival at the microphone fits in the array when building the room impulse response.
If you still need help, please provide the code for WhamRoom, as well as any additional piece of code necessary to reproduce the error.
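For readers hitting the same AssertionError, the kind of bounds check described above can be sketched in plain Python. This is an illustration of the idea only, not the library's code: the helper name `arrival_fits`, the default filter length `fdl=81`, the speed of sound of 343 m/s, the buffer length of 8000 samples, and the 40-sample shift are all assumptions made for this example. The positions are taken from the minimal example posted in this thread.

```python
import math

def arrival_fits(times, fs, rir_len, fdl=81):
    """Return True if every arrival, padded by half the fractional-delay
    filter on each side, lands inside a buffer of rir_len samples.

    Hypothetical helper: fdl=81 is an assumed filter length.
    """
    half = (fdl - 1) // 2  # samples needed before and after each arrival
    first = math.floor(fs * min(times)) - half
    last = math.ceil(fs * max(times)) + half
    return first >= 0 and last < rir_len

# Direct path from source s1 to the first microphone in the thread's example,
# with an assumed speed of sound of 343 m/s.
s1 = [4.039940301349218, 2.4757959306413455, 1.0066250383911688]
mic = [4.355616339273031, 3.2370661465340023, 0.9741057742912328]
fs = 16000
t_direct = math.dist(s1, mic) / 343.0

# With no initial delay, the padded filter may start before sample 0.
print(arrival_fits([t_direct], fs, rir_len=8000))

# Shifting all arrivals by half the filter length (an assumed offset,
# playing the role of a nonzero t0) leaves room for the padding.
t0 = 40 / fs
print(arrival_fits([t_direct + t0], fs, rir_len=8000))
```

Under these assumptions, a source less than about 0.86 m from a microphone arrives earlier than the half-filter padding allows, which is consistent with the report later in the thread that a zero `t0` triggers the error.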
Hi, I also encountered the same error when trying to generate the WHAMR! data with pyroomacoustics 0.4.1.
You can find the code for WhamRoom in wham_room.py, downloaded from here: https://storage.googleapis.com/whisper-public/whamr_scripts.tar.gz
FYI, after I switched to pyroomacoustics 0.2.0, the error no longer appears.
@fakufaku I'm one of the co-authors of WHAMR. We've been distributing our scripts that depend on pyroomacoustics for about a year, but it appears that something in v0.4.1 broke our simulation code. I've extracted a minimal example from our code that reproduces the error:
import numpy as np
import pyroomacoustics as pra
from pyroomacoustics.parameters import constants
room_dim = [8.590185968518231, 6.461586772520692, 3.198773872200343]
mics = [[4.355616339273031, 3.2370661465340023, 0.9741057742912328],
[4.499660767093616, 3.278913521766396, 0.9741057742912328]]
s1_pos = [4.039940301349218, 2.4757959306413455, 1.0066250383911688]
s2_pos = [5.385401336245542, 2.444991127822015, 1.617540956995779]
T60 = 0.44078723695817584
fs = 16000
t0 = 0.0
sigma2_awgn = None
max_rir_len = np.ceil(T60*fs).astype(int)
volume = room_dim[0] * room_dim[1] * room_dim[2]
surface_area = 2*(room_dim[0] * room_dim[1] + room_dim[0] * room_dim[2] + room_dim[1] * room_dim[2])
absorption = 24 * volume * np.log(10.0) / (constants.get('c') * surface_area * T60)
# minimum max order to guarantee complete filter of length T60
max_order = np.ceil(T60 * constants.get('c') / min(room_dim)).astype(int)
room = pra.room.ShoeBox(room_dim, fs=fs, t0=t0, absorption=absorption,
                        max_order=max_order, sigma2_awgn=sigma2_awgn,
                        sources=None, mics=None)
room.add_source(s1_pos)
room.add_source(s2_pos)
room.add_microphone_array(pra.MicrophoneArray(np.array(mics).T, fs))
rir = []
room.visibility = None
room.image_source_model()
for m, mic in enumerate(room.mic_array.R.T):
    h = []
    for s, source in enumerate(room.sources):
        h.append(source.get_rir(mic, room.visibility[s][m], room.fs, room.t0)[:max_rir_len])
    rir.append(h)
It appears that before v0.4.1, room.t0 was set to a nonzero value when constructed (here), but this is no longer happening.
@gwichern Thanks for the report! I only just realized that pyroomacoustics was used to generate WHAMR! I understand this issue better now!
A lot of things have changed in 0.4.0, sorry for the trouble! In particular:
- t0 and sigma2_awgn have been deprecated (t0 can be fixed directly by changing the source signal, and the SNR can be controlled via the simulate method).
- The RIR is no longer computed in SoundSource.get_rir, but in the Room.compute_rir method. It seems that the method in SoundSource.get_rir is not working properly anymore. I will explicitly deprecate it in the next release.
Here is the code example fixed for 0.4.1:
import numpy as np
import pyroomacoustics as pra
from pyroomacoustics.parameters import constants
room_dim = [8.590185968518231, 6.461586772520692, 3.198773872200343]
mics = [[4.355616339273031, 3.2370661465340023, 0.9741057742912328],
[4.499660767093616, 3.278913521766396, 0.9741057742912328]]
s1_pos = [4.039940301349218, 2.4757959306413455, 1.0066250383911688]
s2_pos = [5.385401336245542, 2.444991127822015, 1.617540956995779]
T60 = 0.44078723695817584
fs = 16000
t0 = 0.0
sigma2_awgn = None
max_rir_len = np.ceil(T60*fs).astype(int)
volume = room_dim[0] * room_dim[1] * room_dim[2]
surface_area = 2*(room_dim[0] * room_dim[1] + room_dim[0] * room_dim[2] + room_dim[1] * room_dim[2])
absorption = 24 * volume * np.log(10.0) / (constants.get('c') * surface_area * T60)
# minimum max order to guarantee complete filter of length T60
max_order = np.ceil(T60 * constants.get('c') / min(room_dim)).astype(int)
room = pra.room.ShoeBox(room_dim, fs=fs, absorption=absorption, max_order=max_order)
room.add_source(s1_pos)
room.add_source(s2_pos)
room.add_microphone_array(pra.MicrophoneArray(np.array(mics).T, fs))
room.compute_rir()
rir = room.rir
For the dataset, I suppose that there are two solutions:
1) Add a requirements.txt with the exact version number to generate the dataset
2) Change the code following the example above (I would still add the requirements.txt file in that case)
I know this is not ideal and apologize for that.
@fakufaku Thanks for the reply. I noticed different lengths for the RIRs returned by your code in v0.4.1 (~19,000 samples and different for each mic-source pair) and my code in v0.3.1 (7053 samples for each mic-source pair). That's quite a large difference. I guess any RIRs generated by recent versions (>0.4) of pyroomacoustics will be different than those from earlier versions?
I agree that we should use a requirements.txt. To keep the dataset consistent with what has already been published we will stick with v0.3.1 for now.
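As a rough illustration of the pinning idea, a requirements.txt for the dataset scripts could contain the single line `pyroomacoustics==0.3.1`. A script could also refuse to run against a different release. The sketch below assumes Python 3.8 or later for `importlib.metadata`; the names `PINNED`, `matches_pin`, and `check_environment` are hypothetical and not part of the WHAMR! scripts.

```python
PINNED = "0.3.1"  # version named in this thread for the published dataset

def matches_pin(installed, pinned=PINNED):
    """Return True if the installed version string equals the pinned one."""
    return installed.strip() == pinned.strip()

def check_environment():
    """Raise if the installed pyroomacoustics differs from the pinned release."""
    # Imported here so matches_pin can be used without the package installed.
    from importlib.metadata import version

    installed = version("pyroomacoustics")
    if not matches_pin(installed):
        raise RuntimeError(
            f"pyroomacoustics {installed} found, but the dataset scripts "
            f"expect {PINNED}"
        )
```

Calling `check_environment()` at the top of the generation script would stop early instead of producing RIRs that differ from the published data.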
@gwichern I am guilty of not having checked the output consistency between the two versions, due to some large changes in the simulator (the addition of ray tracing). However, I have tried to keep consistency, in that the output of the shoebox generator should be the same for most parameter choices. Obviously, the influence of t0 and sigma2_awgn, which have disappeared, cannot be replicated. But in your setup, this should not be important.
The length (as in number of samples) of the RIR may be different, but the actual values should be the same (the algorithm is the same), provided ray tracing is not used and absorption and max_order are set to the same values.
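One way to test that claim is to compare two RIRs over the samples they have in common. The sketch below is a minimal, hypothetical helper (`same_over_common_length` and its tolerances are assumptions). It also assumes that both RIRs start at the same sample, which this thread does not confirm; if one version inserts an initial delay, the longer RIR would need to be shifted first.

```python
import math

def same_over_common_length(rir_a, rir_b, rel_tol=1e-6, abs_tol=1e-9):
    """Compare two RIRs sample by sample over their shared prefix.

    Assumes both sequences are time-aligned at index 0.
    """
    n = min(len(rir_a), len(rir_b))
    return all(
        math.isclose(a, b, rel_tol=rel_tol, abs_tol=abs_tol)
        for a, b in zip(rir_a[:n], rir_b[:n])
    )
```

The helper accepts any indexable sequence of floats, so it could be applied to one mic-source pair from each version, for example the `rir[m][s]` entries built in the examples above.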
For the dataset, I think it is probably indeed better to fix the requirements file to the version used at the time of the generation of the dataset.
Closing this due to lack of activity.
Hi, I'm having an issue with the generate_rirs() function. It throws the following error:
generate_rirs(room)
Traceback (most recent call last):
  File "", line 1, in
    room.generate_rirs()
  File "", line 6, in generate_rirs
    self.compute_rir()
  File "C:\Users\abc\Documents\wham_room.py", line 44, in compute_rir
    h.append(source.get_rir(mic, self.visibility[s][m], self.fs, self.t0)[:self.max_rir_len])
  File "C:\Users\abc\Anaconda3\lib\site-packages\pyroomacoustics\soundsource.py", line 254, in get_rir
    fast_rir_builder(ir, time, alpha, visibility.astype(np.int32), Fs, fdl)
  File "pyroomacoustics\build_rir.pyx", line 53, in pyroomacoustics.build_rir.fast_rir_builder
AssertionError
Any help would be appreciated!