Open someone4194 opened 2 years ago
@someone4194 These csv files were in git and you can still find them in the history (see commits prior to 5b0e929d09abc1b146716e5a138898e3d7a4177b). I guess we just have to update the Python code to not use the data. @hdubey will fix it and let you know
@motus Hi motus, thanks for your reply. But the csv files you mentioned only cover the 16K simulated IR data used in DNS3. Their IR information, such as T60, cannot be used for the 48K simulated/real-recorded IR data in the current DNS4. Could you please release the corresponding csv files for the RIRs in commit https://github.com/microsoft/DNS-Challenge/commit/cef3b749ba4e6f94423914fc72e1ef66ec7c6fad ?
@motus I found the csv files, but there are some problems with using them directly (because they are for the DNS3 dataset).
Additionally, as @zhaoyj1122 mentioned, there is no csv file for real-recorded/simulated 48K IR data ... Is there any plan to provide new csv files for DNS4 dataset or revised noisyspeech synthesis script?
If I am not mistaken, the SLR26 and SLR28 RIR datasets are originally 16kHz, so I believe the provided data was resampled from them.
For some reason, the RIR data in DNS Challenge v4 seems to be an upsampled subset of the DNS Challenge v3 RIRs, so deleting the rows that correspond only to the previous DNS and replacing the `16k` string with `48k` should make the csv usable again. Another question I had: what tool does the Challenge team use to estimate the RT60 of these files? At first glance, what I get using `pyroomacoustics` differs a lot from what is annotated in the `.csv` file, and I also noticed that some entries have very large values (for example, `datasets/impulse_responses/SLR26/simulated_rirs_16k/largeroom/Room001/Room001-00014.wav` has `774` in the `T60_WB` column).
Thanks!
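In case it helps anyone trying this workaround, here is a rough sketch of the csv adaptation described above, using only the standard library. The path column name `wavfile`, the second file name, and the T60 values are assumptions made up for illustration; only the `T60_WB` column and the first path come from this thread, so check them against the actual csv from the git history before relying on it.

```python
import csv
import io


def adapt_rir_table(rows, path_column="wavfile", old="16k", new="48k", keep=None):
    """Rewrite RIR paths from `old` to `new` and optionally drop unknown files.

    `path_column` is a hypothetical column name; adjust it to the real header.
    `keep` is an optional set of paths that actually exist in the new dataset.
    """
    adapted = []
    for row in rows:
        updated = dict(row)
        updated[path_column] = row[path_column].replace(old, new)
        # Drop rows whose file is not part of the (smaller) new dataset.
        if keep is not None and updated[path_column] not in keep:
            continue
        adapted.append(updated)
    return adapted


# Tiny in-memory table standing in for the old RIR csv (values are made up).
sample = io.StringIO(
    "wavfile,T60_WB\n"
    "SLR26/simulated_rirs_16k/largeroom/Room001/Room001-00014.wav,0.9\n"
    "SLR26/simulated_rirs_16k/smallroom/Room002/Room002-00001.wav,0.3\n"
)
rows = list(csv.DictReader(sample))

# Pretend only the first file survived into the 48k release.
available = {"SLR26/simulated_rirs_48k/largeroom/Room001/Room001-00014.wav"}
result = adapt_rir_table(rows, keep=available)
```

In practice `keep` could be built by listing the 48k RIR directory, and the result written back out with `csv.DictWriter`. This does not address whether the annotated T60 values are themselves reliable.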
Question related to the comment above from @eagomez2 : I'm curious what method was used to resample these SLR26/28 RIR datasets from 16kHz to 48kHz. Thanks for any information!
@someone4194 I guess we just have to update the Python code to not use the data. @hdubey will fix it and let you know
Is it fixed now?
Also, if I am not mistaken, the 16kHz RIRs cannot simply be upsampled and then used for fullband, since an upsampled RIR carries no information above the original 8kHz bandwidth. Is that right?
@someone4194 These csv files were in git and you can still find them in the history (see commits prior to 5b0e929). I guess we just have to update the Python code to not use the data. @hdubey will fix it and let you know
Any update on this? I am not clear on how to get around this issue.
In noisyspeech_synthesizer_singleprocess.py and pdns_noisyspeech_synthesizer_singleprocess.py, some csv files (e.g. 'rir_table_csv', 'clean_speech_t60_csv' and 'spkid_csv') are essential for generating the training datasets, but I can't find them in the repository. How can I download them?
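A quick way to see which of these csv entries are actually missing before running the synthesizer is sketched below. It assumes the scripts read an INI-style config whose keys end in `_csv`; the section name `noisy_speech` and the file names in the sample are placeholders, not the real values from the repository.

```python
import configparser
import os


def missing_csv_files(cfg_text, base_dir="."):
    """Return {key: path} for every `*_csv` config entry whose file is absent."""
    # Interpolation is disabled so that '%' characters in paths are left alone.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(cfg_text)
    missing = {}
    for section in parser.sections():
        for key, value in parser.items(section):
            if key.endswith("_csv") and not os.path.isfile(os.path.join(base_dir, value)):
                missing[key] = value
    return missing


# Placeholder config text; the section name and paths are hypothetical.
sample_cfg = """
[noisy_speech]
sampling_rate = 48000
rir_table_csv = hypothetical_dir/hypothetical_rir_table.csv
clean_speech_t60_csv = hypothetical_dir/hypothetical_t60.csv
spkid_csv = hypothetical_dir/hypothetical_spkid.csv
"""

report = missing_csv_files(sample_cfg, base_dir="/nonexistent_base_dir")
```

For a real run, the text of the actual `.cfg` file would be passed in and `base_dir` set to the repository root. Any key listed in the result points to a file that has to be restored from the git history or regenerated.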