microsoft / DNS-Challenge

This repo contains the scripts, models, and required files for the Deep Noise Suppression (DNS) Challenge.
Creative Commons Attribution 4.0 International

About generating datasets #99

Open someone4194 opened 2 years ago

someone4194 commented 2 years ago

In noisyspeech_synthesizer_singleprocess.py and pdns_noisyspeech_synthesizer_singleprocess.py, some CSV files (e.g. 'rir_table_csv', 'clean_speech_t60_csv', and 'spkid_csv') are required to generate the training datasets, but I can't find them in the repository. How can I download them?

motus commented 2 years ago

@someone4194 These CSV files were in git and you can still find them in the history (see commits prior to 5b0e929d09abc1b146716e5a138898e3d7a4177b). I guess we just have to update the Python code to not use the data. @hdubey will fix it and let you know.

zhaoyj1122 commented 2 years ago

@motus Hi motus, thanks for your reply. However, the CSV files you mentioned cover only the 16 kHz simulated IR data used in DNS3. Their IR information, such as T60, cannot be used for the 48 kHz simulated/real-recorded IR data in the current DNS4. Could you please release the corresponding CSV files for the RIRs in the following commit? https://github.com/microsoft/DNS-Challenge/commit/cef3b749ba4e6f94423914fc72e1ef66ec7c6fad

someone4194 commented 2 years ago

@motus I found the CSV files, but they cannot be used directly because they were made for the DNS3 dataset.

Additionally, as @zhaoyj1122 mentioned, there is no CSV file for the real-recorded/simulated 48 kHz IR data. Is there any plan to provide new CSV files for the DNS4 dataset, or a revised noisy speech synthesis script?

eagomez2 commented 2 years ago

If I am not mistaken, the SLR26 and SLR28 RIR datasets are originally 16 kHz, so I believe the provided data was resampled from them.

The RIR portion of the DNS Challenge v4 dataset seems to be an upsampled subset of the DNS Challenge v3 one. Deleting the rows that belong only to the previous DNS and replacing the string 16k with 48k should therefore make the CSV usable again.

Another question: what tool does the Challenge team use to estimate the RT60 of these files? At first glance, the values I get with pyroomacoustics differ greatly from those annotated in the .csv file, and I also noticed that some entries have very large numbers (for example, datasets/impulse_responses/SLR26/simulated_rirs_16k/largeroom/Room001/Room001-00014.wav has 774 in the T60_WB column).
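For cross-checking individual T60_WB entries, one option is a Schroeder backward-integration estimate. The sketch below is only an illustration and is not the tool the Challenge team used (the thread does not say what that was). The function name `schroeder_rt60`, the -5 dB to -25 dB fit range (a T20-style fit extrapolated to 60 dB), and the synthetic decay used as input are all assumptions.

```python
import math


def schroeder_rt60(rir, fs, fit_start_db=-5.0, fit_end_db=-25.0):
    """Estimate RT60 in seconds from an impulse response.

    Hypothetical helper: builds the energy decay curve by Schroeder
    backward integration, fits a line to the chosen dB range, and
    extrapolates that slope to a 60 dB decay.
    """
    # Backward cumulative sum of the squared samples (energy decay curve).
    edc = [0.0] * len(rir)
    total = 0.0
    for i in range(len(rir) - 1, -1, -1):
        total += rir[i] * rir[i]
        edc[i] = total
    if not edc or edc[0] <= 0.0:
        raise ValueError("impulse response has no energy")

    # Collect (time, level in dB) points that fall inside the fit range.
    points = []
    for i, e in enumerate(edc):
        if e <= 0.0:
            break  # the curve is non-increasing, so only zeros remain
        level_db = 10.0 * math.log10(e / edc[0])
        if fit_end_db <= level_db <= fit_start_db:
            points.append((i / fs, level_db))
    if len(points) < 2:
        raise ValueError("not enough points in the fit range")

    # Ordinary least-squares slope in dB per second.
    n = len(points)
    mean_t = sum(t for t, _ in points) / n
    mean_db = sum(d for _, d in points) / n
    num = sum((t - mean_t) * (d - mean_db) for t, d in points)
    den = sum((t - mean_t) ** 2 for t, _ in points)
    slope = num / den
    return -60.0 / slope


# Synthetic check: an ideal exponential decay built to have RT60 = 0.5 s.
fs = 8000
target_rt60 = 0.5
decay_rate = 3.0 * math.log(10.0) / target_rt60  # amplitude drops 60 dB in 0.5 s
rir = [math.exp(-decay_rate * n / fs) for n in range(fs)]
print(schroeder_rt60(rir, fs))
```

The result is in seconds, so comparing it with a value such as 774 would first require knowing what unit or convention the T60_WB column uses, which the thread does not state.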

Thanks!
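The workaround suggested in this comment (drop the rows that have no counterpart in DNS4, then replace 16k with 48k in the paths) could be sketched roughly as follows. This is a minimal sketch under assumptions: the function name `adapt_rir_table`, the path column name `wavfile`, and the sample rows with their T60 values are made up for illustration, and the real CSV from the git history may use different column names and path prefixes.

```python
import csv
import io


def adapt_rir_table(src, dst, available_paths, path_column="wavfile"):
    """Copy rows from a DNS3-style RIR table to a new table.

    Hypothetical helper: rewrites "16k" to "48k" in the path column and
    keeps a row only if the rewritten path is in `available_paths`
    (for example, the set of RIR files actually present locally).
    Returns the number of rows kept.
    """
    reader = csv.DictReader(src)
    writer = csv.DictWriter(dst, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    kept = 0
    for row in reader:
        new_path = row[path_column].replace("16k", "48k")
        if new_path in available_paths:
            row[path_column] = new_path
            writer.writerow(row)
            kept += 1
    return kept


# Illustrative input only; the T60 values here are invented.
old_table = io.StringIO(
    "wavfile,T60_WB\n"
    "SLR26/simulated_rirs_16k/largeroom/Room001/Room001-00014.wav,0.8\n"
    "SLR26/simulated_rirs_16k/largeroom/Room001/Room001-00015.wav,0.7\n"
)
present_in_dns4 = {
    "SLR26/simulated_rirs_48k/largeroom/Room001/Room001-00014.wav",
}
new_table = io.StringIO()
rows_kept = adapt_rir_table(old_table, new_table, present_in_dns4)
print(rows_kept)
print(new_table.getvalue())
```

In practice `available_paths` would be built by walking the downloaded impulse response directory, and the T60 values carried over would still be the ones annotated for the 16 kHz files.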

amie-roten commented 2 years ago

Question related to the comment above from @eagomez2: I'm curious what method was used to resample these SLR26/28 RIR datasets from 16 kHz to 48 kHz. Thanks for any information!

veera-puthiran-14082 commented 2 years ago

> @someone4194 I guess we just have to update the Python code to not use the data. @hdubey will fix it and let you know

Is it fixed now?

nicriverhoo commented 2 years ago

Also, if I am not mistaken, the 16 kHz RIRs cannot simply be upsampled and then used for fullband, since an RIR is different from an ordinary sampled signal. Is that right?

sipvoip commented 2 years ago

> @someone4194 These csv files were in git and you can still find them in the history (see commits prior to 5b0e929). I guess we just have to update the Python code to not use the data. @hdubey will fix it and let you know

Any update on this? I am not clear on how to get around this issue.