fgnt / sms_wsj

SMS-WSJ: Spatialized Multi-Speaker Wall Street Journal database for multi-channel source separation and recognition
MIT License

16 kHz dataset #30

Open kfmn opened 5 months ago

kfmn commented 5 months ago

Hello,

Thank you for the effort you put into creating SMS-WSJ. Is there a simple way to generate a similar 16 kHz dataset instead of the 8 kHz one?

Best regards, Maxim

boeddeker commented 5 months ago

Hello,

The code is partly prepared to work with any sample rate, but the defaults are set to 8 kHz.

Here are some guidelines for generating the data at 16 kHz:

Zenodo only hosts the 8 kHz RIRs, so the 16 kHz RIRs have to be generated (this takes some time). To do that, change the following command in the Makefile

    mpiexec -np ${num_jobs} python -m sms_wsj.database.create_rirs database_path=$(RIR_DIR)

to

    mpiexec -np ${num_jobs} python -m sms_wsj.database.create_rirs database_path=$(RIR_DIR) sample_rate=16000 filter_length=16384

and then execute it to generate the RIRs. (Maybe filter_length=None will also work; it would use an automatic filter length and hence be faster.)
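As a rough illustration of where filter_length=16384 might come from: a filter of N samples at sample rate f covers N / f seconds, so doubling the sample rate doubles the number of taps needed for the same duration. The sketch below only does this arithmetic. The helper name `filter_length_for` is made up for illustration, and the assumption that the value was chosen to preserve the RIR duration is not confirmed in this thread.

```python
def filter_length_for(sample_rate: int, duration_s: float) -> int:
    """Number of filter taps needed to cover duration_s seconds at sample_rate Hz.

    Hypothetical helper for illustration; not part of sms_wsj.
    """
    return round(sample_rate * duration_s)


# Duration covered by 16384 taps at 16 kHz.
duration_s = 16384 / 16000

# Taps needed to cover the same duration at each sample rate.
taps_8k = filter_length_for(8000, duration_s)
taps_16k = filter_length_for(16000, duration_s)

print(f"duration: {duration_s} s, 8 kHz taps: {taps_8k}, 16 kHz taps: {taps_16k}")
```

Under that assumption, a 16384-tap filter at 16 kHz spans the same time as an 8192-tap filter at 8 kHz.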

Next, you have to change sample_rate=8000 to sample_rate=16000 in the Makefile.
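The two Makefile edits described above could also be applied with a small script instead of by hand. This is a minimal sketch, assuming the create_rirs command sits on a single line (no backslash continuation) and that every occurrence of sample_rate=8000 in the Makefile should be changed. The function name `patch_makefile_text` and the example Makefile content are hypothetical and do not reproduce the real sms_wsj Makefile.

```python
import re


def patch_makefile_text(text: str) -> str:
    """Return Makefile text with the 16 kHz changes applied.

    Hypothetical helper: appends the 16 kHz arguments to the create_rirs
    command and replaces sample_rate=8000 with sample_rate=16000 everywhere.
    """
    patched_lines = []
    for line in text.splitlines(keepends=True):
        if "sms_wsj.database.create_rirs" in line and "sample_rate=" not in line:
            # Split off the line ending so the new arguments go before it.
            body = line.rstrip("\r\n")
            ending = line[len(body):]
            line = f"{body} sample_rate=16000 filter_length=16384{ending}"
        # \b keeps values such as sample_rate=80000 untouched.
        line = re.sub(r"\bsample_rate=8000\b", "sample_rate=16000", line)
        patched_lines.append(line)
    return "".join(patched_lines)


# Made-up Makefile fragment, only to demonstrate the helper.
example = (
    "rirs:\n"
    "\tmpiexec -np ${num_jobs} python -m sms_wsj.database.create_rirs database_path=$(RIR_DIR)\n"
    "other_step:\n"
    "\tsome_hypothetical_command sample_rate=8000\n"
)
patched = patch_makefile_text(example)
print(patched)
```

To use it on a real file, read the Makefile into a string, pass it through the function, review the result, and write it back only once the diff looks right.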

The rest should then work, except for the ASR code. For ASR at 16 kHz, however, there are nowadays many easy-to-use models.

Best regards, Christoph