jonashaag / speech-enhancement

Collection of papers, datasets and tools on the topic of Speech Dereverberation and Speech Enhancement

May I ask a question about DCUNetTorchSound here? #2

Open iou2much opened 4 years ago

iou2much commented 4 years ago

Hi, Sir. May I ask a question about DCUNetTorchSound, which is in another repo of yours? Thank you.

I notice your custom dataset, MyFastDataset: https://github.com/jonashaag/DCUNetTorchSound/blob/master/src/ds2.py#L109

This class only returns noisy, sources on each iteration.

But it needs waveform_noise for the loss, right? How does it work here?

    loss = loss_sdr(output=estimated_sound,
                    signal_with_noise=waveform_sound_noise,
                    target_signal=waveform,
                    noise=waveform_noise)

jonashaag commented 4 years ago

Hi! You are right. I use a different loss that doesn't require the noise (SI-SDR or STOI). I'm not sure it's possible to get a noise-only source for the kind of training I do (dereverberation, where "noise-only" would be some kind of reverberation-only signal).
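To illustrate why a loss such as SI-SDR needs no separate noise signal, here is a minimal sketch of the standard scale-invariant SDR computation. It is not the code from DCUNetTorchSound or Asteroid; the function names `si_sdr` and `neg_si_sdr_loss` are hypothetical, and the code is kept dependency-free for clarity (in practice it would be vectorised with NumPy or PyTorch).

```python
import math


def si_sdr(estimate, target, eps=1e-8):
    """Scale-invariant SDR in dB between two equal-length sample sequences.

    Only the estimate and the clean target are needed; no noise signal.
    (Hypothetical helper for illustration, not a library API.)
    """
    if len(estimate) != len(target):
        raise ValueError("estimate and target must have the same length")

    # Remove the mean so that a DC offset does not affect the score.
    est_mean = sum(estimate) / len(estimate)
    tgt_mean = sum(target) / len(target)
    est = [e - est_mean for e in estimate]
    tgt = [t - tgt_mean for t in target]

    # Project the estimate onto the target to get the optimally scaled target.
    dot = sum(e * t for e, t in zip(est, tgt))
    tgt_energy = sum(t * t for t in tgt) + eps
    scale = dot / tgt_energy
    scaled_target = [scale * t for t in tgt]

    # Whatever is left in the estimate counts as error.
    error = [e - s for e, s in zip(est, scaled_target)]
    signal_energy = sum(s * s for s in scaled_target)
    error_energy = sum(r * r for r in error)
    return 10.0 * math.log10((signal_energy + eps) / (error_energy + eps))


def neg_si_sdr_loss(estimate, target):
    """Training loss: negative SI-SDR, so lower is better."""
    return -si_sdr(estimate, target)
```

With this definition, a dataset that yields only (noisy, sources) pairs is sufficient: the model output is compared directly against the clean source.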

iou2much commented 4 years ago

Thank you for your reply.

I've tried your recent PR for Asteroid and got these results so far. I use SingleSrcNegSDR as the loss function with the DCUNet-20 architecture, training on dns_challenge from the Asteroid egs. After epoch 80, the loss hovers around -24.0. Does that make sense to you?

image

Very nice work, by the way 👍

jonashaag commented 4 years ago

Nice! I think that is a very good result. Please share the pretrained model with Asteroid if you can! Or, if you don’t have time, just upload it here and I will do it for you.

jonashaag commented 4 years ago

Btw your experiment folder says it’s large DCUNet but it’s not :-)

iou2much commented 4 years ago

Thank you. Let me try to upload the model here, as I'm quite new to Asteroid.

Btw, the folder name says large DCUNet because I tried that first and manually set the tag value in run.sh. But it was too slow on my server, and I eventually gave up on it. So it is actually just DCUNet-20 :)

Also, may I ask: in your experience, is SingleSrcNegSDR the best loss function for DCUNet, or did you try something different?

iou2much commented 4 years ago

Oh, one more thing about the trained model. I tried it on some real recorded data, and it removes the DNS-style background noise quite well, but that data does not contain much reverberation. So I tested it on my own data, recorded in a classroom with quite a lot of reverberation. I found that it does not dereverberate well, and it even degrades some far-field speech.

I suspect this is due to the training data: is there not much reverberant data in DNS? Next I might try using pyroomacoustics to simulate more data that is closer to my scenario.

image

jonashaag commented 4 years ago

Join the Asteroid Slack, let's discuss there!

jonashaag commented 4 years ago

Btw, I recommend using real RIRs instead of simulated ones. Also, are you looking for denoising in reverberant conditions or for dereverberation? Those are very different things.
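One common way to read the difference between the two tasks is in terms of which signal serves as the training target. The sketch below is an assumption for illustration, not the pipeline used in this thread: the helper names `convolve` and `make_training_pairs`, the `noise_gain` parameter, and the choice of targets are all hypothetical. It reverberates dry speech with a room impulse response (measured or simulated) and builds one (input, target) pair per task. It also assumes the direct path of the RIR sits at index 0, so trimming keeps input and target aligned.

```python
def convolve(signal, rir):
    """Full linear convolution of two sample sequences (plain Python, O(N*M))."""
    out = [0.0] * (len(signal) + len(rir) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(rir):
            out[i + j] += s * h
    return out


def make_training_pairs(dry_speech, rir, noise, noise_gain=0.1):
    """Build (input, target) pairs for two different tasks.

    Hypothetical helper: the task definitions are an illustrative assumption.
    """
    if len(noise) < len(dry_speech):
        raise ValueError("noise must be at least as long as dry_speech")

    # Reverberate, then trim to the dry length so that samples stay aligned
    # (assumes the RIR's direct path is at index 0).
    reverberant = convolve(dry_speech, rir)[: len(dry_speech)]
    noisy_reverberant = [r + noise_gain * n for r, n in zip(reverberant, noise)]

    return {
        # Dereverberation: remove the room, target is the dry speech.
        "dereverberation": (reverberant, list(dry_speech)),
        # Denoising in reverberant conditions: keep the room, remove the noise.
        "denoising_in_reverb": (noisy_reverberant, reverberant),
    }
```

A model trained only on the second kind of pair has no incentive to remove reverberation, which would be consistent with the classroom observation above. For realistic signal lengths, an FFT-based convolution would replace the double loop.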

jonashaag commented 4 years ago

Also have a look at the training data generation/augmentation techniques described here https://arxiv.org/pdf/2008.04470.pdf