vbelz / Speech-enhancement

Deep learning for audio denoising
MIT License

Question about the inputs and the outputs of the model #8

Open OlgaFomin opened 3 years ago

OlgaFomin commented 3 years ago

Hey, first of all, your code is great! It worked for me, and it is very simple and clear 👍 One question: in your model, Xin is spectrogram(noisy_voice) and Xout is spectrogram(noisy_voice) - spectrogram(voice). I didn't understand why you do the subtraction, so I tried setting Xout to spectrogram(voice) instead, but then the loss showed underfitting. Do you know why that happens?

Thanks again! Olga :)

vbelz commented 3 years ago

Hi Olga :)

Thanks for the message!

Indeed, I am modelling spectrogram(noisy_voice) - spectrogram(voice) as the "noise pattern in the noisy_voice spectrogram" to subtract.

The reason is that it is much simpler to model the noise to remove from the noisy_voice spectrogram than to try to reconstruct the clean voice directly. This might explain why it was problematic in your case.
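
To make the two target choices concrete, here is a minimal NumPy sketch. It is not taken from the repository: the helper names `make_training_pair` and `denoise`, the array shapes, and the random data are illustrative assumptions, and the "model" is replaced by a perfect prediction just to show how the pieces fit together.

```python
import numpy as np


def make_training_pair(noisy_spec, clean_spec):
    """Build (Xin, Xout) with the noise pattern as the target.

    Hypothetical helper: Xin is the noisy spectrogram, and Xout is
    spectrogram(noisy_voice) - spectrogram(voice).
    """
    x_in = noisy_spec
    x_out = noisy_spec - clean_spec
    return x_in, x_out


def denoise(noisy_spec, predicted_noise):
    """Recover a voice estimate by subtracting the predicted noise pattern."""
    return noisy_spec - predicted_noise


# Illustrative data only: random arrays standing in for magnitude
# spectrograms of shape (frequency bins, time frames).
rng = np.random.default_rng(0)
voice_spec = rng.random((128, 128))
noisy_spec = voice_spec + 0.3 * rng.random((128, 128))

x_in, x_out = make_training_pair(noisy_spec, voice_spec)

# Stand-in for a trained network: assume it predicts the target exactly.
predicted_noise = x_out
voice_estimate = denoise(x_in, predicted_noise)
```

With a perfect prediction, `voice_estimate` equals `voice_spec`, so training on the noise pattern loses no information compared with training on spectrogram(voice) directly; only the quantity the network has to output changes.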

Hope it will help,

kind regards,

Vincent
