meokz / looking-to-listen

Deep neural network (DNN) for noise reduction, removal of background music, and speech separation
MIT License

Example of output #6

Open nanometer34688 opened 4 years ago

nanometer34688 commented 4 years ago

Hi,

Is it possible to see what kind of output your model has produced?

I have tried several other repos that tackle the same problem, and the audio files they output do not seem to be easy to understand.

Do you have a couple of inputs and the outputs that you are willing to share?

Thanks

meokz commented 4 years ago

The noisy input audio (1D) is transformed by the STFT into a time-frequency (TF) map (2D). The network takes the TF map as input and outputs a TF mask. We multiply the noisy input map element-wise by the mask generated by the network to obtain a clean map (2D). Finally, the clean map is converted back to an audio signal (1D) by the iSTFT.
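
For anyone who wants to try the pipeline without the trained model, here is a minimal sketch of the four steps using `scipy.signal`. It is not the repo's code. The function names `denoise` and `placeholder_mask`, the 16 kHz sample rate, and the 512-sample window are all assumptions made for illustration, and `placeholder_mask` is only a crude stand-in for the network (it keeps the bins whose magnitude is at or above the median). A real-valued mask applied to the complex STFT is also an assumption; the actual model may produce a different kind of mask.

```python
import numpy as np
from scipy.signal import stft, istft


def placeholder_mask(tf_map):
    """Hypothetical stand-in for the network: keep bins at or above the median magnitude."""
    magnitude = np.abs(tf_map)
    return (magnitude >= np.median(magnitude)).astype(np.float64)


def denoise(noisy, mask_fn=placeholder_mask, fs=16000, nperseg=512):
    """STFT -> mask -> element-wise multiply -> iSTFT (illustrative sketch only)."""
    # 1D audio -> 2D complex time-frequency map (frequency bins x frames)
    _, _, noisy_map = stft(noisy, fs=fs, nperseg=nperseg)

    # The trained network would go here; it returns a mask shaped like the map
    mask = mask_fn(noisy_map)

    # Element-wise multiplication gives the "clean" time-frequency map
    clean_map = noisy_map * mask

    # 2D map -> 1D audio; the iSTFT output can be slightly longer than the
    # input because of padding, so trim it back to the original length
    _, clean = istft(clean_map, fs=fs, nperseg=nperseg)
    return clean[: len(noisy)]


if __name__ == "__main__":
    # Synthetic example: a 440 Hz tone plus white noise (assumed test signal)
    rng = np.random.default_rng(0)
    t = np.arange(16000) / 16000
    noisy = np.sin(2 * np.pi * 440 * t) + 0.3 * rng.standard_normal(t.size)
    clean = denoise(noisy)
    print(noisy.shape, clean.shape)
```

To hear the result, the returned array can be written to disk with `scipy.io.wavfile.write`. Swapping `mask_fn` for a function that runs the trained model on the TF map should give the model's actual output, provided the STFT settings match the ones used in training.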