Open nanometer34688 opened 4 years ago
The noisy input audio (1D) is transformed by the STFT into a time-frequency (TF) map (2D). The network takes the TF map as input and outputs a mask of the same shape. We multiply the noisy input map element-wise by the mask generated by the network to obtain a clean map (2D). Finally, the clean map is converted back to an audio signal (1D) by the iSTFT.
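For readers who want to see the data flow, here is a minimal NumPy sketch of the STFT, mask, iSTFT pipeline described above. It is not the repository's code: the names `stft`, `istft`, `denoise` and `predict_mask`, the Hann window, and the values `n_fft=512` and `hop=128` are all assumptions made for illustration, and `predict_mask` is a hypothetical stand-in for the trained network.

```python
import numpy as np

def stft(x, n_fft=512, hop=128):
    """Return a complex TF map of shape (frames, n_fft // 2 + 1)."""
    window = np.hanning(n_fft + 1)[:-1]      # periodic Hann window (assumed)
    x = np.pad(x, (n_fft, n_fft))            # pad so every sample is covered by several frames
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack([x[i * hop:i * hop + n_fft] * window for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1)

def istft(tf_map, length, n_fft=512, hop=128):
    """Invert stft() by weighted overlap-add and return `length` samples."""
    window = np.hanning(n_fft + 1)[:-1]
    frames = np.fft.irfft(tf_map, n=n_fft, axis=1) * window
    out = np.zeros(n_fft + hop * (len(frames) - 1))
    norm = np.zeros_like(out)
    for i, frame in enumerate(frames):
        out[i * hop:i * hop + n_fft] += frame
        norm[i * hop:i * hop + n_fft] += window ** 2
    out /= np.maximum(norm, 1e-8)            # undo the squared-window weighting
    return out[n_fft:n_fft + length]         # drop the padding added in stft()

def denoise(noisy, predict_mask, n_fft=512, hop=128):
    """noisy audio (1D) -> TF map (2D) -> masked TF map (2D) -> audio (1D)."""
    noisy_map = stft(noisy, n_fft, hop)
    mask = predict_mask(np.abs(noisy_map))   # hypothetical: the network would go here
    clean_map = noisy_map * mask             # element-wise masking
    return istft(clean_map, len(noisy), n_fft, hop)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    noisy = rng.standard_normal(16000)       # one second of noise at an assumed 16 kHz
    # An all-ones mask passes the signal through unchanged.
    restored = denoise(noisy, lambda mag: np.ones(mag.shape))
    print(np.max(np.abs(restored - noisy)))
```

With an all-ones mask the round trip should reproduce the input up to floating-point error, which is a quick way to check the STFT/iSTFT pair before plugging in a real model. A real-valued mask like this one only scales magnitudes and reuses the noisy phase.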
Hi,
Is it possible to see what kind of output your model has produced?
I have tried some other repos that tackle the same problem, and the audio files they output are not easy to understand.
Do you have a couple of input files and the corresponding outputs that you are willing to share?
Thanks