mmxuan18 closed this issue 5 years ago
Well, most of the methods in computer vision probe which regions of the image affect the network's predictions the most. I guess you can do something similar on spectrograms as well, but I have never tried it myself.
Best, Weidi
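For anyone who wants to try the "probe which region matters" idea on a spectrogram, here is a minimal sketch using occlusion sensitivity, which is a simpler stand-in for Grad-CAM and not something taken from this repo. It blanks out one time-frequency patch at a time and records how much the score drops. The names `occlusion_map`, `score_fn`, and `toy_score` are hypothetical; in practice `score_fn` would wrap your own model (for example, the similarity to the target speaker's embedding), and you would use NumPy or PyTorch arrays rather than plain lists.

```python
def occlusion_map(spec, score_fn, patch_f=2, patch_t=2, fill=0.0):
    """Return a heat map of score drops with the same shape as `spec`.

    spec:     list of rows (frequency bins), each a list of floats (time frames)
    score_fn: hypothetical callable mapping a spectrogram to a scalar score
    """
    n_f, n_t = len(spec), len(spec[0])
    base = score_fn(spec)  # score on the unmodified input
    heat = [[0.0] * n_t for _ in range(n_f)]
    for f0 in range(0, n_f, patch_f):
        for t0 in range(0, n_t, patch_t):
            f1, t1 = min(f0 + patch_f, n_f), min(t0 + patch_t, n_t)
            # Copy the spectrogram and blank out one time-frequency patch.
            occluded = [row[:] for row in spec]
            for f in range(f0, f1):
                for t in range(t0, t1):
                    occluded[f][t] = fill
            # A large drop means the model relied on this patch.
            drop = base - score_fn(occluded)
            for f in range(f0, f1):
                for t in range(t0, t1):
                    heat[f][t] = drop
    return heat


def toy_score(spec):
    # Stand-in for a real model: only energy in frequency bins 2-3 counts.
    return sum(sum(row) for row in spec[2:4])


spec = [[1.0] * 6 for _ in range(4)]  # 4 frequency bins x 6 time frames
heat = occlusion_map(spec, toy_score)
```

With the toy scorer, patches in bins 0-1 get a drop of 0 and patches in bins 2-3 get a positive drop, so the map highlights the band the "model" depends on. Comparing such maps across several utterances of the same speaker would be one rough way to look for regions that matter consistently.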
In computer vision, there are tools such as CAM and Grad-CAM for visualizing what the network has learned for the final classification. In speaker recognition, how can I analyze which parts of the input activate the output, so that I can say directly that the network has learned a good feature? What do the input spectrograms have in common across different contexts for the same speaker?
An example of Grad-CAM on audio