ibab / tensorflow-wavenet

A TensorFlow implementation of DeepMind's WaveNet paper
MIT License

Accessing hidden layer output #236

Open codaich opened 7 years ago

codaich commented 7 years ago

Is it possible to access the output of inner network layers using this codebase and, if so, how?

I ask because we’re interested in 1) training a network on certain kinds of audio and then using the trained network to generate new similar audio (much like the babbling or classical piano generation described in https://deepmind.com/blog/wavenet-generative-model-raw-audio/) and 2) seeing what effect each network layer is having on the audio that is generated (much like Google did with respect to images in https://research.googleblog.com/2015/06/inceptionism-going-deeper-into-neural.html, where they “pick a layer and ask the network to enhance whatever it detected”).

Related to this, is there a particular network architecture that might be suited to doing this? For example, would inner layers to be examined this way need to be of the same size as the final layer?

Thanks!

veqtor commented 7 years ago

Short answer: Yes... Maybe...

Longer answer: Perhaps compiling it to a portable model and using that to loop over data could be a solution. What does this mean in practice? If we were to use scalar input, you could apply the batch loss function to an entire audio clip, initialised with noise, and do gradient ascent on that clip to maximise the activation of a certain layer or neuron. It would basically be the same as the DeepDream tutorial: https://github.com/tensorflow/tensorflow/blob/master/tensorflow/examples/tutorials/deepdream/deepdream.ipynb
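To make the gradient ascent idea concrete, here is a minimal pure-Python sketch. It does not use this repository's code: `dilated_causal_layer` is a hypothetical one-unit stand-in for a hidden layer, the weights and dilation are made up, and the gradient is written out by hand instead of coming from TensorFlow. It also assumes scalar input clipped to [-1, 1].

```python
import math
import random

def dilated_causal_layer(x, w_past, w_now, dilation):
    """Hypothetical toy layer: h[t] = tanh(w_past * x[t - dilation] + w_now * x[t])."""
    h = []
    for t in range(len(x)):
        past = x[t - dilation] if t >= dilation else 0.0  # zero padding keeps it causal
        h.append(math.tanh(w_past * past + w_now * x[t]))
    return h

def mean_activation(x, w_past, w_now, dilation):
    """Scalar objective: average activation of the toy layer over the clip."""
    h = dilated_causal_layer(x, w_past, w_now, dilation)
    return sum(h) / len(h)

def activation_gradient(x, w_past, w_now, dilation):
    """Hand-derived gradient of mean_activation with respect to every input sample."""
    n = len(x)
    h = dilated_causal_layer(x, w_past, w_now, dilation)
    grad = [0.0] * n
    for t in range(n):
        dtanh = (1.0 - h[t] ** 2) / n  # derivative of tanh, scaled by the mean
        grad[t] += dtanh * w_now
        if t >= dilation:
            grad[t - dilation] += dtanh * w_past
    return grad

def dream(x, w_past, w_now, dilation, steps=200, lr=0.5):
    """Gradient ascent on the input, clipped to the assumed [-1, 1] sample range."""
    x = list(x)
    for _ in range(steps):
        g = activation_gradient(x, w_past, w_now, dilation)
        x = [min(1.0, max(-1.0, xi + lr * gi)) for xi, gi in zip(x, g)]
    return x

random.seed(0)
noise = [random.uniform(-0.1, 0.1) for _ in range(64)]  # clip initialised with noise
W_PAST, W_NOW, DILATION = 0.8, -0.5, 4                  # made-up toy parameters

before = mean_activation(noise, W_PAST, W_NOW, DILATION)
dreamed = dream(noise, W_PAST, W_NOW, DILATION)
after = mean_activation(dreamed, W_PAST, W_NOW, DILATION)
print("mean activation before:", before, "after:", after)
```

Because the objective here is a scalar reduction (the mean) of the layer's output, nothing in this sketch requires the inner layer to have the same size as the final layer. With a real model you would presumably let the framework compute the gradient of the chosen layer's activation with respect to the input rather than deriving it by hand.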

However, since WaveNet is autoregressive, perhaps doing some form of style transfer could yield interesting results. I don't understand the process well enough to implement it, but I'm guessing layers 5-20 or so should carry "timbral style" information.

Another interesting thing to do would be to adjust an input (noise or a sound file), using gradient ascent, so as to maximise how closely the output matches another sound file, given various global conditions, etc.
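A rough sketch of that last idea, again in pure Python and not based on this repository's code: `layer` is a hypothetical toy stand-in for the network, the weights are made up, and global conditioning is left out entirely. Maximising the match is written here as minimising a mean squared mismatch, which is the same as gradient ascent on its negative.

```python
import math

def layer(x, w_past=0.8, w_now=-0.5, dilation=2):
    """Hypothetical toy layer standing in for the trained network's output."""
    return [math.tanh(w_now * x[t] + (w_past * x[t - dilation] if t >= dilation else 0.0))
            for t in range(len(x))]

def match_loss(x, target_out):
    """Mean squared mismatch between the toy output for x and the target output."""
    out = layer(x)
    return sum((o - y) ** 2 for o, y in zip(out, target_out)) / len(out)

def match_gradient(x, target_out, w_past=0.8, w_now=-0.5, dilation=2):
    """Hand-derived gradient of match_loss with respect to every input sample."""
    n = len(x)
    out = layer(x, w_past, w_now, dilation)
    grad = [0.0] * n
    for t in range(n):
        # chain rule through out[t] = tanh(z_t)
        dz = 2.0 * (out[t] - target_out[t]) * (1.0 - out[t] ** 2) / n
        grad[t] += dz * w_now
        if t >= dilation:
            grad[t - dilation] += dz * w_past
    return grad

def fit_input(x0, target_out, steps=500, lr=2.0):
    """Plain gradient descent on the input samples."""
    x = list(x0)
    for _ in range(steps):
        g = match_gradient(x, target_out)
        x = [xi - lr * gi for xi, gi in zip(x, g)]
    return x

# Made-up "other sound file": a short sine wave passed through the toy layer.
target_sound = [0.5 * math.sin(2 * math.pi * t / 16) for t in range(32)]
target_out = layer(target_sound)

start = [0.0] * 32  # starting input; noise or another clip would also work
loss_before = match_loss(start, target_out)
fitted = fit_input(start, target_out)
loss_after = match_loss(fitted, target_out)
print("mismatch before:", loss_before, "after:", loss_after)
```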

codaich commented 7 years ago

Thanks, much appreciated.

ujal commented 6 years ago

@codaich Did anything interesting come out of maximizing the activity of hidden layers/neurons?