infocusp / tf_cnnvis

CNN visualization tool in TensorFlow
MIT License

some question about this code #6

Closed baodingge closed 7 years ago

baodingge commented 7 years ago

        # creating placeholders to pass featuremaps and
        # creating gradient ops
        featuremap = [tf.placeholder(tf.int32) for i in range(config["N"])]
        reconstruct = [tf.gradients(tf.transpose(tf.transpose(op_tensor)[featuremap[i]]), X)[0] for i in range(config["N"])]

        # Execute the gradient operations in batches of 'n'
        for i in range(0, tensor_shape[-1], config["N"]):
            c = 0
            for j in range(config["N"]):
                if (i + j) < tensor_shape[-1]:
                    feed_dict[featuremap[j]] = i + j
                    c += 1
            if c > 0:
                out.extend(sess.run(reconstruct[:c], feed_dict = feed_dict))

I do not understand this code. Why is it written this way? I cannot see any deconvolution or unpooling operations.

falaktheoptimist commented 7 years ago

Thanks for asking this question. It'll be a useful resource for many future explorers. So, here's the explanation:

According to this paper by Dr. Zeiler, the deconvolution operation is approximated as convolution with the transpose of the kernel (the weight matrix); see Section 2.1, Filtering. In TensorFlow, the gradient of a convolution's output with respect to its input is defined as the convolution of that output with the transposed kernel (see the TF documentation on the gradient operation). This makes deconvolution and the gradient of convolution one and the same operation.
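To make that concrete, here is a minimal sketch (not from the repo; shapes, names, and the use of channel slicing are illustrative, and it assumes TensorFlow 1.x as used by tf_cnnvis). Taking the gradient of one convolution feature map with respect to the input is what realizes the "deconvolution":

    import numpy as np
    import tensorflow as tf  # assumes TensorFlow 1.x

    # Hypothetical shapes for illustration only.
    X = tf.placeholder(tf.float32, shape=[1, 8, 8, 1])           # input image
    K = tf.constant(np.random.rand(3, 3, 1, 4), tf.float32)      # conv kernel
    Y = tf.nn.conv2d(X, K, strides=[1, 1, 1, 1], padding="SAME") # forward conv

    # Gradient of one output feature map w.r.t. the input: TensorFlow computes
    # this as a convolution with the transposed (flipped) kernel, which is
    # exactly the "deconvolution" of Zeiler & Fergus.
    recon = tf.gradients(Y[..., 0], X)[0]

    with tf.Session() as sess:
        image = np.random.rand(1, 8, 8, 1).astype(np.float32)
        print(sess.run(recon, feed_dict={X: image}).shape)  # (1, 8, 8, 1)

This is the same idea as the snippet quoted above, where `tf.transpose(tf.transpose(op_tensor)[featuremap[i]])` selects the feature map whose gradient is taken.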

The unpooling operation is explained in the paper (Section 2.1) as follows:

In the convnet, the max pooling operation is non-invertible, however we can obtain an approximate inverse by recording the locations of the maxima within each pooling region in a set of switch variables. In the deconvnet, the unpooling operation uses these switches to place the reconstructions from the layer above into appropriate locations, preserving the structure of the stimulus.

We realized that this too is the same as computing the gradient of the max pooling operation and backpropagating the previous layer's gradients. The gradient of the max operation is also a switch, with 1 at the maximum and 0 elsewhere, so using the gradient directly works for max pooling as well.
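A tiny sketch of this point (again assuming TensorFlow 1.x; the shapes are made up): the gradient of max pooling routes the incoming signal to the location of each regional maximum and zeros everything else, which is exactly the "switches" described in the paper.

    import numpy as np
    import tensorflow as tf  # assumes TensorFlow 1.x

    X = tf.placeholder(tf.float32, shape=[1, 4, 4, 1])
    P = tf.nn.max_pool(X, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding="SAME")

    # Gradient of the pooled output w.r.t. the input: 1 at each regional
    # maximum, 0 elsewhere -- an approximate unpooling via switches.
    unpool = tf.gradients(P, X)[0]

    with tf.Session() as sess:
        image = np.arange(16, dtype=np.float32).reshape(1, 4, 4, 1)
        print(sess.run(unpool, feed_dict={X: image})[0, :, :, 0])
        # Only the four maxima (one per 2x2 pooling region) receive gradient 1.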

falaktheoptimist commented 7 years ago

Closing this now. Reopen if you've any further queries.

wstang35 commented 5 years ago

Hi falaktheoptimist!

Here is how I understand the deconvolution operation: in order to get the reconstructed input, we should compute activation * transpose of kernel.

So I am wondering why the code uses the "gradient" rather than "gradient * activation", because the gradient is just 1 * transpose of kernel.

Hope you can see this.

And I really appreciate the nice project!

wstang35 commented 5 years ago

@falaktheoptimist @BhagyeshVikani