galeone / dynamic-training-bench

Simplify the training and tuning of Tensorflow models
Mozilla Public License 2.0

Feature extraction from the convolutional autoencoder #10

Closed JustWon closed 7 years ago

JustWon commented 7 years ago

Hello, I was able to feed my training data to the CAE model and train it. After training, how can I extract a feature representation for an image? In SingleLayerCAE.py there is a "get" function, and I found an "encoding" variable in the "encode" namespace. Is this related?

galeone commented 7 years ago

See this: https://github.com/galeone/dynamic-training-bench/issues/6
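
For reference, a minimal TF 1.x sketch of that approach: restore the trained graph and evaluate the encoding tensor. The checkpoint path and tensor names below are assumptions, inspect your own graph for the actual ones.

```python
import numpy as np
import tensorflow as tf

# Assumption: hypothetical checkpoint directory, adapt to your setup
checkpoint = tf.train.latest_checkpoint("log/SingleLayerCAE/best")

with tf.Graph().as_default() as graph, tf.Session() as sess:
    # Restore the graph definition and the trained weights
    saver = tf.train.import_meta_graph(checkpoint + ".meta")
    saver.restore(sess, checkpoint)

    # Assumption: hypothetical tensor names, check graph.get_operations()
    images = graph.get_tensor_by_name("images:0")
    encoding = graph.get_tensor_by_name("SingleLayerCAE/encode/encoding:0")

    # Replace with a real image batch of shape (1, 128, 128, 3)
    image_batch = np.zeros((1, 128, 128, 3), dtype=np.float32)
    encoded_representation = sess.run(encoding, feed_dict={images: image_batch})
    print(encoded_representation.shape)
```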

JustWon commented 7 years ago

Thanks to that issue, I was able to get the "encoded_representation". At this point I have another question. The shape of my input is (1, 128, 128, 3), but the shape of "encoded_representation" is (1, 130, 130, 32), which is larger than the input. I thought a convolutional autoencoder produced a reduced-dimension representation of an image. Am I correct? Thanks for your generous support.

galeone commented 7 years ago

You get a larger feature map because a CAE that does not use a 2D transposed convolution to "decode" its input needs the feature map it decodes to have enough border for the decoding convolution to strip away, so that the reconstruction has the same size as the input image.

In your case this means that if the input image is (128, 128, 3), when you "encode" it you first zero-pad the image so that the encoder output is (130, 130, D), because you need the output of the decoding convolution to be (128, 128, 3).

Read this: https://pgaleone.eu/neural-networks/2016/11/24/convolutional-autoencoders/#encode

JustWon commented 7 years ago

[image attachment]

This figure represents what I understood... Am I right?

galeone commented 7 years ago

Your input is first padded in order to have the right size. So 128 becomes 132 (2 zeros per side) -> convolution -> 130 -> convolution -> 128 (output).
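
A quick way to verify this chain of sizes (a sketch with dummy weights, assuming 3x3 kernels, stride 1, and VALID padding, not the dtb code itself):

```python
import tensorflow as tf

x = tf.zeros((1, 128, 128, 3))                       # input batch
x_pad = tf.pad(x, [[0, 0], [2, 2], [2, 2], [0, 0]])  # 2 zeros per side: 132

# Encode: 3x3 VALID convolution, 132 - 3 + 1 = 130
w_enc = tf.zeros((3, 3, 3, 32))
encoding = tf.nn.conv2d(x_pad, w_enc, strides=[1, 1, 1, 1], padding="VALID")
print(encoding.shape)  # (1, 130, 130, 32)

# Decode: 3x3 VALID convolution, 130 - 3 + 1 = 128
w_dec = tf.zeros((3, 3, 32, 3))
output = tf.nn.conv2d(encoding, w_dec, strides=[1, 1, 1, 1], padding="VALID")
print(output.shape)  # (1, 128, 128, 3), same as the input
```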

By the way, you shouldn't look at the output spatial extent (in this case 130x130), because what you are learning are the convolutional filters, which are 3x3x32 -> this is the "compressed version" of your image: these parameters learn to represent the input in a low-dimensional space (a space of dimension 3x3x32).
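
To look at those learned parameters instead of the activation map, after restoring the model you can list the variables under the encode scope (the scope name here is an assumption):

```python
# Sketch: list the learned encoder filters after restoring the model.
# A 3x3 kernel over 3 input channels with 32 output maps has shape
# (3, 3, 3, 32).
for var in tf.trainable_variables():
    if "encode" in var.name:
        print(var.name, var.shape)
```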

JustWon commented 7 years ago

The compressed version of my image, 3x3x32, is exactly what I want. But the "encoded_representation" variable from #6 is not that: its size is 130x130x32. Do I need to do some post-processing on "encoded_representation" to get the compressed feature of the image?

galeone commented 7 years ago

Convolutional autoencoders do not compress the whole image into a lower-dimensional space; instead, they compress local patches of the input image (wherever the convolution is applied). If you need a compressed representation of the whole input image, you shouldn't use a convolutional autoencoder: use a fully connected autoencoder instead, which can capture the entire content of the input image and not only the content of small patches.
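
For illustration, a minimal sketch of such a fully connected encoder (the layer sizes are arbitrary assumptions): every unit of the code sees the whole image, not just a 3x3 patch.

```python
import tensorflow as tf

x = tf.zeros((1, 128, 128, 3))             # input image batch
flat = tf.reshape(x, (1, 128 * 128 * 3))   # flatten the whole image: (1, 49152)

# One fully connected layer mapping the whole image to a 256-d code
w = tf.zeros((128 * 128 * 3, 256))         # learned weights (dummy zeros here)
b = tf.zeros((256,))
code = tf.nn.tanh(tf.matmul(flat, w) + b)  # whole-image code: (1, 256)
print(code.shape)
```

A symmetric decoder (a matmul back to 49152 units plus a reshape) would then reconstruct the image from this single 256-d vector.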


galeone commented 7 years ago

FYI: I simplified the feature extraction part, see #13