I could get an "encoded_representation" thanks to the article. At this point, I have another question. The shape of my input is (1, 128, 128, 3), but the shape of "encoded_representation" is (1, 130, 130, 32). It is bigger than the input... I thought a convolutional autoencoder could produce a reduced-dimension representation of an image. Am I correct? Thanks for your generous support.
You got a bigger feature map because a CAE that does not use a 2D transposed convolution to "decode" its input requires the feature map to decode to have enough border for the convolution operation to strip away, so that the reconstruction has the same size as the input image.
In your case this means that when you "encode" a (128, 128, 3) input image, you first zero-pad it so that the encoder output is (130, 130, D), because the output of the decoding convolution must be (128, 128, 3).
Read this: https://pgaleone.eu/neural-networks/2016/11/24/convolutional-autoencoders/#encode
This figure represents what I understood... Am I right?
Your input is first padded in order to have the right size. So 128 becomes 132 (2 zeros per side) -> convolution -> 130 -> convolution -> 128 (output).
By the way, you shouldn't look at the output spatial extent (in this case 130x130), because what you are learning are the convolutional filters, which are 3x3x32 -> this is the "compressed version" of your image: these parameters learn to represent the input in a low-dimensional space (a space of dimension 3x3x32).
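For concreteness, here is a minimal shape check in TensorFlow 1.x style (the 3x3 filters, 32 channels, and 2 pixels of padding per side are the numbers from this thread; the variable names are made up):

```python
import tensorflow as tf  # TensorFlow 1.x, the version used by dynamic-training-bench

x = tf.placeholder(tf.float32, [1, 128, 128, 3])     # input image batch
# Zero-pad 2 pixels per side: 128 -> 132
x_pad = tf.pad(x, [[0, 0], [2, 2], [2, 2], [0, 0]])  # (1, 132, 132, 3)

# Encode: 3x3 convolution, 32 output channels, VALID padding: 132 -> 130
w_enc = tf.get_variable("w_enc", [3, 3, 3, 32])
encoding = tf.nn.conv2d(x_pad, w_enc, [1, 1, 1, 1], "VALID")  # (1, 130, 130, 32)

# Decode: 3x3 convolution back to 3 channels, VALID padding: 130 -> 128
w_dec = tf.get_variable("w_dec", [3, 3, 32, 3])
reconstruction = tf.nn.conv2d(encoding, w_dec, [1, 1, 1, 1], "VALID")  # (1, 128, 128, 3)

print(encoding.shape)        # (1, 130, 130, 32): the large spatial feature map
print(w_enc.shape)           # (3, 3, 3, 32): the parameters that are actually learned
print(reconstruction.shape)  # (1, 128, 128, 3): same size as the input
```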
The compressed version of my image, 3x3x32, is exactly what I want. But the "encoded_representation" variable from #6 is not that: its size is 130x130x32. Do I need to do some post-processing on "encoded_representation" to get the compressed feature of the image?
Convolutional autoencoders do not compress the whole image into a lower-dimensional space; instead, they compress local patches of the input image (wherever the convolution is applied). If you need a compressed representation of the whole input image, you shouldn't use a convolutional autoencoder; use a fully connected autoencoder instead, which can capture the entire content of the input image and not only the content of small patches.
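As a sketch of what such a fully connected autoencoder could look like (the 256-dimensional code size and the tanh activation are arbitrary choices, not taken from the repository):

```python
import tensorflow as tf  # TensorFlow 1.x style

x = tf.placeholder(tf.float32, [None, 128, 128, 3])
flat = tf.reshape(x, [-1, 128 * 128 * 3])                 # flatten the whole image
code = tf.layers.dense(flat, 256, activation=tf.nn.tanh)  # single global 256-d code
decoded = tf.layers.dense(code, 128 * 128 * 3)
reconstruction = tf.reshape(decoded, [-1, 128, 128, 3])
loss = tf.reduce_mean(tf.square(reconstruction - x))      # reconstruction error
```

Here `code` is one vector describing the entire image, unlike the convolutional case where the encoding keeps a spatial map of per-patch responses.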
FYI: I simplified the feature extraction part, see #13
Hello, I could feed my training data to the CAE model and train it. After training, how can I extract a feature for an image? In SingleLayerCAE.py there is the "get" function, and I found the "encoding" variable in the "encode" namespace. Is this related?
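As a rough sketch of what fetching that tensor could look like in plain TensorFlow 1.x (the checkpoint path and the tensor names "inputs:0" and "encode/encoding:0" are guesses based on this thread, not verified against SingleLayerCAE.py):

```python
import numpy as np
import tensorflow as tf

# Restore the trained graph from a checkpoint (path is hypothetical).
saver = tf.train.import_meta_graph("model.ckpt.meta")
graph = tf.get_default_graph()
inputs = graph.get_tensor_by_name("inputs:0")             # assumed input placeholder
encoding = graph.get_tensor_by_name("encode/encoding:0")  # tensor in the "encode" scope

with tf.Session() as sess:
    saver.restore(sess, "model.ckpt")
    image = np.zeros((1, 128, 128, 3), dtype=np.float32)  # stand-in for a real image
    features = sess.run(encoding, feed_dict={inputs: image})
    print(features.shape)
```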