nanopony / keras-convautoencoder

Keras autoencoders (convolutional/fcc) [proof of concept]

Multi-channel input, single channel output #4

Open el3ment opened 8 years ago

el3ment commented 8 years ago

@nanopony: I have a stack of 8 previous frames that I use to predict the next single frame, passing them in as an input with 8 channels. But when I specify that I want only a 1-channel output, the resulting network still produces an 8-channel output, even though the summary table suggests it will produce a 1-channel output.

I suspect this is because the deconvolution layer only applies the transpose of the bound convolution layer, without taking the nb_out_channels parameter into account. Is there some way to achieve an 8-channel-input to 1-channel-output deconvolution in this way?

Initial input shape: (None, 8, 64, 64)
--------------------------------------------------------------------------------
Layer (name)                        Output Shape                 Param #
--------------------------------------------------------------------------------
ZeroPadding2D (zeropadding2d)       (None, 8, 66, 66)            0
Convolution2D (convolution2d)       (None, 64, 64, 64)           4672
ZeroPadding2D (zeropadding2d)       (None, 64, 66, 66)           0
Convolution2D (convolution2d)       (None, 128, 32, 32)          73856
ZeroPadding2D (zeropadding2d)       (None, 128, 34, 34)          0
Convolution2D (convolution2d)       (None, 32, 16, 16)           36896
Flatten (flatten)                   (None, 8192)                 0
Dense (dense)                       (None, 1024)                 8389632
Dense (dense)                       (None, 1024)                 1049600
Dense (dense)                       (None, 8192)                 8396800
Reshape (reshape)                   (None, 32, 16, 16)           0
Deconvolution2D (deconvolution2d)   (None, 128, 16, 16)          0
UpSampling2D (upsampling2d)         (None, 128, 32, 32)          0
Deconvolution2D (deconvolution2d)   (None, 64, 32, 32)           0
UpSampling2D (upsampling2d)         (None, 64, 64, 64)           0
Deconvolution2D (deconvolution2d)   (None, 1, 64, 64)            0
--------------------------------------------------------------------------------
Total params: 17951456
--------------------------------------------------------------------------------
model.predict(input).shape == (1, 8, 64, 64)

# Imports needed to run this snippet (old Keras 0.x/1.x API); Deconvolution2D
# is the custom tied-weight layer shipped with this repo (module name assumed).
from keras.models import Sequential
from keras.layers.core import Dense, Flatten, Reshape
from keras.layers.convolutional import Convolution2D, ZeroPadding2D, UpSampling2D
from autoencoder_layers import Deconvolution2D

model_conv = Sequential()

# Encoder: three padded convolutions, 8 -> 64 -> 128 -> 32 channels
model_conv.add(ZeroPadding2D(padding=(1, 1), input_shape=(8, 64, 64)))
model_conv.add(Convolution2D(64, 3, 3,
                             subsample=(1, 1),
                             border_mode='valid',
                             activation='relu'))
model_conv.add(ZeroPadding2D(padding=(1, 1)))
model_conv.add(Convolution2D(128, 3, 3,
                             subsample=(2, 2),
                             border_mode='valid',
                             activation='relu'))
model_conv.add(ZeroPadding2D(padding=(1, 1)))
model_conv.add(Convolution2D(32, 3, 3,
                             subsample=(2, 2),
                             border_mode='valid',
                             activation='relu'))

# Dense bottleneck
model_conv.add(Flatten())
model_conv.add(Dense(1024, init='uniform', activation='relu'))
model_conv.add(Dense(1024, init='uniform', activation='relu'))
model_conv.add(Dense(32 * 16 * 16, init='uniform', activation='relu'))
model_conv.add(Reshape((32, 16, 16)))

# Decoder: deconvolutions tied to the encoder's conv layers (layers 5, 3, 1),
# with nb_out_channels=1 requested on the last one
model_conv.add(Deconvolution2D(model_conv.layers[5], border_mode="same", nb_out_channels=128))
model_conv.add(UpSampling2D())
model_conv.add(Deconvolution2D(model_conv.layers[3], border_mode="same", nb_out_channels=64))
model_conv.add(UpSampling2D())
model_conv.add(Deconvolution2D(model_conv.layers[1], border_mode="same", nb_out_channels=1))

model_conv.summary()
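
A quick way to reproduce the mismatch between the summary and the actual output (a sketch, not part of the original report; x is just a random batch of the right shape):

import numpy as np

model_conv.compile(optimizer='sgd', loss='mse')  # older Keras builds predict() at compile time
x = np.random.rand(1, 8, 64, 64).astype('float32')
print(model_conv.predict(x).shape)  # (1, 8, 64, 64): 8 channels, despite the summary's (None, 1, 64, 64)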
nanopony commented 8 years ago

@el3ment I am sorry for the delayed response.

Yes, Deconvolution2D ignores nb_out_channels, since it is essentially (to some extent) the inverse operation of the bound convolution: if the conv layer takes (8, N, N) and produces (64, M, M), its inverse must work the opposite way -- take (64, M, M) and produce (8, N, N).
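
To make the shape argument concrete, here is a minimal numpy sketch (illustrative only; it assumes the Theano dim ordering used in the summary above, where a Convolution2D kernel has shape (nb_filter, stack_size, rows, cols)). A tied deconvolution reuses the bound layer's kernel with its two channel axes swapped, so its output depth is fixed by the bound layer's input depth and nb_out_channels has no effect:

import numpy as np

# Kernel of the first encoder conv: 8 input channels -> 64 filters
W = np.zeros((64, 8, 3, 3))

# A tied deconvolution applies the same weights with the channel axes swapped
W_tied = W.transpose(1, 0, 2, 3)
print(W_tied.shape)  # (8, 64, 3, 3): 64 channels in, 8 channels out -- never 1

Under that constraint, one hypothetical workaround (not suggested in this thread) would be to keep the tied decoder as-is and append an untied 1x1 convolution, e.g. model_conv.add(Convolution2D(1, 1, 1)), to learn an 8-to-1 channel projection.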