I tried to implement a convolutional VAE, but it didn't improve my results on MNIST (compared to a 2-layer MLP).
One possible issue is that I'm using regular convolutions instead of deconvolutions in the decoder.
Does anyone know how to use a deconvolutional layer in Lasagne?
Hi
@nebw: You can just use the convolution layers from Lasagne to do this. I've done this myself and was able to get 18% on the CIFAR-10 data. However, the proper way to do it would be to use actual deconvolution layers in the decoder part (see below).
@tabacof: As I wrote above, I was able to get around 18% on CIFAR-10 with Lasagne's conv layers in both the encoder and decoder. You should use backward convolution / fractional striding to upscale the images; however, I have not worked with that myself. Some of my collaborators used it in http://arxiv.org/pdf/1512.09300v1.pdf, and I'll try to get them to chime in with more informed input on how to do this. :)
Casper
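For concreteness, a minimal sketch of a decoder built only from standard Lasagne layers, using Upscale2DLayer for the upsampling, might look like the following. The image size, filter counts, and latent dimension are assumptions for illustration, not the configuration behind the 18% result mentioned above.

import lasagne
from lasagne.layers import InputLayer, DenseLayer, ReshapeLayer, Conv2DLayer, Upscale2DLayer

# hypothetical sizes: 32x32 RGB images (e.g. CIFAR-10) and a 128-dim latent code
latent_size = 128

l_z = InputLayer((None, latent_size))
# dense layer back up to an 8x8 feature map, then upscale + convolve twice
l_hid = DenseLayer(l_z, num_units=128 * 8 * 8)
l_hid = ReshapeLayer(l_hid, (-1, 128, 8, 8))
l_up1 = Upscale2DLayer(l_hid, 2)   # 8x8 -> 16x16 (nearest-neighbour repeat)
l_dec1 = Conv2DLayer(l_up1, num_filters=64, filter_size=3, pad='same')
l_up2 = Upscale2DLayer(l_dec1, 2)  # 16x16 -> 32x32
l_dec_x = Conv2DLayer(l_up2, num_filters=3, filter_size=3, pad='same',
                      nonlinearity=lasagne.nonlinearities.sigmoid)

print(lasagne.layers.get_output_shape(l_dec_x))  # (None, 3, 32, 32)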
I haven't used Theano for a long time, but Alec Radford's deconv() seems to do the job. That is, fractionally strided convolution / backward convolution.
@casperkaae: Would you mind posting your code for that? For some reason I wasn't able to get it to work with Lasagne's convolutional layers when I tried it recently. I'm sure I'll figure it out eventually, but if it's not too much effort for you, I would greatly appreciate it.
Hi @nebw and @tabacof
Sorry for the late answer. I'm using Jan Schlüter's implementation of the backward convolution, which somewhat misleadingly goes by the name deconvolution:
import lasagne
import theano.tensor as T

class Deconv2DLayer(lasagne.layers.Layer):
    """Backward convolution ("deconvolution") layer that upscales its input."""

    def __init__(self, incoming, num_filters, filter_size, stride=1, pad=0,
                 nonlinearity=lasagne.nonlinearities.rectify, **kwargs):
        super(Deconv2DLayer, self).__init__(incoming, **kwargs)
        self.num_filters = num_filters
        self.filter_size = lasagne.utils.as_tuple(filter_size, 2, int)
        self.stride = lasagne.utils.as_tuple(stride, 2, int)
        self.pad = lasagne.utils.as_tuple(pad, 2, int)
        # note the filter shape: (input channels, num_filters, h, w)
        self.W = self.add_param(lasagne.init.Orthogonal(),
                                (self.input_shape[1], num_filters) + self.filter_size,
                                name='W')
        self.b = self.add_param(lasagne.init.Constant(0),
                                (num_filters,),
                                name='b')
        if nonlinearity is None:
            nonlinearity = lasagne.nonlinearities.identity
        self.nonlinearity = nonlinearity

    def get_output_shape_for(self, input_shape):
        # inverse of the usual convolution output-shape formula
        shape = tuple(i * s - 2 * p + f - 1
                      for i, s, p, f in zip(input_shape[2:],
                                            self.stride,
                                            self.pad,
                                            self.filter_size))
        return (input_shape[0], self.num_filters) + shape

    def get_output_for(self, input, **kwargs):
        # the gradient of a convolution w.r.t. its inputs is exactly the
        # fractionally strided (backward) convolution we want here
        op = T.nnet.abstract_conv.AbstractConv2d_gradInputs(
            imshp=self.output_shape,
            kshp=(self.input_shape[1], self.num_filters) + self.filter_size,
            subsample=self.stride, border_mode=self.pad)
        conved = op(self.W, input, self.output_shape[2:])
        if self.b is not None:
            conved += self.b.dimshuffle('x', 0, 'x', 'x')
        return self.nonlinearity(conved)
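As a quick sanity check of the shape arithmetic (the 8x8, 128-channel input below is just a made-up example), the layer roughly doubles the spatial resolution with stride=2:

from lasagne.layers import InputLayer, get_output_shape

l_feat = InputLayer((None, 128, 8, 8))
l_up = Deconv2DLayer(l_feat, num_filters=64, filter_size=3, stride=2, pad=1)

# spatial size: i*s - 2*p + f - 1 = 8*2 - 2*1 + 3 - 1 = 16
print(get_output_shape(l_up))  # (None, 64, 16, 16)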
A Conv-VAE model is then:

# encoder
l_in = InputLayer((None, h, w, c))
l_ds = DimshuffleLayer(l_in, (0, 3, 1, 2))  # NHWC -> NCHW
l_conv1 = batch_norm(Conv2DLayer(l_ds, num_filters=64, filter_size=3, stride=2, pad='same'))
l_conv2 = batch_norm(Conv2DLayer(l_conv1, num_filters=128, filter_size=3, stride=2, pad='same'))
l_conv2code = batch_norm(DenseLayer(l_conv2, num_units=512))
l_mu = batch_norm(DenseLayer(l_conv2code, num_units=latent_size, nonlinearity=lasagne.nonlinearities.identity, name='ENC_MU'))
l_log_var = batch_norm(DenseLayer(l_conv2code, num_units=latent_size, nonlinearity=lasagne.nonlinearities.identity, name='ENC_LOG_VAR'))

# sample layer
l_z = SampleLayer(mean=l_mu, log_var=l_log_var, eq_samples=sym_eq_samples, iw_samples=sym_iw_samples)

# decoder
_, cd, hd, wd = l_conv2.output_shape
l_code2conv = batch_norm(DenseLayer(l_z, num_units=cd*hd*wd))
l_code2conv = ReshapeLayer(l_code2conv, (-1, cd, hd, wd))
l_deconv2 = batch_norm(Deconv2DLayer(l_code2conv, num_filters=128, filter_size=3, stride=2, pad=1))
l_deconv1 = batch_norm(Deconv2DLayer(l_deconv2, num_filters=64, filter_size=3, stride=2, pad=1))
l_dec_x_mu = Conv2DLayer(l_deconv1, num_filters=c, filter_size=3, stride=1, pad='same')
#l_dec_x_mu = Conv2DLayer(l_deconv1, num_filters=c, filter_size=3, stride=1, pad='same', nonlinearity=lasagne.nonlinearities.sigmoid)
l_dec_x_mu = DimshuffleLayer(l_dec_x_mu, (0, 2, 3, 1))  # NCHW -> NHWC
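One way to check that the decoder ends up back at the input resolution (assuming the model above has been built in the current scope) is to print every layer's output shape:

from lasagne.layers import get_all_layers, get_output_shape

for layer in get_all_layers(l_dec_x_mu):
    print(layer.__class__.__name__, get_output_shape(layer))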
Are there plans to add convolutional layers to parmesan? I would like to reproduce the results of Rasmus et al. (2015) and found parmesan very helpful for getting started.
Is anyone working on this yet? Otherwise, I'd give it a try myself.