casperkaae / parmesan

Variational and semi-supervised neural network toppings for Lasagne

Convolutional layers #25

Closed nebw closed 8 years ago

nebw commented 8 years ago

Are there plans to add convolutional layers to parmesan? I would like to reproduce the results of Rasmus et al. 2015, and parmesan has been very helpful in getting me started.

Is anyone working on this yet? Otherwise, I'd give it a try myself.

tabacof commented 8 years ago

I tried to implement a convolutional VAE, but it didn't improve my results on MNIST (compared to a 2-layer MLP).

One possible issue is that I'm using regular convolutions instead of deconvolutions in the decoder.

Does anyone know how to use a deconvolutional layer in Lasagne?

casperkaae commented 8 years ago

Hi

@nebw: You can just use the convolution layers from Lasagne to do this. I've done this myself and was able to get 18% on CIFAR10. However, the proper way would be to use proper deconvolution layers in the decoder part (see below).

@tabacof: As I wrote above, I was able to get around 18% on CIFAR10 with Lasagne's conv layers in both the encoder and decoder. You should use backward convolution / fractional striding to upscale the images, though I have not worked with that myself. Some of my collaborators used it in http://arxiv.org/pdf/1512.09300v1.pdf; I'll try to get them to chime in with more informed input on how to do this :)

Casper
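
For reference, one possible reading of "just use the convolution layers from Lasagne" is to upsample with Upscale2DLayer and refine with an ordinary stride-1 Conv2DLayer. The sketch below is purely illustrative (the layer names and sizes are made up) and is not necessarily the exact setup Casper used.

import lasagne
from lasagne.layers import InputLayer, Conv2DLayer, Upscale2DLayer, batch_norm

# Encoder: ordinary strided convolutions (downsampling).
l_in = InputLayer((None, 3, 32, 32))
l_enc = batch_norm(Conv2DLayer(l_in, num_filters=64, filter_size=3, stride=2, pad='same'))

# Decoder without a true deconvolution: nearest-neighbour upsampling
# followed by a stride-1 convolution to refine the upscaled feature maps.
l_up = Upscale2DLayer(l_enc, scale_factor=2)
l_dec = batch_norm(Conv2DLayer(l_up, num_filters=3, filter_size=3, stride=1, pad='same'))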

andersbll commented 8 years ago

I haven't used Theano for a long time, but Alec Radford's deconv() seems to do the job. That is, fractionally strided convolution / backward convolution.
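
If you would rather stay inside Theano's public API than copy the DCGAN code, newer Theano versions expose the same backward convolution as conv2d_grad_wrt_inputs. This is a minimal sketch assuming a Theano recent enough to have the abstract_conv module; all shapes are made up for illustration.

import numpy as np
import theano
import theano.tensor as T
from theano.tensor.nnet.abstract_conv import conv2d_grad_wrt_inputs

# A fractionally strided convolution is the gradient of a strided
# convolution w.r.t. its input, so it upsamples instead of downsampling.
x = T.tensor4('x')  # small feature maps: (batch, 16, 14, 14)
W = theano.shared(np.random.randn(16, 8, 5, 5).astype(theano.config.floatX))  # forward-conv filters: (out_ch, in_ch, rows, cols)

y = conv2d_grad_wrt_inputs(
    output_grad=x,                   # treat the small maps as the conv "output"
    filters=W,
    input_shape=(None, 8, 28, 28),   # desired upsampled shape
    filter_shape=(16, 8, 5, 5),
    border_mode=(2, 2),
    subsample=(2, 2))                # stride of the corresponding forward conv

f = theano.function([x], y)
print(f(np.zeros((1, 16, 14, 14), dtype=theano.config.floatX)).shape)  # (1, 8, 28, 28)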

nebw commented 8 years ago

@casperkaae: Would you mind posting your code for that? For some reason I wasn't able to get it to work with Lasagne's convolutional layers when I tried it recently. I'm sure I'll figure it out eventually, but if it's not too much effort for you I would greatly appreciate it.

casperkaae commented 8 years ago

Hi @nebw and @tabacof

Sorry for the late answer. I'm using Jan Schlüter's implementation of the backward convolution, which, by a somewhat unfortunate misnomer, is usually called deconvolution.

import lasagne
import theano.tensor as T


class Deconv2DLayer(lasagne.layers.Layer):
    """Transposed ("backward") convolution layer, after Jan Schlüter."""

    def __init__(self, incoming, num_filters, filter_size, stride=1, pad=0,
            nonlinearity=lasagne.nonlinearities.rectify, **kwargs):
        super(Deconv2DLayer, self).__init__(incoming, **kwargs)
        self.num_filters = num_filters
        self.filter_size = lasagne.utils.as_tuple(filter_size, 2, int)
        self.stride = lasagne.utils.as_tuple(stride, 2, int)
        self.pad = lasagne.utils.as_tuple(pad, 2, int)
        # W is shaped like the filters of the corresponding *forward*
        # convolution: (input channels of this layer, num_filters, rows, cols).
        self.W = self.add_param(lasagne.init.Orthogonal(),
                (self.input_shape[1], num_filters) + self.filter_size,
                name='W')
        self.b = self.add_param(lasagne.init.Constant(0),
                (num_filters,),
                name='b')
        if nonlinearity is None:
            nonlinearity = lasagne.nonlinearities.identity
        self.nonlinearity = nonlinearity

    def get_output_shape_for(self, input_shape):
        # Spatial size grows roughly by a factor of the stride.
        shape = tuple(i*s - 2*p + f - 1
                for i, s, p, f in zip(input_shape[2:],
                                      self.stride,
                                      self.pad,
                                      self.filter_size))
        return (input_shape[0], self.num_filters) + shape

    def get_output_for(self, input, **kwargs):
        # Apply the gradient of a strided convolution w.r.t. its inputs,
        # which upsamples the feature maps instead of downsampling them.
        op = T.nnet.abstract_conv.AbstractConv2d_gradInputs(
            imshp=self.output_shape,
            kshp=(self.input_shape[1], self.num_filters) + self.filter_size,
            subsample=self.stride, border_mode=self.pad)
        conved = op(self.W, input, self.output_shape[2:])
        if self.b is not None:
            conved += self.b.dimshuffle('x', 0, 'x', 'x')
        return self.nonlinearity(conved)
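
As a quick sanity check of the output-shape formula (the numbers here are made up for illustration), a 3x3 Deconv2DLayer with stride 2 and pad 1 doubles the spatial dimensions, which is what the decoder below relies on:

from lasagne.layers import InputLayer

# i*s - 2*p + f - 1 = 8*2 - 2*1 + 3 - 1 = 16
l = InputLayer((None, 128, 8, 8))
l = Deconv2DLayer(l, num_filters=64, filter_size=3, stride=2, pad=1)
print(l.output_shape)  # (None, 64, 16, 16)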

A Conv-VAE model is then

from lasagne.layers import (InputLayer, DimshuffleLayer, Conv2DLayer,
                            DenseLayer, ReshapeLayer, batch_norm)
from parmesan.layers import SampleLayer

# h, w, c, latent_size, sym_eq_samples and sym_iw_samples are assumed
# to be defined elsewhere.
l_in = InputLayer((None, h, w, c))
l_ds = DimshuffleLayer(l_in, (0, 3, 1, 2))  # to (batch, channels, rows, cols)

# encoder
l_conv1 = batch_norm(Conv2DLayer(l_ds, num_filters=64, filter_size=3, stride=2, pad='same'))
l_conv2 = batch_norm(Conv2DLayer(l_conv1, num_filters=128, filter_size=3, stride=2, pad='same'))
l_conv2code = batch_norm(DenseLayer(l_conv2, num_units=512))

l_mu = batch_norm(DenseLayer(l_conv2code, num_units=latent_size, nonlinearity=lasagne.nonlinearities.identity, name='ENC_MU'))
l_log_var = batch_norm(DenseLayer(l_conv2code, num_units=latent_size, nonlinearity=lasagne.nonlinearities.identity, name='ENC_LOG_VAR'))

# sample layer
l_z = SampleLayer(mean=l_mu, log_var=l_log_var, eq_samples=sym_eq_samples, iw_samples=sym_iw_samples)

# decoder
_, cd, hd, wd = l_conv2.output_shape
l_code2conv = batch_norm(DenseLayer(l_z, num_units=cd*hd*wd))
l_code2conv = ReshapeLayer(l_code2conv, (-1, cd, hd, wd))
l_deconv2 = batch_norm(Deconv2DLayer(l_code2conv, num_filters=128, filter_size=3, stride=2, pad=1))
l_deconv1 = batch_norm(Deconv2DLayer(l_deconv2, num_filters=64, filter_size=3, stride=2, pad=1))
l_dec_x_mu = Conv2DLayer(l_deconv1, num_filters=c, filter_size=3, stride=1, pad='same')
# alternative with a sigmoid output nonlinearity:
# l_dec_x_mu = Conv2DLayer(l_deconv1, num_filters=c, filter_size=3, stride=1, pad='same', nonlinearity=lasagne.nonlinearities.sigmoid)
l_dec_x_mu = DimshuffleLayer(l_dec_x_mu, (0, 2, 3, 1))  # back to (batch, rows, cols, channels)
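
To verify that the network wires up, one can compile a small forward pass. Everything below is a hedged sketch: the values for h, w, c and latent_size are hypothetical (CIFAR10-sized), and it assumes sym_eq_samples and sym_iw_samples were created as Theano iscalars before building the model.

import numpy as np
import theano
import theano.tensor as T
import lasagne

# Hypothetical settings used when building the model above:
# h = w = 32, c = 3, latent_size = 128,
# sym_eq_samples = T.iscalar(), sym_iw_samples = T.iscalar()
sym_x = T.tensor4('x')
x_mu = lasagne.layers.get_output(l_dec_x_mu, {l_in: sym_x}, deterministic=True)
f = theano.function([sym_x, sym_eq_samples, sym_iw_samples], x_mu.shape)
print(f(np.zeros((2, 32, 32, 3), dtype=theano.config.floatX), 1, 1))
# -> [2 32 32 3] when eq_samples = iw_samples = 1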