Open dedoogong opened 7 years ago
Hi,
As for 1 in your custom layer you should
As for 2: I think that "abstract" here is supposed to mean "non-standard convolution" rather than abstract in the object-oriented programming sense.
As for 3: I have never tried these before, but maybe you could try the standard Glorot initialization?
Marek
Hi, heavy questioner is back.
Today I tried to apply the popular separable depthwise convolution introduced in Theano 0.10.0 (but not available in Lasagne) to DAN, but as I'm a newbie in Theano, I got lost...
Below is my custom layer code for "SeparableDepthWiseConvolutionLayer":
[ SeparableDepthWiseConvolutionLayer.py ]
Then I replaced all of the existing Lasagne conv + batchnorm layers in createCNN() with the new layer, as below:
[ DANtraining.py ]
    from SepDWConvLayer import SeparableDepthWiseConvolutionLayer as SPDWConv
    ...
    def createCNN(self):
        ...
        net['s1_conv1_1'] = batch_norm(Conv2DLayer(net['input'], 64, 3, pad='same', W=GlorotUniform('relu')))
        ->>
        net['s1_conv1_1'] = batch_norm(SPDWConv(net['s1_conv0_1'], net['s1_conv0_1'].output_shape, 32, 64, [3, 3], stride=[1, 1]))

        net['s1_conv1_2'] = batch_norm(Conv2DLayer(net['s1_conv1_1'], 64, 3, pad='same', W=GlorotUniform('relu')))
        ->>
        net['s1_conv1_2'] = batch_norm(SPDWConv(net['s1_conv1_1'], net['s1_conv1_1'].output_shape, 64, 128, [3, 3], stride=[2, 2]))
        ... and so on.
So createCNN() now looks like:
My questions are:
I'm referencing the Theano test code "test_abstract_conv.py": https://github.com/Theano/Theano/blob/8dccbe6e1000239f57006e556fe8f737bb717aba/theano/tensor/nnet/tests/test_abstract_conv.py
There is a test_interface2d(self) method at line 1683, and it runs the op on real NumPy array values for the input and the depthwise/pointwise filters:
    self.x = np.array([[[[1, 2, 3, 4, 5], [3, 2, 1, 4, 5], [3, 3, 1, 3, 6], [5, 3, 2, 1, 1], [4, 7, 1, 2, 1]],
                        [[3, 3, 1, 2, 6], [6, 5, 4, 3, 1], [3, 4, 5, 2, 3], [6, 4, 1, 3, 4], [2, 3, 4, 2, 5]]]]).astype(theano.config.floatX)
    self.depthwise_filter = np.array([[[[3, 2, 1], [5, 3, 2], [6, 4, 2]]],
                                      [[[5, 5, 2], [3, 7, 4], [3, 5, 4]]],
                                      [[[7, 4, 7], [5, 3, 3], [1, 3, 1]]],
                                      [[[4, 4, 4], [2, 4, 6], [0, 0, 7]]]]).astype(theano.config.floatX)
    self.pointwise_filter = np.array([[[[4]], [[1]], [[3]], [[5]]],
                                      [[[2]], [[1]], [[2]], [[8]]]]).astype(theano.config.floatX)
    x_sym = theano.tensor.tensor4('x')
    dfilter_sym = theano.tensor.tensor4('d')
    pfilter_sym = theano.tensor.tensor4('p')
    sep_op = separable_conv2d(x_sym, dfilter_sym, pfilter_sym, self.x.shape[1])
    fun = theano.function([x_sym, dfilter_sym, pfilter_sym], sep_op, mode='FAST_RUN')
    top = fun(self.x, self.depthwise_filter, self.pointwise_filter)
But in my code, I'm passing "input" (is it a TensorVariable?) to the theano.function:

    def get_output_for(self, input, **kwargs):
        x_sym = theano.tensor.tensor4('x')
        dfilter_sym = theano.tensor.tensor4('d')
        pfilter_sym = theano.tensor.tensor4('p')
        sep_op = separable_conv2d(x_sym, dfilter_sym, pfilter_sym, ...)
        fun = theano.function([x_sym, dfilter_sym, pfilter_sym], sep_op, mode='FAST_RUN')
        output = fun(input, self.depthwise_filters, self.pointwise_filters)
        return output
Another try also failed:

    ...
    def get_output_for(self, input, **kwargs):
        x_sym = theano.tensor.tensor4('x_sym')
        dfilter_sym = theano.tensor.tensor4('dfilter_sym')
        pfilter_sym = theano.tensor.tensor4('pfilter_sym')
        sep_op = separable_conv2d(x_sym, dfilter_sym, pfilter_sym, ...)
        fun = theano.function([x_sym, dfilter_sym, pfilter_sym], sep_op, mode='FAST_RUN')
        output = fun(self.input, self.depthwise_filters, self.pointwise_filters)
        return output

Error:

    x_sym = theano.tensor.tensor4('x_sym')
    float() argument must be a string or a number

Is there a way to pass the real values to theano.function to avoid this error? I thought I should use symbols to build a graph for compilation.
Do you think Theano's new separable_conv2d op (which uses the Abstract2D class) can replace the existing Conv2DLayer as I did? To me, "Abstract" suggests that it only presents an interface, so the user should implement the actual method. But when I followed the Theano code, there seems to be an actual implementation of depthwise + pointwise conv in abstract_conv.py (https://github.com/Theano/Theano/blob/4d46e410bc765e9e288996c7da693146df69e3b9/theano/tensor/nnet/abstract_conv.py).
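For reference, here is a minimal NumPy sketch of what a depthwise-then-pointwise convolution computes, using the same filter layout as the test quoted above: depthwise filters of shape (channels * multiplier, 1, kh, kw) and pointwise filters of shape (out_channels, channels * multiplier, 1, 1). The function name `separable_conv2d_reference` is hypothetical and not part of Theano. The sketch assumes 'valid' borders, stride 1, flipped kernels (which should correspond to Theano's default filter_flip=True), and that consecutive depthwise filters belong to the same input channel. It is meant to illustrate the shapes, not to reproduce Theano's implementation.

```python
import numpy as np

def separable_conv2d_reference(x, depthwise, pointwise):
    """Naive depthwise + pointwise convolution (hypothetical helper).

    x:         (batch, channels, height, width)
    depthwise: (channels * multiplier, 1, kh, kw)
    pointwise: (out_channels, channels * multiplier, 1, 1)
    """
    batch, channels, height, width = x.shape
    dw_out, _, kh, kw = depthwise.shape
    multiplier = dw_out // channels
    out_h, out_w = height - kh + 1, width - kw + 1  # 'valid' borders, stride 1

    # Depthwise step: each filter looks at exactly one input channel.
    mid = np.zeros((batch, dw_out, out_h, out_w), dtype=x.dtype)
    for k in range(dw_out):
        src = k // multiplier                  # assumed channel grouping
        kernel = depthwise[k, 0, ::-1, ::-1]   # flipped kernel = true convolution
        for i in range(out_h):
            for j in range(out_w):
                patch = x[:, src, i:i + kh, j:j + kw]
                mid[:, k, i, j] = (patch * kernel).sum(axis=(1, 2))

    # Pointwise step: a 1x1 convolution that mixes the depthwise outputs.
    return np.einsum('ok,bkhw->bohw', pointwise[:, :, 0, 0], mid)

# Same shapes as the test data: 2 input channels, 4 depthwise filters, 2 outputs.
x = np.arange(1 * 2 * 5 * 5, dtype='float64').reshape(1, 2, 5, 5)
depthwise = np.ones((4, 1, 3, 3))
pointwise = np.ones((2, 4, 1, 1))
out = separable_conv2d_reference(x, depthwise, pointwise)
print(out.shape)  # (1, 2, 3, 3)
```

With 2 input channels and 4 depthwise filters the channel multiplier is 2, and the pointwise filters then reduce the 4 intermediate maps to 2 output channels.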
What method would you suggest for initializing the depthwise/pointwise weights?
Thank you in advance!
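As one possible starting point, here is a minimal NumPy sketch of Glorot (Xavier) uniform initialization applied to the two filter shapes. The helper `glorot_uniform` is hypothetical. It uses the first two dimensions of the shape as the unit counts and the remaining dimensions as the receptive field, which is believed to mirror how Lasagne's Glorot initializer treats 4D weights, and `gain=np.sqrt(2)` stands in for Lasagne's 'relu' gain. The shapes below are only illustrative (a 64-channel depthwise stage followed by 128 pointwise outputs). Whether this fan computation is the best choice for depthwise filters, where each filter sees a single channel, is an open question.

```python
import numpy as np

def glorot_uniform(shape, gain=1.0, rng=None):
    """Glorot-style uniform init for conv weights (hypothetical helper)."""
    rng = np.random.default_rng(0) if rng is None else rng
    n1, n2 = shape[:2]                          # first two dims as unit counts
    receptive_field = int(np.prod(shape[2:]))   # kh * kw for 4D conv weights
    std = gain * np.sqrt(2.0 / ((n1 + n2) * receptive_field))
    limit = std * np.sqrt(3.0)                  # uniform on [-limit, limit] has this std
    return rng.uniform(-limit, limit, size=shape)

# Illustrative shapes only.
depthwise = glorot_uniform((64, 1, 3, 3), gain=np.sqrt(2))
pointwise = glorot_uniform((128, 64, 1, 1), gain=np.sqrt(2))
print(depthwise.shape, pointwise.shape)
```

In a Lasagne layer the same effect would normally come from passing an initializer such as GlorotUniform('relu') to add_param rather than sampling by hand.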