Zardinality / TF-deformable-conv

Implementation of deformable convolution as an operation in tensorflow
Apache License 2.0
111 stars 30 forks

How to convert the mxnet code to your code #6

Open John1231983 opened 6 years ago

John1231983 commented 6 years ago

This is not a bug report, because the code works fine in TF 1.2 and cuDNN 5.1. In this question I want to ask how I can convert the mxnet code to use your implementation. As shown at line 678, we have

res5a_branch2b_offset_weight = mx.symbol.Variable('res5a_branch2b_offset_weight', lr_mult=1.0)
res5a_branch2b_offset_bias = mx.symbol.Variable('res5a_branch2b_offset_bias', lr_mult=2.0)
res5a_branch2b_offset = mx.symbol.Convolution(name='res5a_branch2b_offset', data = res5a_branch2a_relu,  num_filter=18, pad=(1, 1), kernel=(3, 3), stride=(1, 1),weight=res5a_branch2b_offset_weight, bias=res5a_branch2b_offset_bias)
res5a_branch2b = mx.contrib.symbol.DeformableConvolution(name='res5a_branch2b', data=res5a_branch2a_relu, offset=res5a_branch2b_offset,num_filter=512, pad=(2, 2), kernel=(3, 3), num_deformable_group=1, stride=(1, 1), dilate=(2, 2), no_bias=True)

How can I convert the four lines above using deform_conv_op.deform_conv_op? I have read demo.py and test_deform_conv.py, and this is my current conversion:

import tensorflow as tf
import tensorflow.contrib.layers as ly
from lib.deform_conv_op import deform_conv_op

res5a_branch2b_offset = ly.conv2d(res5a_branch2a_relu, num_outputs=18, kernel_size=3,
                                  stride=1, activation_fn=None, data_format='NHWC')
num_x = res5a_branch2a_relu.shape[self.channel_axis].value
res5a_branch2b_kernel = tf.get_variable('weights', shape=[3, 3, num_x, 512])
res5a_branch2b = deform_conv_op(res5a_branch2a_relu, filter=res5a_branch2b_kernel,
                                offset=res5a_branch2b_offset,
                                rates=[1, 2, 2, 1], padding="SAME", strides=[1, 1, 1, 1],
                                num_groups=1, deformable_group=1, name='%s/bottleneck_v1/conv2' % name)

Note that the conversion above uses NHWC order, and it is still missing the first two lines:

res5a_branch2b_offset_weight = mx.symbol.Variable('res5a_branch2b_offset_weight', lr_mult=1.0)
res5a_branch2b_offset_bias = mx.symbol.Variable('res5a_branch2b_offset_bias', lr_mult=2.0)

It also raises the following error:

ValueError: Deformconv requires the offset compatible with filter, but got: [4,64,64,18] for 'resnet_v1_101/block4/unit_1/bottleneck_v1/conv2' (op: 'DeformConvOp') with input shapes: [4,64,64,512], [3,3,512,512], [4,64,64,18].

John1231983 commented 6 years ago

I think the problem is the dimension order. The op needs NxCxHxW instead of TensorFlow's usual NxHxWxC. After obtaining the result of the deformable convolution in NxCxHxW, I need to transpose it back to NxHxWxC for the rest of the TensorFlow graph. Am I right?

Zardinality commented 6 years ago

It seems deform_conv_op.deform_conv_op assumes every input is in NCHW order; you could refer to the faster-rcnn version of deformable convolution over here. By the way, the script test_deform_conv.py shows a sample call to deform_conv_op with the full parameter list and actual shapes; I hope it helps you understand how this op works.
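If you need to keep the rest of the graph in NHWC, the round trip is just two transposes. A minimal shape sketch (numpy stands in for tf.transpose here, and the op call itself is elided; shapes are taken from the error message above):

```python
import numpy as np

# Round-trip sketch for wrapping an NCHW-only op inside an NHWC graph.
# In TensorFlow this would be tf.transpose(x, [0, 3, 1, 2]) going in and
# tf.transpose(y, [0, 2, 3, 1]) coming out; the index permutation is the
# same in numpy.
x_nhwc = np.zeros((4, 64, 64, 512))      # NHWC activations
offset_nhwc = np.zeros((4, 64, 64, 18))  # NHWC offsets, 2 * 3 * 3 channels

x_nchw = np.transpose(x_nhwc, (0, 3, 1, 2))            # -> (4, 512, 64, 64)
offset_nchw = np.transpose(offset_nhwc, (0, 3, 1, 2))  # -> (4, 18, 64, 64)

# ... feed x_nchw and offset_nchw to deform_conv_op here ...
y_nchw = np.zeros((4, 512, 64, 64))  # stand-in for the op's NCHW output

y_nhwc = np.transpose(y_nchw, (0, 2, 3, 1))  # back to (4, 64, 64, 512)
```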

John1231983 commented 6 years ago

Thanks for your help. I checked the code and confirmed it only supports the NCHW format; sorry, I had not read it before. As a second question, do you think it is necessary to set lr_mult=1.0 for the weight and lr_mult=2.0 for the bias? I did not find anything like that in your code:

res5b_branch2b_offset_weight = mx.symbol.Variable('res5b_branch2b_offset_weight', lr_mult=1.0)
res5b_branch2b_offset_bias = mx.symbol.Variable('res5b_branch2b_offset_bias', lr_mult=2.0)

Zardinality commented 6 years ago

@John1231983 I am not familiar with the mxnet conventions; what does lr_mult represent? A learning-rate multiplier, or a linear decay coefficient?

John1231983 commented 6 years ago

I guess this is learning rate multiplier for bias and weight.

Zardinality commented 6 years ago

@John1231983 I don't think I did. If you are referring to the code of tf_deform_net, you could either hack it in directly here, or check whether TensorFlow offers a flag when creating variables and mend it here.
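Another option, without touching the repo code, is to emulate lr_mult at the optimizer level by scaling each variable's gradient before applying it (just a sketch, not tested; the variable names below are hypothetical):

```python
# Sketch: emulate mxnet's lr_mult in TF by scaling gradients per variable
# before apply_gradients. Names here are hypothetical examples.
lr_mult = {
    'res5a_branch2b_offset/weights': 1.0,  # lr_mult=1.0 for the offset weight
    'res5a_branch2b_offset/biases': 2.0,   # lr_mult=2.0 for the offset bias
}

def scale_gradients(grads_and_vars, multipliers, default=1.0):
    """Scale each (gradient, variable-name) pair by its multiplier."""
    return [(g * multipliers.get(name, default), name)
            for g, name in grads_and_vars]

# In TF 1.x this plugs in as:
#   opt = tf.train.MomentumOptimizer(lr, 0.9)
#   grads_and_vars = opt.compute_gradients(loss)
#   scaled = [(g * lr_mult.get(v.op.name, 1.0), v) for g, v in grads_and_vars]
#   train_op = opt.apply_gradients(scaled)
```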

John1231983 commented 6 years ago

Thanks for your great direction. I have completed it and it works well. However, I have one question about the offset. In the original implementation, the offset branch does not include a rate/dilation parameter even though the deformable convolution has one. For example, you can see it in the Deeplab model:

res5c_branch2b_offset = mx.symbol.Convolution(name='res5c_branch2b_offset', data = res5c_branch2a_relu, num_filter=18, pad=(1, 1), kernel=(3, 3), stride=(1, 1), weight=res5c_branch2b_offset_weight, bias=res5c_branch2b_offset_bias)
res5c_branch2b = mx.contrib.symbol.DeformableConvolution(name='res5c_branch2b', data=res5c_branch2a_relu, offset=res5c_branch2b_offset,num_filter=512, pad=(2, 2), kernel=(3, 3), num_deformable_group=1,stride=(1, 1), dilate=(2, 2), no_bias=True)

However, in your Faster R-CNN implementation I found

(self.feed('res5a_branch2a_relu')
     .conv(3, 3, 72, 1, 1, biased=True, rate=2, relu=False, name='res5a_branch2b_offset', padding='SAME', initializer='zeros'))
(self.feed('res5a_branch2a_relu', 'res5a_branch2b_offset')
     .deform_conv(3, 3, 512, 1, 1, biased=False, rate=2, relu=False, num_deform_group=4, name='res5a_branch2b'))

It shows that the offset branch includes the rate to keep it consistent with its deformable convolution. Do we need to consider the rate in the offset?

Zardinality commented 6 years ago

@John1231983 I am not sure whether the rate in the offset stream is necessary. I remember setting all arguments according to the original implementation, so I did not have a particular reason for the rate. You might as well set it to 1 and see how it performs.

John1231983 commented 6 years ago

I think it is necessary. I guess the original authors included it because each convolution tap corresponds to one offset element. In other words, the offset controls where each tap of the deformed convolution samples. When we use dilation, the convolution taps are already sparse, so the offset predictor also needs the same sparse receptive field so that the two cooperate location by location.
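The alignment argument can be made concrete: with dilate=2 the 3x3 taps sit at {-2, 0, 2} x {-2, 0, 2} instead of {-1, 0, 1} x {-1, 0, 1}, so an offset branch with the same rate sees exactly the grid it is displacing. A small illustration (my own sketch, not code from the repo):

```python
import numpy as np

# Sampling grid of a k x k kernel with dilation `rate`: each returned
# (dy, dx) pair is the position of one tap relative to the kernel center.
def kernel_grid(k=3, rate=1):
    half = (k - 1) // 2
    coords = np.arange(-half, half + 1) * rate
    return [(int(dy), int(dx)) for dy in coords for dx in coords]

plain = kernel_grid(3, rate=1)    # taps at {-1, 0, 1} x {-1, 0, 1}
dilated = kernel_grid(3, rate=2)  # taps at {-2, 0, 2} x {-2, 0, 2}
```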