PaddlePaddle / PaddleDetection

Object Detection toolkit based on PaddlePaddle. It supports object detection, instance segmentation, multiple object tracking and real-time multi-person keypoint detection.
Apache License 2.0

[BUG] The number of output channels does not match the class declaration in the darknet module #5018

Open Xinjun-Wu opened 2 years ago

Xinjun-Wu commented 2 years ago

The number of output channels does not match the class declaration in the darknet module.

LOCATION:

PaddleDetection/ppdet/modeling/backbones/darknet.py >>> class BasicBlock(nn.Layer)

VERSION

release 2.3

Describe the bug

Below is the definition of BasicBlock:

class BasicBlock(nn.Layer):
    def __init__(self,
                 ch_in,
                 ch_out,
                 norm_type='bn',
                 norm_decay=0.,
                 freeze_norm=False,
                 data_format='NCHW'):
        super(BasicBlock, self).__init__()

        self.conv1 = ConvBNLayer(
            ch_in=ch_in,
            ch_out=ch_out,
            filter_size=1,
            stride=1,
            padding=0,
            norm_type=norm_type,
            norm_decay=norm_decay,
            freeze_norm=freeze_norm,
            data_format=data_format)
        self.conv2 = ConvBNLayer(
            ch_in=ch_out,
            ch_out=ch_out * 2,
            filter_size=3,
            stride=1,
            padding=1,
            norm_type=norm_type,
            norm_decay=norm_decay,
            freeze_norm=freeze_norm,
            data_format=data_format)

    def forward(self, inputs):
        conv1 = self.conv1(inputs)
        conv2 = self.conv2(conv1)
        out = paddle.add(x=inputs, y=conv2)
        return out

Suppose I have an input tensor with shape [5,10,9,9] and a BasicBlock instance named block with ch_in=10, ch_out=5. I would expect an output with shape [5,5,9,9], but the actual output is a tensor with shape [5,10,9,9]. Below is the code I tested in a Jupyter notebook:

import os
import paddle
os.chdir("/home/PaddleDetection")
from ppdet.modeling.backbones.darknet import BasicBlock

input = paddle.rand([5,10,9,9])
block = BasicBlock(10,5)
output = block(input)
print(f"input shape : {input.shape}\n")
print(f"layer: \n{block}\n")
print(f"output shape : {output.shape}")

The printed result is:

input shape : [5, 10, 9, 9]

layer: 
BasicBlock(
  (conv1): ConvBNLayer(
    (conv): Conv2D(10, 5, kernel_size=[1, 1], data_format=NCHW)
    (batch_norm): BatchNorm2D(num_features=5, momentum=0.9, epsilon=1e-05)
  )
  (conv2): ConvBNLayer(
    (conv): Conv2D(5, 10, kernel_size=[3, 3], padding=1, data_format=NCHW)
    (batch_norm): BatchNorm2D(num_features=10, momentum=0.9, epsilon=1e-05)
  )
)

output shape : [5, 10, 9, 9]
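
The shapes above follow from simple channel arithmetic. As a minimal sketch (plain Python, no Paddle required), the hypothetical helper `basic_block_channels` below traces the channel counts through the BasicBlock definition quoted earlier. It assumes no broadcasting against a size-1 channel dimension, so it is an approximation of what `paddle.add` accepts, not a replacement for it.

```python
def basic_block_channels(ch_in, ch_out):
    """Trace channel counts through BasicBlock as quoted in this issue."""
    conv1_out = ch_out        # conv1: ch_in -> ch_out (1x1 conv)
    conv2_out = conv1_out * 2  # conv2: ch_out -> ch_out * 2 (3x3 conv)
    # The residual add needs the block input and the conv2 output to agree
    # on the channel count (assumption: broadcasting is ignored here).
    if conv2_out != ch_in:
        raise ValueError(
            f"residual add mismatch: input has {ch_in} channels, "
            f"conv2 produces {conv2_out}")
    return conv2_out


for ch_in, ch_out in [(10, 5), (10, 10), (10, 20), (10, 6)]:
    try:
        print(ch_in, ch_out, "->", basic_block_channels(ch_in, ch_out))
    except ValueError as err:
        print(ch_in, ch_out, "->", err)
```

Under this reading, the block only runs when ch_in equals 2 * ch_out, and its output always has ch_in channels rather than ch_out.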

If I pass an input with shape [5,10,9,9] to a block with ch_in=10, ch_out=10 or to a block with ch_in=10, ch_out=20, a ValueError is raised:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
/tmp/ipykernel_17986/1117496369.py in <module>
      6 input = paddle.rand([5,10,9,9])
      7 block = BasicBlock(10,10)
----> 8 output = block(input)
      9 output.shape
     10 print(f"input shape : {input.shape}\n")

/usr/local/lib/python3.7/dist-packages/paddle/fluid/dygraph/layers.py in __call__(self, *inputs, **kwargs)
    912                 self._built = True
    913 
--> 914             outputs = self.forward(*inputs, **kwargs)
    915 
    916             for forward_post_hook in self._forward_post_hooks.values():

/home/PaddleDetection/ppdet/modeling/backbones/darknet.py in forward(self, inputs)
    174         conv1 = self.conv1(inputs)
    175         conv2 = self.conv2(conv1)
--> 176         out = paddle.add(x=inputs, y=conv2)
    177         return out
    178 

/usr/local/lib/python3.7/dist-packages/paddle/tensor/math.py in add(x, y, name)
    238 
    239     if in_dygraph_mode():
--> 240         return _C_ops.elementwise_add(x, y)
    241 
    242     return _elementwise_op(LayerHelper('elementwise_add', **locals()))

ValueError: (InvalidArgument) Broadcast dimension mismatch. Operands could not be broadcast together with the shape of X = [5, 10, 9, 9] and the shape of Y = [5, 20, 9, 9]. Received [10] in X is not equal to [20] in Y at i:1.
  [Hint: Expected x_dims_array[i] == y_dims_array[i] || x_dims_array[i] <= 1 || y_dims_array[i] <= 1 == true, but received x_dims_array[i] == y_dims_array[i] || x_dims_array[i] <= 1 || y_dims_array[i] <= 1:0 != true:1.] (at /paddle/paddle/fluid/operators/elementwise/elementwise_op_function.h:240)
  [operator < elementwise_add > error]

and

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
/tmp/ipykernel_17986/3869671924.py in <module>
      6 input = paddle.rand([5,10,9,9])
      7 block = BasicBlock(10,20)
----> 8 output = block(input)
      9 print(f"input shape : {input.shape}\n")
     10 print(f"layer: \n{block}\n")

/usr/local/lib/python3.7/dist-packages/paddle/fluid/dygraph/layers.py in __call__(self, *inputs, **kwargs)
    912                 self._built = True
    913 
--> 914             outputs = self.forward(*inputs, **kwargs)
    915 
    916             for forward_post_hook in self._forward_post_hooks.values():

/home/PaddleDetection/ppdet/modeling/backbones/darknet.py in forward(self, inputs)
    174         conv1 = self.conv1(inputs)
    175         conv2 = self.conv2(conv1)
--> 176         out = paddle.add(x=inputs, y=conv2)
    177         return out
    178 

/usr/local/lib/python3.7/dist-packages/paddle/tensor/math.py in add(x, y, name)
    238 
    239     if in_dygraph_mode():
--> 240         return _C_ops.elementwise_add(x, y)
    241 
    242     return _elementwise_op(LayerHelper('elementwise_add', **locals()))

ValueError: (InvalidArgument) Broadcast dimension mismatch. Operands could not be broadcast together with the shape of X = [5, 10, 9, 9] and the shape of Y = [5, 40, 9, 9]. Received [10] in X is not equal to [40] in Y at i:1.
  [Hint: Expected x_dims_array[i] == y_dims_array[i] || x_dims_array[i] <= 1 || y_dims_array[i] <= 1 == true, but received x_dims_array[i] == y_dims_array[i] || x_dims_array[i] <= 1 || y_dims_array[i] <= 1:0 != true:1.] (at /paddle/paddle/fluid/operators/elementwise/elementwise_op_function.h:240)
  [operator < elementwise_add > error]

I know this is a minor issue, but would you please fix it in the future? Thank you.

heavengate commented 2 years ago

Currently, ch_out configures the output channels of conv1, and conv2 outputs twice as many channels as conv1. If ch_out instead configured the output channels of conv2, the output channels of conv1 would be ch_out / 2. That would introduce a restriction that ch_out must be an even number, which is not user friendly.

Xinjun-Wu commented 2 years ago

@heavengate, thanks for explaining the design of the block from the user's point of view. However, there is still an implicit restriction: a user can only initialize a block whose ch_out equals half of ch_in. This results from the implementation of conv2 and the forward function. As you can see, conv2 doubles the number of channels:

        self.conv2 = ConvBNLayer(
            ch_in=ch_out,
            ch_out=ch_out * 2,
            filter_size=3,
            stride=1,
            padding=1,
            norm_type=norm_type,
            norm_decay=norm_decay,
            freeze_norm=freeze_norm,
            data_format=data_format)

and the output of conv2 must have the same shape as the input of conv1, because forward uses the paddle.add operator rather than the paddle.concat operator:

    def forward(self, inputs):
        conv1 = self.conv1(inputs)
        conv2 = self.conv2(conv1)
        out = paddle.add(x=inputs, y=conv2)  # inputs and conv2 must have the same shape
        return out
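
To make the add versus concat distinction concrete, here is a small shape-only sketch in plain Python. The helpers `add_shape` and `concat_shape` are hypothetical and simplified: they handle only same-rank shapes and model the rule from the error hint (dimensions must match, or one of them must be 1). They are not Paddle APIs.

```python
def add_shape(x_shape, y_shape):
    """Result shape of an elementwise add for two same-rank shapes."""
    if len(x_shape) != len(y_shape):
        raise ValueError("this sketch only handles same-rank shapes")
    out = []
    for axis, (a, b) in enumerate(zip(x_shape, y_shape)):
        # Each pair of dims must be equal, or one of them must be 1.
        if a != b and a != 1 and b != 1:
            raise ValueError(f"cannot broadcast {a} with {b} at axis {axis}")
        out.append(max(a, b))
    return out


def concat_shape(x_shape, y_shape, axis=1):
    """Result shape of concatenating two same-rank shapes along `axis`."""
    if len(x_shape) != len(y_shape):
        raise ValueError("this sketch only handles same-rank shapes")
    for i, (a, b) in enumerate(zip(x_shape, y_shape)):
        # Every axis except the concat axis must match exactly.
        if i != axis and a != b:
            raise ValueError(f"dims differ at non-concat axis {i}")
    out = list(x_shape)
    out[axis] = x_shape[axis] + y_shape[axis]
    return out


print(add_shape([5, 10, 9, 9], [5, 10, 9, 9]))     # [5, 10, 9, 9]
print(concat_shape([5, 10, 9, 9], [5, 12, 9, 9]))  # [5, 22, 9, 9]
```

An add keeps the channel count and therefore demands matching channels, while a concat along the channel axis would accept any pair of channel counts and sum them.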

So if I set ch_in = 10, ch_out = 6, the add operator raises a ValueError:

import os
import paddle
os.chdir("/home/PaddleDetection")
from ppdet.modeling.backbones.darknet import BasicBlock

input = paddle.rand([5,10,9,9])
block = BasicBlock(10,6)
output = block(input)
print(f"input shape : {input.shape}\n")
print(f"layer: \n{block}\n")
print(f"output shape : {output.shape}")

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
/tmp/ipykernel_17986/3074278531.py in <module>
      6 input = paddle.rand([5,10,9,9])
      7 block = BasicBlock(10,6)
----> 8 output = block(input)
      9 print(f"input shape : {input.shape}\n")
     10 print(f"layer: \n{block}\n")

/usr/local/lib/python3.7/dist-packages/paddle/fluid/dygraph/layers.py in __call__(self, *inputs, **kwargs)
    912                 self._built = True
    913 
--> 914             outputs = self.forward(*inputs, **kwargs)
    915 
    916             for forward_post_hook in self._forward_post_hooks.values():

/home/PaddleDetection/ppdet/modeling/backbones/darknet.py in forward(self, inputs)
    174         conv1 = self.conv1(inputs)
    175         conv2 = self.conv2(conv1)
--> 176         out = paddle.add(x=inputs, y=conv2)
    177         return out
    178 

/usr/local/lib/python3.7/dist-packages/paddle/tensor/math.py in add(x, y, name)
    238 
    239     if in_dygraph_mode():
--> 240         return _C_ops.elementwise_add(x, y)
    241 
    242     return _elementwise_op(LayerHelper('elementwise_add', **locals()))

ValueError: (InvalidArgument) Broadcast dimension mismatch. Operands could not be broadcast together with the shape of X = [5, 10, 9, 9] and the shape of Y = [5, 12, 9, 9]. Received [10] in X is not equal to [12] in Y at i:1.
  [Hint: Expected x_dims_array[i] == y_dims_array[i] || x_dims_array[i] <= 1 || y_dims_array[i] <= 1 == true, but received x_dims_array[i] == y_dims_array[i] || x_dims_array[i] <= 1 || y_dims_array[i] <= 1:0 != true:1.] (at /paddle/paddle/fluid/operators/elementwise/elementwise_op_function.h:240)
  [operator < elementwise_add > error]

Even though the ResNet block design requires x and f(f(x)) to have the same shape in y = x + f(f(x)), I think it would be better to set ch_in and ch_out to the same even value and adjust the channel arguments of conv1 and conv2 accordingly, which may be more user friendly. *_^
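
The suggestion above (one even channel count shared by the block's input and output) could look roughly like the following. This is only a sketch: `residual_block_plan` and its `channels` argument are hypothetical names, not part of PaddleDetection, and the function just returns the arguments one might pass to the two ConvBNLayer instances rather than building any layers.

```python
def residual_block_plan(channels):
    """Plan conv1/conv2 channels for a block whose input and output match.

    Hypothetical variant of BasicBlock: `channels` is both the input and the
    output channel count, and conv1 squeezes to half of it.
    """
    if channels <= 0 or channels % 2 != 0:
        # Fail at construction time with a clear message instead of a
        # broadcast error inside forward().
        raise ValueError(
            f"channels must be a positive even number, got {channels}")
    hidden = channels // 2
    return {
        "conv1": {"ch_in": channels, "ch_out": hidden,
                  "filter_size": 1, "stride": 1, "padding": 0},
        "conv2": {"ch_in": hidden, "ch_out": channels,
                  "filter_size": 3, "stride": 1, "padding": 1},
    }


plan = residual_block_plan(10)
print(plan["conv1"]["ch_out"], plan["conv2"]["ch_out"])  # 5 10
```

The trade-off is the one raised earlier in this thread: the channel count must be even, but the requirement is checked up front and the declared value matches the actual output shape.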