XDhughie opened this issue 4 years ago
split previous layer (layers=-1) into two parts through channel (groups=2) and route the second part (group_id=1, id start from 0).
Hey, can I ask what the meaning of "channel" is here?
The size of a feature map is width × height × channel.
Oh I see, thanks. But how can we separate a layer into two parts? Can you please explain?
If width × height × channel is w × h × c, the feature map is fm[0:w, 0:h, 0:c]. groups=2, group_id=0 gets fm[0:w, 0:h, 0:c/2]; groups=2, group_id=1 gets fm[0:w, 0:h, c/2:c].
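For readers following along in PyTorch, here is a minimal sketch of that split (note PyTorch uses NCHW layout, so the channel split happens on dim=1; the tensor and its sizes are made up for illustration):

```python
import torch

# toy feature map in NCHW layout: batch=1, c=64, h=w=26
fm = torch.randn(1, 64, 26, 26)
c = fm.shape[1]

group0 = fm[:, :c // 2]   # groups=2, group_id=0 -> first c/2 channels
group1 = fm[:, c // 2:]   # groups=2, group_id=1 -> last c/2 channels

# torch.chunk performs the same split
assert torch.equal(group1, torch.chunk(fm, 2, dim=1)[1])
print(group0.shape, group1.shape)  # torch.Size([1, 32, 26, 26]) twice
```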
@WongKinYiu I don't understand why (visually, in Netron) the number of channels is 64. Shouldn't it be 32, because of c/2?
Maybe it is because darknet uses group_id but netron uses groups_id:
https://github.com/AlexeyAB/darknet/blob/master/src/parser.c#L1036
https://github.com/lutzroeder/netron/blob/master/src/darknet-metadata.json#L281
You can try modifying the code of netron and generating the graph again.
@WongKinYiu I just tried changing group_id to groups_id, but both produce the same result. I should open an issue in netron, I guess.
@WongKinYiu Hi, does yolov4-tiny only use group_id=1? Are that many channels thrown away?
No, the cross stage connection routes all of the channels of the base layer in yolov4-tiny.
@WongKinYiu I see. You mean that only this route layer uses half of the channels, but the previous layer, which uses all of the channels, is concatenated back in later? Is that right?
```python
import torch
import torch.nn as nn

class ResConv2dBatchLeaky(nn.Module):
    def __init__(self, in_channels, inter_channels, kernel_size, stride=1, leaky_slope=0.1, return_extra=False):
        super(ResConv2dBatchLeaky, self).__init__()
        self.return_extra = return_extra
        self.in_channels = in_channels
        self.inter_channels = inter_channels
        self.kernel_size = kernel_size
        self.stride = stride
        if isinstance(kernel_size, (list, tuple)):
            self.padding = [int(ii / 2) for ii in kernel_size]
        else:
            self.padding = int(kernel_size / 2)
        self.leaky_slope = leaky_slope
        self.layers0 = Conv2dBatchLeaky(self.in_channels // 2, self.inter_channels, self.kernel_size,
                                        self.stride, self.padding)
        self.layers1 = Conv2dBatchLeaky(self.inter_channels, self.inter_channels, self.kernel_size,
                                        self.stride, self.padding)
        self.layers2 = Conv2dBatchLeaky(self.in_channels, self.in_channels, 1, 1, 0)

    def forward(self, x):
        y0 = x                            # keep the full input for the cross stage connection
        channel = x.shape[1]
        x0 = x[:, channel // 2:, ...]     # second half of the channels (groups=2, group_id=1)
        x1 = self.layers0(x0)
        x2 = self.layers1(x1)
        x3 = torch.cat((x2, x1), dim=1)   # inner concat inside the computational block
        x4 = self.layers2(x3)
        x = torch.cat((y0, x4), dim=1)    # route all channels of the base layer back in
        if self.return_extra:
            return x, x4
        else:
            return x
```
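Conv2dBatchLeaky is not defined anywhere in the thread; below is a minimal sketch of what such a conv + batchnorm + leaky ReLU block could look like. The signature (in particular the optional padding argument defaulting to kernel_size // 2) is an assumption inferred from how it is called in the code above, not the actual vn_layer implementation:

```python
import torch
import torch.nn as nn

class Conv2dBatchLeaky(nn.Module):
    """Conv2d + BatchNorm2d + LeakyReLU (sketch; signature inferred from the calls above)."""
    def __init__(self, in_channels, out_channels, kernel_size, stride=1, padding=None, leaky_slope=0.1):
        super().__init__()
        if padding is None:
            padding = kernel_size // 2  # assumed "same" padding for the 4-argument calls
        self.layers = nn.Sequential(
            nn.Conv2d(in_channels, out_channels, kernel_size, stride, padding, bias=False),
            nn.BatchNorm2d(out_channels),
            nn.LeakyReLU(leaky_slope, inplace=True),
        )

    def forward(self, x):
        return self.layers(x)

# shape check: the CSP block doubles the channel count, e.g. 64 -> 128
block = ResConv2dBatchLeaky(64, 32, 3)
print(block(torch.randn(1, 64, 52, 52)).shape)  # torch.Size([1, 128, 52, 52])
```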
@WongKinYiu Is the whole model structure as follows? Is that right?
```python
import torch
import torch.nn as nn
from collections import OrderedDict

class TinyYolov4(nn.Module):
    def __init__(self, pretrained=False):
        super(TinyYolov4, self).__init__()
        # Network (vn_layer is the author's layer module; Conv2dBatchLeaky and
        # ResConv2dBatchLeaky are the blocks shown above)
        backbone = OrderedDict([
            ('0_convbatch', vn_layer.Conv2dBatchLeaky(3, 32, 3, 2)),
            ('1_convbatch', vn_layer.Conv2dBatchLeaky(32, 64, 3, 2)),
            ('2_convbatch', vn_layer.Conv2dBatchLeaky(64, 64, 3, 1)),
            ('3_resconvbatch', vn_layer.ResConv2dBatchLeaky(64, 32, 3, 1)),
            ('4_max', nn.MaxPool2d(2, 2)),
            ('5_convbatch', vn_layer.Conv2dBatchLeaky(128, 128, 3, 1)),
            ('6_resconvbatch', vn_layer.ResConv2dBatchLeaky(128, 64, 3, 1)),
            ('7_max', nn.MaxPool2d(2, 2)),
            ('8_convbatch', vn_layer.Conv2dBatchLeaky(256, 256, 3, 1)),
            ('9_resconvbatch', vn_layer.ResConv2dBatchLeaky(256, 128, 3, 1, return_extra=True)),
        ])
        head = [
            OrderedDict([
                ('10_max', nn.MaxPool2d(2, 2)),
                ('11_conv', vn_layer.Conv2dBatchLeaky(512, 512, 3, 1)),
                ('12_conv', vn_layer.Conv2dBatchLeaky(512, 256, 1, 1)),
            ]),
            OrderedDict([
                ('13_conv', vn_layer.Conv2dBatchLeaky(256, 512, 3, 1)),
                ('14_conv', nn.Conv2d(512, 3 * (5 + 80), 1)),
            ]),
            OrderedDict([
                ('15_convbatch', vn_layer.Conv2dBatchLeaky(256, 128, 1, 1)),
                ('16_upsample', nn.Upsample(scale_factor=2)),
            ]),
            OrderedDict([
                ('17_convbatch', vn_layer.Conv2dBatchLeaky(384, 256, 3, 1)),
                ('18_conv', nn.Conv2d(256, 3 * (5 + 80), 1)),
            ]),
        ]
        self.backbone = nn.Sequential(backbone)
        self.head = nn.ModuleList([nn.Sequential(layer_dict) for layer_dict in head])
        self.init_weights(pretrained)  # defined elsewhere in the author's codebase

    def forward(self, x):
        # '9_resconvbatch' has return_extra=True, so the backbone returns a tuple
        stem, extra_x = self.backbone(x)
        stage0 = self.head[0](stem)
        head0 = self.head[1](stage0)                  # stride-32 detection head
        stage1 = self.head[2](stage0)
        stage2 = torch.cat((stage1, extra_x), dim=1)  # concat with the extra backbone output
        head1 = self.head[3](stage2)                  # stride-16 detection head
        head = [head1, head0]
        return head
```
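Assuming the Conv2dBatchLeaky sketch above (and with the init_weights call stubbed out, since that helper comes from the author's codebase), a quick forward pass shows the expected head shapes for a 416×416 COCO-style input, where 3 × (5 + 80) = 255 output channels:

```python
model = TinyYolov4()
head1, head0 = model(torch.randn(1, 3, 416, 416))
print(head1.shape)  # torch.Size([1, 255, 26, 26]) -- stride 16
print(head0.shape)  # torch.Size([1, 255, 13, 13]) -- stride 32
```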
I think yes.
How come v4 tiny uses the route groups in its cfg file, but the full v4 cfg does not? It looks like they're both CSP-based, but I'm not sure why only v4 tiny uses the route groups. Is it just to increase speed by cutting the feature size in half?
groups and group_id were implemented after CSPNet was designed, so newer CSP models such as yolov4-tiny implement CSP using route groups.
Interesting. So if YOLOv4 had not been published until later, it would have used the route groups? And I assume the functionality would have been the same as it is currently, just a different implementation? Thanks.
Yes, the two implementations are equivalent.
yolov4 is based on resnet; it splits the channels in the base layer and removes the bottleneck of the res layers, so the two paths are {2x, 2x}. yolov4-tiny is based on vovnet; if we split the channels in the base layer too, the two paths would be {1x, 3x}, so we split inside the computational block instead to make them {2x, 2x}. This modification is for optimizing memory bandwidth.
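A rough PyTorch illustration of the two split styles being compared here; the 1×1 convolution stands in for how the cfgs from before route groups existed form the partial path, and the sizes and names are made up:

```python
import torch
import torch.nn as nn

x = torch.randn(1, 64, 26, 26)  # output of the base layer

# yolov4-tiny style: route groups slice the channels directly (no parameters)
part_route = x[:, x.shape[1] // 2:]  # [route] layers=-1 groups=2 group_id=1

# older CSP style (before route groups existed): a 1x1 convolution
# produces the partial path instead of a fixed slice
transition = nn.Conv2d(64, 32, kernel_size=1)
part_conv = transition(x)

print(part_route.shape, part_conv.shape)  # both torch.Size([1, 32, 26, 26])
```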
If width × height × channel is w × h × c, the feature map is fm[0:w, 0:h, 0:c]. groups=2, group_id=0 gets fm[0:w, 0:h, 0:c/2]; groups=2, group_id=1 gets fm[0:w, 0:h, c/2:c].
[route] layers=-1 groups=2 group_id=1
What does "groups=2 group_id=1" mean?