MobileNetV2
Summary
inverted residual block: low-dimensional input => 1x1 convolution to expand to a high-dimensional space => 3x3 depthwise convolution => 1x1 convolution to project back to the low dimension; shortcut connection between input and output
depthwise separable convolution: first a convolution applied per channel (depthwise step), then a 1x1 convolution to combine information across channels (pointwise step)
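A minimal sketch of the two steps in PyTorch (the channel counts here are arbitrary, just for illustration):

import torch
import torch.nn as nn

# depthwise step: one 3x3 filter per input channel (groups == in_channels)
depthwise = nn.Conv2d(32, 32, kernel_size=3, padding=1, groups=32, bias=False)
# pointwise step: 1x1 convolution mixes information across channels
pointwise = nn.Conv2d(32, 64, kernel_size=1, bias=False)

x = torch.randn(1, 32, 56, 56)
out = pointwise(depthwise(x))  # shape: (1, 64, 56, 56)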
"linear bottleneck": (attach figure 3b) no relu on the bottleneck layer
Useful reference (page 2):
DenseNet
Summary
Architecture: a stack of DenseBlocks => within each DenseBlock, n DenseLayers that each connect to every following layer by concatenating their outputs
import torch
import torch.nn as nn

class DenseBlock(nn.Module):
    def forward(self, init_features):
        features = [init_features]
        for name, layer in self.named_children():  # the registered DenseLayers
            new_features = layer(*features)  # each layer sees all earlier feature maps
            features.append(new_features)
        return torch.cat(features, 1)  # concatenate along the channel dimension
Each DenseLayer is an inverted bottleneck similar to MobileNet: 1x1 conv to expand to a high dimension => 3x3 conv to project back down to growth_rate channels; could probably use a depthwise convolution to save some parameters
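A minimal sketch of one such DenseLayer that composes with the DenseBlock above (the BN-ReLU-conv ordering and the growth_rate=32, bn_size=4 defaults follow the paper's DenseNet-B setup):

import torch
import torch.nn as nn

class DenseLayer(nn.Module):
    def __init__(self, num_input_features, growth_rate=32, bn_size=4):
        super().__init__()
        self.norm1 = nn.BatchNorm2d(num_input_features)
        # 1x1 conv: expand to bn_size * growth_rate channels
        self.conv1 = nn.Conv2d(num_input_features, bn_size * growth_rate, kernel_size=1, bias=False)
        self.norm2 = nn.BatchNorm2d(bn_size * growth_rate)
        # 3x3 conv: project back down to growth_rate channels
        self.conv2 = nn.Conv2d(bn_size * growth_rate, growth_rate, kernel_size=3, padding=1, bias=False)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, *prev_features):
        x = torch.cat(prev_features, 1)  # all earlier outputs, as passed in by DenseBlock
        x = self.conv1(self.relu(self.norm1(x)))
        return self.conv2(self.relu(self.norm2(x)))

Usage: populate a DenseBlock by registering layers, so named_children() finds them:

block = DenseBlock()
num_features = 64
for i in range(4):
    block.add_module("denselayer%d" % (i + 1), DenseLayer(num_features))
    num_features += 32  # each layer contributes growth_rate new channels
out = block(torch.randn(1, 64, 56, 56))  # 64 + 4 * 32 = 192 output channels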
Transition layer between DenseBlocks for downsampling: 1x1 conv to reduce the number of filters, then 2x2 average pooling to downsample the spatial (x, y) dimensions
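A minimal sketch, assuming the torchvision-style BN-ReLU before the conv and the paper's default of halving the channel count:

import torch.nn as nn

class Transition(nn.Module):
    def __init__(self, in_channels, out_channels):
        super().__init__()
        self.norm = nn.BatchNorm2d(in_channels)
        self.relu = nn.ReLU(inplace=True)
        # 1x1 conv reduces the number of filters (typically out_channels = in_channels // 2)
        self.conv = nn.Conv2d(in_channels, out_channels, kernel_size=1, bias=False)
        # 2x2 average pooling halves the spatial dimensions
        self.pool = nn.AvgPool2d(kernel_size=2, stride=2)

    def forward(self, x):
        return self.pool(self.conv(self.relu(self.norm(x))))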