Update (Aug 17, 2021): refactored the ACB code for much better readability. You may call switch_to_deploy on an ACB to convert it to the inference-time structure. If you use ACB in your own model, the conversion is as easy as
for m in your_model.modules():
    if hasattr(m, 'switch_to_deploy'):
        m.switch_to_deploy()  # fuses the training-time branches into a single conv in place
There is also runnable code for testing the equivalence in the main function of acnet/acb.py. Just run it with
python acnet/acb.py
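For your own model, an equivalence check along the same lines looks roughly like the following (a minimal sketch; build_your_model is a placeholder for your own constructor, not part of this repo):

```python
import torch

your_model = build_your_model()   # placeholder: any network whose ACBs implement switch_to_deploy
your_model.eval()                 # BN must be in eval mode for an exact match

x = torch.randn(1, 3, 224, 224)
with torch.no_grad():
    y_before = your_model(x)
    for m in your_model.modules():
        if hasattr(m, 'switch_to_deploy'):
            m.switch_to_deploy()
    y_after = your_model(x)

# The two outputs should agree up to floating-point error.
print((y_before - y_after).abs().max())
```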
ACNet v2 (Diverse Branch Block, DBB): Diverse Branch Block: Building a Convolution as an Inception-like Unit.
DBB (CVPR 2021) is a CNN component with higher performance than ACB and still no inference-time costs. Sometimes I call it ACNet v2 because "DBB" is 2 bits larger than "ACB" in ASCII (lol).
I would suggest you check the repo of DBB (https://github.com/DingXiaoH/DiverseBranchBlock). It also has an implementation of ACNet.
News:
Update: Updated the whole repo, including ImageNet training (with Distributed Data Parallel). The default learning rate schedules were changed to cosine annealing, which performed better on ImageNet. Changed the behavior of ACB when k > 3. It used to add 1x3 and 3x1 kernels onto 5x5, but now it uses 1x5 and 5x1.
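For reference, the cosine annealing schedule mentioned above can be set up in PyTorch roughly like this (a minimal sketch with placeholder hyperparameters and a placeholder training loop, not the exact training code of this repo):

```python
import torch

optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9, weight_decay=1e-4)
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=total_epochs)

for epoch in range(total_epochs):
    train_one_epoch(model, optimizer)  # placeholder for the actual training loop
    scheduler.step()                   # decay the learning rate along a cosine curve
```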
ICCV 2019 paper: ACNet: Strengthening the Kernel Skeletons for Powerful CNN via Asymmetric Convolution Blocks.
Other implementations:
This demo will show you how to train ACNets on ImageNet and CIFAR-10, convert the trained weights into the original architectures, and test the converted models.
About the environment:
Some results (Top-1 accuracy) reproduced on CIFAR-10 using the code in this repository (note that we add batch norm to the Cifar-quick and VGG baselines):
Model | Baseline | ACNet |
---|---|---|
Cifar-quick | 86.20 | 86.87 |
VGG | 93.99 | 94.54 |
ResNet-56 | 94.55 | 95.06 |
WRN-16-8 | 95.89 | 96.33 |
If it does not work on your specific model and dataset, based on my experience, I would suggest you
The experiments reported in the paper were performed using TensorFlow. However, the backbone of that code was refactored from the official TensorFlow benchmark (https://github.com/tensorflow/benchmarks/tree/master/scripts/tf_cnn_benchmarks), which was designed in pursuit of extreme speed, not readability.
Citation:
@InProceedings{Ding_2019_ICCV,
  author = {Ding, Xiaohan and Guo, Yuchen and Ding, Guiguang and Han, Jungong},
  title = {ACNet: Strengthening the Kernel Skeletons for Powerful CNN via Asymmetric Convolution Blocks},
  booktitle = {The IEEE International Conference on Computer Vision (ICCV)},
  month = {October},
  year = {2019}
}
As designing an appropriate Convolutional Neural Network (CNN) architecture for a given application usually involves heavy manual work or numerous GPU hours, the research community is soliciting architecture-neutral CNN structures that can be easily plugged into multiple mature architectures to improve the performance on real-world applications. We propose Asymmetric Convolution Block (ACB), an architecture-neutral structure as a CNN building block, which uses 1D asymmetric convolutions to strengthen the square convolution kernels. For an off-the-shelf architecture, we replace the standard square-kernel convolutional layers with ACBs to construct an Asymmetric Convolutional Network (ACNet), which can be trained to reach a higher level of accuracy. After training, we equivalently convert the ACNet back into the original architecture, thus requiring no extra computation. We have observed that ACNet can improve the performance of various models on CIFAR and ImageNet by a clear margin. Through further experiments, we attribute the effectiveness of ACB to its capability of enhancing the model's robustness to rotational distortions and strengthening the central skeleton parts of square convolution kernels.
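To make the "equivalent conversion" concrete, here is a minimal sketch of how a trained ACB can be folded into a single 3x3 conv: fuse each branch's BN into its kernel and bias, then add the fused 1x3 and 3x1 kernels onto the central row and column of the fused 3x3 kernel. The function names and argument layout below are illustrative, not the exact implementation in acnet/acb.py.

```python
import torch

def fuse_conv_bn(kernel, gamma, beta, mean, var, eps=1e-5):
    # Fold a BatchNorm that follows a conv into the conv itself:
    # BN(conv(x)) = conv'(x) + bias', with the scaling absorbed per output channel.
    std = (var + eps).sqrt()
    return kernel * (gamma / std).reshape(-1, 1, 1, 1), beta - mean * gamma / std

def convert_acb(square, hor, ver):
    # square, hor, ver: (kernel, gamma, beta, running_mean, running_var) of the
    # 3x3, 1x3 and 3x1 branches, respectively.
    k_sq, b_sq = fuse_conv_bn(*square)   # (out, in, 3, 3)
    k_hor, b_hor = fuse_conv_bn(*hor)    # (out, in, 1, 3)
    k_ver, b_ver = fuse_conv_bn(*ver)    # (out, in, 3, 1)
    k = k_sq.clone()
    k[:, :, 1:2, :] += k_hor             # add the horizontal kernel onto the central row
    k[:, :, :, 1:2] += k_ver             # add the vertical kernel onto the central column
    return k, b_sq + b_hor + b_ver       # a single 3x3 conv with bias, used at inference time
```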
Enter this directory.
Make a soft link to your ImageNet directory, which contains "train" and "val" directories.
ln -s YOUR_PATH_TO_IMAGENET imagenet_data
Set the environment variables. We use 8 GPUs with Distributed Data Parallel; fewer GPUs also work if you adjust CUDA_VISIBLE_DEVICES and --nproc_per_node accordingly.
export PYTHONPATH=.
export CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7
Train a ResNet-18 on ImageNet with Asymmetric Convolution Blocks. The code will automatically convert the trained weights to the original structure and test. The top-1 accuracy will be around 71.2%.
python -m torch.distributed.launch --nproc_per_node=8 acnet/do_acnet.py -a sres18 -b acb
Check the shape of weights in the converted model.
python3 display_hdf5.py acnet_exps/sres18_acb_train/finish_deploy.hdf5
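If you prefer to inspect the converted weights yourself, a minimal h5py sketch does the same kind of thing as display_hdf5.py (which is the repo's own script):

```python
import h5py

def show(name, obj):
    # visititems calls this for every group/dataset in the file; only datasets have shapes.
    if isinstance(obj, h5py.Dataset):
        print(name, obj.shape)

# Print the name and shape of every weight array in the converted model file.
with h5py.File('acnet_exps/sres18_acb_train/finish_deploy.hdf5', 'r') as f:
    f.visititems(show)
```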
Train a regular ResNet-18 on ImageNet as a baseline for comparison. The top-1 accuracy will be around 70.6%.
python -m torch.distributed.launch --nproc_per_node=8 acnet/do_acnet.py -a sres18 -b base
ResNet-34 and ResNet-50 are also provided in acnet/do_acnet.py; feel free to try them as well.
python -m torch.distributed.launch --nproc_per_node=8 acnet/do_acnet.py -a sres34 -b acb
python -m torch.distributed.launch --nproc_per_node=8 acnet/do_acnet.py -a sres50 -b acb
Enter this directory.
Make a soft link to your CIFAR-10 directory. If the dataset is not found in the directory, it will be automatically downloaded.
ln -s YOUR_PATH_TO_CIFAR cifar10_data
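The automatic download presumably goes through torchvision's CIFAR-10 dataset; if you want to pre-fetch the data yourself, a minimal sketch (transforms and data loaders omitted):

```python
from torchvision.datasets import CIFAR10

# download=True fetches CIFAR-10 into cifar10_data if it is not already there.
CIFAR10(root='cifar10_data', train=True, download=True)
CIFAR10(root='cifar10_data', train=False, download=True)
```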
Set the environment variables.
export PYTHONPATH=.
export CUDA_VISIBLE_DEVICES=0
Train the Cifar-quick ACNet. The code will automatically convert the trained weights to the original structure (acnet_exps/cfqkbnc_acb_train/finish_deploy.hdf5) and test. Then train a regular model as a baseline for comparison.
python acnet/do_acnet.py -a cfqkbnc -b acb
python acnet/do_acnet.py -a cfqkbnc -b base
Do the same on VGG.
python acnet/do_acnet.py -a vc -b acb
python acnet/do_acnet.py -a vc -b base
Do the same on ResNet-56.
python acnet/do_acnet.py -a src56 -b acb
python acnet/do_acnet.py -a src56 -b base
Do the same on WRN-16-8.
python acnet/do_acnet.py -a wrnc16plain -b acb
python acnet/do_acnet.py -a wrnc16plain -b base
Show the accuracy of all the models.
python show_log.py acnet_exps
xiaohding@gmail.com (The original Tsinghua mailbox dxh17@mails.tsinghua.edu.cn will expire in several months)
Google Scholar Profile: https://scholar.google.com/citations?user=CIjw0KoAAAAJ&hl=en
Homepage: https://dingxiaohan.xyz/
My open-sourced papers and repos:
The Structural Re-parameterization Universe:
RepLKNet (CVPR 2022) Powerful efficient architecture with very large kernels (31x31) and guidelines for using large kernels in modern CNNs\ Scaling Up Your Kernels to 31x31: Revisiting Large Kernel Design in CNNs\ code.
RepOptimizer (ICLR 2023) uses Gradient Re-parameterization to train powerful models efficiently. The training-time RepOpt-VGG is as simple as the inference-time model. It also addresses the problem of quantization.\ Re-parameterizing Your Optimizers rather than Architectures\ code.
RepVGG (CVPR 2021) A super simple and powerful VGG-style ConvNet architecture. Up to 84.16% ImageNet top-1 accuracy!\ RepVGG: Making VGG-style ConvNets Great Again\ code.
RepMLP (CVPR 2022) MLP-style building block and Architecture\ RepMLPNet: Hierarchical Vision MLP with Re-parameterized Locality\ code.
ResRep (ICCV 2021) State-of-the-art channel pruning (Res50, 55% FLOPs reduction, 76.15% acc)\ ResRep: Lossless CNN Pruning via Decoupling Remembering and Forgetting\ code.
ACB (ICCV 2019) is a CNN component without any inference-time costs. The first work of our Structural Re-parameterization Universe.\ ACNet: Strengthening the Kernel Skeletons for Powerful CNN via Asymmetric Convolution Blocks.\ code.
DBB (CVPR 2021) is a CNN component with higher performance than ACB and still no inference-time costs. Sometimes I call it ACNet v2 because "DBB" is 2 bits larger than "ACB" in ASCII (lol).\ Diverse Branch Block: Building a Convolution as an Inception-like Unit\ code.
Model compression and acceleration:
(CVPR 2019) Channel pruning: Centripetal SGD for Pruning Very Deep Convolutional Networks with Complicated Structure\ code
(ICML 2019) Channel pruning: Approximated Oracle Filter Pruning for Destructive CNN Width Optimization\ code
(NeurIPS 2019) Unstructured pruning: Global Sparse Momentum SGD for Pruning Very Deep Neural Networks\ code