FcaNet: Frequency Channel Attention Networks

PyTorch implementation of the paper "FcaNet: Frequency Channel Attention Networks".


Simplest usage

Models pretrained on ImageNet can be loaded directly through torch.hub, without configuring or installing this repository:

import torch

# Load one of the four variants; pretrained=True requests the ImageNet weights.
model = torch.hub.load('cfzd/FcaNet', 'fca34', pretrained=True)
model = torch.hub.load('cfzd/FcaNet', 'fca50', pretrained=True)
model = torch.hub.load('cfzd/FcaNet', 'fca101', pretrained=True)
model = torch.hub.load('cfzd/FcaNet', 'fca152', pretrained=True)

Install

Please see INSTALL.md

Models

Classification models on ImageNet

Because the models were trained in FP16 and are provided in FP32, the evaluation results differ slightly (at most -0.06%/+0.05%) from the reported results.

Model	Reported	Evaluation Results	Link
FcaNet34	75.07	75.02	GoogleDrive/BaiduDrive(code:m7v8)
FcaNet50	78.52	78.57	GoogleDrive/BaiduDrive(code:mgkk)
FcaNet101	79.64	79.63	GoogleDrive/BaiduDrive(code:8t0j)
FcaNet152	80.08	80.02	GoogleDrive/BaiduDrive(code:5yeq)
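
The size of the gap quoted above can be checked directly from the table. The following is a minimal sketch with the numbers copied from the rows above; the variable names are chosen for illustration only.

```python
# Reported vs. re-evaluated top-1 accuracy (%), copied from the table above.
results = {
    "FcaNet34": (75.07, 75.02),
    "FcaNet50": (78.52, 78.57),
    "FcaNet101": (79.64, 79.63),
    "FcaNet152": (80.08, 80.02),
}

# Signed difference (evaluation minus reported), rounded to hide float noise.
diffs = {
    name: round(evaluated - reported, 2)
    for name, (reported, evaluated) in results.items()
}

for name, diff in diffs.items():
    print(f"{name}: {diff:+.2f}")

largest_drop = min(diffs.values())
largest_gain = max(diffs.values())
print("largest drop:", largest_drop)
print("largest gain:", largest_gain)
```

The largest drop and the largest gain computed this way correspond to the -0.06%/+0.05% bounds stated above.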

Detection and instance segmentation models on COCO

Model	Backbone	AP	AP50	AP75	Link
Faster RCNN	FcaNet50	39.0	61.1	42.3	GoogleDrive/BaiduDrive(code:q15c)
Faster RCNN	FcaNet101	41.2	63.3	44.6	GoogleDrive/BaiduDrive(code:pgnx)
Mask RCNN	Fca50 det	40.3	62.0	44.1	GoogleDrive/BaiduDrive(code:d9rn)
Mask RCNN	Fca50 seg	36.2	58.6	38.1	GoogleDrive/BaiduDrive(code:d9rn)

Training

Please see launch_training_classification.sh and launch_training_detection.sh for training on ImageNet and COCO, respectively.

Testing

Please see launch_eval_classification.sh and launch_eval_detection.sh for testing on ImageNet and COCO, respectively.

FAQ

Since the paper was uploaded to arXiv, many academic peers have asked us: the proposed DCT basis can be viewed as a simple tensor, so why not learn that tensor directly? Why use DCT instead of a learnable tensor? Wouldn't a learnable tensor be better than DCT?

Our concrete answer is: the proposed fixed DCT basis performs better than the learnable tensor, counter-intuitive as that may be.

Method	ImageNet Top-1 Acc	Link
Learnable tensor, random initialization	77.914	GoogleDrive/BaiduDrive(code:p2hl)
Learnable tensor, DCT initialization	78.352	GoogleDrive/BaiduDrive(code:txje)
Fixed tensor, random initialization	77.742	GoogleDrive/BaiduDrive(code:g5t9)
Fixed tensor, DCT initialization (Ours)	78.574	GoogleDrive/BaiduDrive(code:mgkk)

To verify these results, select the corresponding type of tensor in L73-L83 of model/layer.py, uncomment it, and train the whole network.
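
To make "a simple tensor" in the question above concrete, the following is a minimal sketch of a single 2D DCT basis filter. It is written from the standard unnormalized DCT-II definition, not copied from model/layer.py; the names `dct_filter` and `apply_filter` and the 7x7 size are assumptions made for illustration.

```python
import math


def dct_filter(u, v, height, width):
    """Return the (u, v) 2D DCT basis as a height x width nested list.

    Unnormalized DCT-II form (an assumption, constants omitted):
    cos(pi * u * (i + 0.5) / height) * cos(pi * v * (j + 0.5) / width)
    """
    return [
        [
            math.cos(math.pi * u * (i + 0.5) / height)
            * math.cos(math.pi * v * (j + 0.5) / width)
            for j in range(width)
        ]
        for i in range(height)
    ]


def apply_filter(feature_map, basis):
    """Weighted sum of one channel's spatial map with a basis: one scalar."""
    return sum(
        x * b
        for row_x, row_b in zip(feature_map, basis)
        for x, b in zip(row_x, row_b)
    )


# The lowest frequency (0, 0) is all ones, so applying it is a plain sum,
# i.e. global average pooling up to a constant factor.
lowest = dct_filter(0, 0, 7, 7)
feature_map = [[float(i * 7 + j) for j in range(7)] for i in range(7)]
pooled = apply_filter(feature_map, lowest)

# A higher frequency gives a different weighting of the same spatial map.
higher = dct_filter(1, 0, 7, 7)
```

In terms of the four variants compared above, the difference is whether such a tensor is filled from this formula or at random, and whether it is then updated during training or kept fixed.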

TODO

[x] Object detection models
[x] Instance segmentation models
[x] Fix the incorrect results of detection models
[ ] Make switching between configs easier
