[CVPR 2024 Workshops] SERNet-Former: Semantic Segmentation by Efficient Residual Network with Attention-Boosting Gates and Attention-Fusion Networks

SERNet-Former

[![[CVPR 2024 Workshops] YouTube Video](https://img.shields.io/badge/CVPRW'24-YouTube-blue)](https://youtu.be/XXzMkotcdb4?feature=shared) [![CVPR 2024 Workshop](https://img.shields.io/badge/CVPR'24-Workshop-yellow)](https://equivision.github.io/index.html#papers) [![ArXiv paper](https://img.shields.io/badge/SERNetFormer-ArXiv-red)](https://doi.org/10.48550/arXiv.2401.15741) [![CVMI 2024](https://img.shields.io/badge/CVMI-2024-blue)](https://cvmi2024.iiita.ac.in/AcceptedPapers.php)

[CVMI 2024] SERNet-Former: Segmentation by Efficient-ResNet with Attention-Boosting Gates and Attention-Fusion Networks

Tutorials

Various implementations of SERNet-Former with different baselines for multi-tasking are now online.

The example deploys a ViT_h_14 baseline with the `IMAGENET1K_SWAG_E2E_V1` weights and a simple U-Net decoder architecture. (Open In Colab)
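For orientation, here is a minimal sketch of how such a baseline could be loaded, assuming the weights come from the torchvision model zoo (the weight name above matches torchvision's naming). The helper name `load_vit_h14_backbone` is hypothetical and not part of this repository. The sketch covers only the backbone; the U-Net decoder is left to the Colab tutorial.

```python
# Weight identifier quoted in the text above.
WEIGHTS_NAME = "IMAGENET1K_SWAG_E2E_V1"


def load_vit_h14_backbone(pretrained: bool = True):
    """Return a torchvision ViT-H/14 model in eval mode.

    Hypothetical helper for illustration; assumes torchvision is installed.
    """
    # Imported lazily because torch/torchvision are heavy dependencies.
    from torchvision.models import ViT_H_14_Weights, vit_h_14

    # Look up the weight enum member by name; None gives random initialisation.
    weights = ViT_H_14_Weights[WEIGHTS_NAME] if pretrained else None
    model = vit_h_14(weights=weights)
    model.eval()
    return model
```

Calling `load_vit_h14_backbone()` downloads a large checkpoint on first use. torchvision documents a 518×518 input resolution for these weights, so inputs would need to be resized accordingly.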

Please also see the tutorials for:

- Image Segmentation based on the DeepLabV3+_ResNet101 baseline (Open In Colab)
- Image Classification based on the ViT_h_14 baseline (Open In Colab)

SERNet-Former Conceptual

Efficient-ResNet

Figure 1:

(a) The Attention-boosting Gate (AbG) and Attention-boosting Module (AbM) are fused into the encoder.

(b) The Attention-fusion Network (AfN) is introduced into the decoder.

Experiment Results

CamVid Dataset

The breakdown of class accuracies on the CamVid dataset

| Model | Baseline Architecture | Building | Tree | Sky | Car | Sign | Road | Pedestrian | Fence | Pole | Sidewalk | Bicycle | mIoU |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| SERNet-Former | Efficient-ResNet | 93.0 | 88.8 | 95.1 | 91.9 | 73.9 | 97.7 | 76.4 | 83.4 | 57.3 | 90.3 | 83.1 | 84.62 |
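As a quick sanity check on how the mIoU column relates to the per-class columns, the sketch below averages the eleven CamVid class scores copied from the row above. The function name `mean_iou` is illustrative only, and the plain unweighted mean is an assumption about how the summary figure was obtained.

```python
# Per-class scores (%) copied from the CamVid row above.
camvid_iou = {
    "Building": 93.0, "Tree": 88.8, "Sky": 95.1, "Car": 91.9,
    "Sign": 73.9, "Road": 97.7, "Pedestrian": 76.4, "Fence": 83.4,
    "Pole": 57.3, "Sidewalk": 90.3, "Bicycle": 83.1,
}


def mean_iou(per_class):
    """Unweighted mean over classes (assumed definition of mIoU here)."""
    return sum(per_class.values()) / len(per_class)


miou = mean_iou(camvid_iou)
print(f"{miou:.3f}")  # prints 84.627
```

Truncated to two decimals, this agrees with the 84.62 reported in the table.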

The experiment outcomes on the CamVid dataset (figure: camvid_output)

Cityscapes

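| Model | Baseline Architecture | road | sidewalk | building | wall | fence | pole | traffic light | traffic sign | vegetation | terrain | sky | person | rider | car | truck | bus | train | motorcycle | bicycle | mIoU |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| SERNet-Former | Efficient-ResNet | 98.2 | 90.2 | 94.0 | 67.6 | 68.2 | 73.6 | 78.2 | 82.1 | 94.6 | 75.9 | 96.9 | 90.0 | 77.7 | 96.9 | 86.1 | 93.9 | 91.7 | 70.0 | 82.9 | 84.83 |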

The experiment outcomes on the Cityscapes dataset (figure: cityscapes_output)

Installation Support

You can download this repository into your environment by running:

git clone https://github.com/serdarch/SERNet-Former.git

Citations

@article{Erisen2024SERNetFormer,
  title={SERNet-Former: Semantic Segmentation by Efficient Residual Network with Attention-Boosting Gates and Attention-Fusion Networks},
  author={Erişen, Serdar},
  journal={arXiv preprint arXiv:2401.15741},
  year={2024}
}

@inproceedings{Erisen2024CVPRW,
  title={SERNet-Former: Semantic Segmentation by Efficient Residual Network with Attention-Boosting Gates and Attention-Fusion Networks},
  author={Erişen, Serdar},
  booktitle={CVPRW},
  year={2024},
}