Shuai Yuan, Hanlin Qin, Xiang Yan, Naveed Akhtar, Ajmal Mian, IEEE Transactions on Geoscience and Remote Sensing, 2024.
https://www.bilibili.com/video/BV1kr421M7wx/
https://mp.weixin.qq.com/s/H7KLmtFX7j09f-Xc6X1FRw
If the implementation in this repo is helpful to you, please star it! ⭐⭐⭐
We present a Spatial-channel Cross Transformer Network (SCTransNet) for the infrared small target detection (IRSTD) task. Experiments on public datasets (e.g., SIRST, NUDT-SIRST, IRSTD-1K) demonstrate the effectiveness of our method. Our main contributions are as follows:
We propose SCTransNet, leveraging spatial-channel cross transformer blocks (SCTB) to predict the context of targets and backgrounds in the deeper network layers.
A spatial-embedded single-head channel-cross attention (SSCA) module is utilized to foster semantic interactions across all feature levels and learn the long-range context.
We devise a novel complementary feed-forward network (CFN) by crossing spatial-channel information to enhance the semantic difference between the target and background.
Our project has the following structure:
├──./datasets/
│ ├── IRSTD-1K
│ │ ├── images
│ │ │ ├── XDU0.png
│ │ │ ├── XDU1.png
│ │ │ ├── ...
│ │ ├── masks
│ │ │ ├── XDU0.png
│ │ │ ├── XDU1.png
│ │ │ ├── ...
│ │ ├── img_idx
│ │ │ ├── train_IRSTD-1K.txt
│ │ │ ├── test_IRSTD-1K.txt
│ ├── NUDT-SIRST
│ │ ├── images
│ │ │ ├── 000001.png
│ │ │ ├── 000002.png
│ │ │ ├── ...
│ │ ├── masks
│ │ │ ├── 000001.png
│ │ │ ├── 000002.png
│ │ │ ├── ...
│ │ ├── img_idx
│ │ │ ├── train_NUDT-SIRST.txt
│ │ │ ├── test_NUDT-SIRST.txt
│ ├── ...
│ ├── ...
│ ├── SIRST3
│ │ ├── images
│ │ │ ├── XDU0.png
│ │ │ ├── XDU1.png
│ │ │ ├── ...
│ │ ├── masks
│ │ │ ├── XDU0.png
│ │ │ ├── XDU1.png
│ │ │ ├── ...
│ │ ├── img_idx
│ │ │ ├── train_SIRST3.txt
│ │ │ ├── test_SIRST3.txt
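The layout above pairs each image with a mask of the same file name and lists the split membership under `img_idx`. A minimal sketch of how those pairs could be enumerated is shown below. The helper name `list_pairs` is hypothetical and not part of this repo, and the index file format (one sample name per line, with or without the `.png` extension) is an assumption.

```python
from pathlib import Path
from typing import List, Tuple


def list_pairs(root: str, dataset: str, split: str) -> List[Tuple[Path, Path]]:
    """Return (image, mask) path pairs for one split of one dataset.

    Hypothetical helper: assumes <root>/<dataset>/img_idx/<split>_<dataset>.txt
    holds one sample name per line.
    """
    base = Path(root) / dataset
    idx_file = base / "img_idx" / f"{split}_{dataset}.txt"

    pairs = []
    for line in idx_file.read_text().splitlines():
        name = line.strip()
        if not name:
            continue  # skip blank lines
        # Assumption: names may be listed with or without the extension.
        fname = name if name.endswith(".png") else name + ".png"
        pairs.append((base / "images" / fname, base / "masks" / fname))
    return pairs


# Example call (requires the datasets to be in place):
#   pairs = list_pairs("./datasets", "IRSTD-1K", "train")
```

Checking that every returned path exists before training is a cheap way to catch a misplaced dataset folder.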
python train.py
python test.py
Dataset | mIoU (×10⁻²) | nIoU (×10⁻²) | F-measure (×10⁻²) | Pd (×10⁻²) | Fa (×10⁻⁶) |
---|---|---|---|---|---|
SIRST | 77.50 | 81.08 | 87.32 | 96.95 | 13.92 |
NUDT-SIRST | 94.09 | 94.38 | 96.95 | 98.62 | 4.29 |
IRSTD-1K | 68.03 | 68.15 | 80.96 | 93.27 | 10.74 |
[Weights] |
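To make the scaling in the table concrete: an mIoU entry of 77.50 corresponds to a raw ratio of 0.7750. The sketch below computes mIoU and nIoU on binary masks under a convention that is common in IRSTD work, where mIoU pools intersection and union pixels over the whole test set and nIoU averages the per-image IoU. This is an assumption for illustration; the evaluation code in this repo may differ in details such as thresholding or the handling of empty masks, and the function names here are hypothetical.

```python
from typing import List, Sequence, Tuple

Mask = Sequence[Sequence[int]]  # binary mask given as rows of 0/1


def inter_union(pred: Mask, gt: Mask) -> Tuple[int, int]:
    """Count intersection and union pixels of two same-sized binary masks."""
    inter = union = 0
    for prow, grow in zip(pred, gt):
        for p, g in zip(prow, grow):
            inter += 1 if (p and g) else 0
            union += 1 if (p or g) else 0
    return inter, union


def miou_niou(preds: List[Mask], gts: List[Mask]) -> Tuple[float, float]:
    """Return (mIoU, nIoU) as raw ratios in [0, 1]."""
    total_i = total_u = 0
    per_image = []
    for pred, gt in zip(preds, gts):
        i, u = inter_union(pred, gt)
        total_i += i
        total_u += u
        # Assumption: an image with empty prediction and empty mask scores 1.0.
        per_image.append(i / u if u else 1.0)
    miou = total_i / total_u if total_u else 1.0  # pooled over all pixels
    niou = sum(per_image) / len(per_image)        # mean of per-image IoU
    return miou, niou


preds = [[[1, 1], [0, 0]], [[1, 0], [0, 0]]]
gts = [[[1, 0], [0, 0]], [[1, 0], [0, 0]]]
miou, niou = miou_niou(preds, gts)
# Multiply by 100 to express the values in the table's ×10⁻² units.
print(f"mIoU = {miou * 100:.2f}, nIoU = {niou * 100:.2f}")
```

The two numbers differ because pooling weights each image by its pixel count, while nIoU weights every image equally.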
*This code borrows heavily from IRSTD-Toolbox. Thanks to Xinyi Ying.
*This code borrows heavily from UCTransNet. Thanks to Haonan Wang.
*The overall repository style borrows heavily from DNA-Net. Thanks to Boyang Li.
If you find the code useful, please consider citing our paper using the following BibTeX entry.
@ARTICLE{10486932,
author={Yuan, Shuai and Qin, Hanlin and Yan, Xiang and Akhtar, Naveed and Mian, Ajmal},
journal={IEEE Transactions on Geoscience and Remote Sensing},
title={SCTransNet: Spatial-Channel Cross Transformer Network for Infrared Small Target Detection},
year={2024},
volume={62},
number={},
pages={1-15},
keywords={Semantics;Transformers;Decoding;Feature extraction;Task analysis;Object detection;Visualization;Convolutional neural network (CNN);cross-attention;deep learning;infrared small target detection (IRSTD);transformer},
doi={10.1109/TGRS.2024.3383649}}
Feel free to open an issue or email yuansy@stu.xidian.edu.cn or yuansy2@student.unimelb.edu.au with any questions about SCTransNet.