This repository maintains the official implementation of the paper Learning to Aggregate Multi-Scale Context for Instance Segmentation in Remote Sensing Images by Ye Liu, Huifang Li, Chao Hu, Shuang Luo, Yan Luo, and Chang Wen Chen, which has been accepted by TNNLS.
Please refer to the following environment settings that we use. If automatic installation fails, you may install these packages manually.
git clone https://github.com/yeliudev/CATNet.git
cd CATNet
pip install -r requirements.txt
export PYTHONPATH=$PWD:$PYTHONPATH
CATNet
├── configs
├── datasets
├── models
├── tools
├── data
│   ├── dior
│   │   ├── Annotations
│   │   ├── ImageSets
│   │   └── JPEGImages
│   ├── hrsid
│   │   ├── annotations
│   │   └── images
│   ├── isaid
│   │   ├── train
│   │   ├── val
│   │   └── test
│   └── vhr
│       ├── annotations
│       └── images
├── README.md
├── setup.cfg
└── ···
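Before training, it can help to confirm that the datasets sit where the layout above expects them. The following is a minimal sketch and not part of the repository: the helper name `missing_dataset_dirs` is hypothetical, and the directory list is simply copied from the tree above.

```python
from pathlib import Path

# Expected sub-directories under data/, copied from the layout above.
EXPECTED_DIRS = [
    "dior/Annotations",
    "dior/ImageSets",
    "dior/JPEGImages",
    "hrsid/annotations",
    "hrsid/images",
    "isaid/train",
    "isaid/val",
    "isaid/test",
    "vhr/annotations",
    "vhr/images",
]


def missing_dataset_dirs(data_root="data"):
    """Return the expected sub-directories that do not exist under data_root."""
    root = Path(data_root)
    return [d for d in EXPECTED_DIRS if not (root / d).is_dir()]


if __name__ == "__main__":
    # Run from the CATNet directory so that "data" resolves correctly.
    missing = missing_dataset_dirs()
    if missing:
        print("Missing directories:")
        for d in missing:
            print(f"  data/{d}")
    else:
        print("All expected dataset directories are present.")
```

If you only plan to use one dataset, the entries for the others can be ignored or removed from the list.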
Run the following command to train a model using a specified config.
mim train mmdet <path-to-config> --gpus 4 --launcher pytorch
If an out-of-memory error occurs on the iSAID dataset, please uncomment L22-L24 in the dataset code and try again. This filters out a few images with more than 1,000 objects, substantially reducing the memory cost.
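To illustrate the idea behind this filter, here is a rough sketch of dropping images by object count. It is not the repository's code (the lines referenced above are not reproduced here). It assumes a COCO-style annotation dict with `images` and `annotations` lists, and the function name `filter_crowded_images` and its `max_objects` parameter are hypothetical.

```python
from collections import Counter


def filter_crowded_images(coco, max_objects=1000):
    """Drop images that have more than max_objects annotations.

    `coco` is assumed to be a COCO-style dict with "images" and
    "annotations" lists. Illustration only, not the repository's code.
    """
    # Count how many annotated objects belong to each image.
    counts = Counter(ann["image_id"] for ann in coco["annotations"])
    keep_ids = {
        img["id"] for img in coco["images"] if counts[img["id"]] <= max_objects
    }

    # Shallow copy so other keys (e.g. "categories") are carried over
    # and the input dict is left unmodified.
    filtered = dict(coco)
    filtered["images"] = [i for i in coco["images"] if i["id"] in keep_ids]
    filtered["annotations"] = [
        a for a in coco["annotations"] if a["image_id"] in keep_ids
    ]
    return filtered


if __name__ == "__main__":
    # Toy data with a small threshold: image 1 has 3 objects, image 2 has 1.
    toy = {
        "images": [{"id": 1}, {"id": 2}],
        "annotations": [{"id": i, "image_id": 1} for i in range(3)]
        + [{"id": 10, "image_id": 2}],
    }
    result = filter_crowded_images(toy, max_objects=2)
    print([img["id"] for img in result["images"]])
```

Images with very many instances tend to dominate memory use during training, since the number of ground-truth masks held per image grows with the object count, which is why removing a handful of them can be enough.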
Run the following command to test a model and evaluate results.
mim test mmdet <path-to-config> --checkpoint <path-to-checkpoint> --gpus 4 --launcher pytorch
We provide multiple pre-trained models here. All the models are trained using 4 NVIDIA A100 GPUs and are evaluated using the default metrics of the datasets.
Dataset | Model | Backbone | Schd | Aug | BBox AP | Mask AP | Download
---|---|---|---|---|---|---|---
iSAID | CAT Mask R-CNN | ResNet-50 | 3x | ✗ | 45.1 | 37.2 | model / metrics
iSAID | CAT Mask R-CNN | ResNet-50 | 3x | ✓ | 47.7 | 39.2 | model / metrics
DIOR | CATNet | ResNet-50 | 3x | ✗ | 74.0 | - | model / metrics
DIOR | CATNet | ResNet-50 | 3x | ✓ | 78.2 | - | model / metrics
DIOR | CAT R-CNN | ResNet-50 | 3x | ✗ | 75.8 | - | model / metrics
DIOR | CAT R-CNN | ResNet-50 | 3x | ✓ | 80.6 | - | model / metrics
NWPU VHR-10 | CAT Mask R-CNN | ResNet-50 | 6x | ✗ | 71.0 | 69.3 | model / metrics
NWPU VHR-10 | CAT Mask R-CNN | ResNet-50 | 6x | ✓ | 72.4 | 70.7 | model / metrics
HRSID | CAT Mask R-CNN | ResNet-50 | 6x | ✗ | 70.9 | 57.6 | model / metrics
HRSID | CAT Mask R-CNN | ResNet-50 | 6x | ✓ | 72.0 | 59.6 | model / metrics
If you find this project useful for your research, please kindly cite our paper.
@article{liu2024learning,
title={Learning to Aggregate Multi-Scale Context for Instance Segmentation in Remote Sensing Images},
author={Liu, Ye and Li, Huifang and Hu, Chao and Luo, Shuang and Luo, Yan and Chen, Chang Wen},
journal={IEEE Transactions on Neural Networks and Learning Systems},
year={2024}
}