ViTAE-Transformer / SAMRS

The official repo for [NeurIPS'23] "SAMRS: Scaling-up Remote Sensing Segmentation Dataset with Segment Anything Model"

SAMRS: Scaling-up Remote Sensing Segmentation Dataset with Segment Anything Model

Di Wang, Jing Zhang, Bo Du, Minqiang Xu, Lin Liu, Dacheng Tao, Liangpei Zhang

News | Abstract | Usage | Results | Statement

News

2024.03.25

2023.12.07

2023.09.30

2023.09.26

2023.09.23

2023.09.22

2023.08.30

2023.06.14

2023.05.04

Other applications of ViTAE include: VSA | ViTPose | Matting | Scene Text Spotting | Video Object Segmentation

Introduction

This is the official repository of the paper "SAMRS: Scaling-up Remote Sensing Segmentation Dataset with Segment Anything Model".

Figure 1: Some examples of SAM segmentation results on remote sensing images.

In this study, we leverage SAM and existing RS object detection datasets to develop an efficient pipeline for generating a large-scale RS segmentation dataset, dubbed SAMRS. SAMRS surpasses existing high-resolution RS segmentation datasets in size by several orders of magnitude, and provides object category, location, and instance information that can be used for semantic segmentation, instance segmentation, and object detection, either individually or in combination. We also provide a comprehensive analysis of SAMRS from various aspects. We hope it can facilitate research in RS segmentation, particularly in large model pre-training.

# Usage

- See [[Generate Dataset](https://github.com/ViTAE-Transformer/SAMRS/tree/main/Generate%20Dataset)] for the code that produces the SAMRS dataset.
- See [[Pretraining and Finetuning](https://github.com/ViTAE-Transformer/SAMRS/tree/main/Pretraining%20and%20Finetuning)] for the code for pretraining with SAMRS and finetuning on other datasets.

# Results

## The basic information of generated datasets

Figure 2: Comparisons of different high-resolution RS segmentation datasets.

We compare our SAMRS dataset with existing high-resolution RS segmentation datasets in the table above (Figure 2). Based on the available high-resolution RSI object detection datasets, we can efficiently annotate 105,090 images, which is more than ten times the capacity of existing datasets. Additionally, SAMRS inherits the categories of the original detection datasets, which makes it more diverse than other high-resolution RS segmentation collections. It is worth noting that RS object detection datasets usually have more diverse categories than RS segmentation datasets because annotating pixels in RSIs is difficult, and thus our SAMRS reduces this gap.

## Visualization of Generated Masks

Figure 3: Some visual examples from the three subsets of our SAMRS dataset.

In Figure 3, we visualize some segmentation annotations from the three subsets of our SAMRS dataset. As can be seen, SOTA exhibits a greater number of instances for tiny cars, whereas FAST provides a more fine-grained annotation of categories that already exist in SOTA, such as car, ship, and plane. SIOR, on the other hand, offers annotations for more diverse ground objects, such as *dam*. Hence, our SAMRS dataset encompasses a wide range of categories with varying sizes and distributions, thereby presenting a new challenge for RS semantic segmentation.

## Dataset Statistics and Analysis

### The class distribution.

Figure 4: Statistics of the number of pixels and instances for each category in the SAMRS database. The histograms for the subsets SOTA, SIOR, and FAST are shown in the first, second, and third columns, respectively. The first row presents histograms on a per-pixel basis, while the second row presents histograms on a per-instance basis.
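
The two rows of Figure 4 correspond to two ways of counting the same annotations. The following is a minimal sketch of that idea, not the repository's analysis code. It assumes each image has a semantic label map (a 2D grid of category ids) and a list of instance records; the function names, the record layout, and the use of 255 as an ignored label are all hypothetical choices made for illustration.

```python
from collections import Counter

def count_pixels(label_map, ignore_label=255):
    """Count pixels per category id in a 2D label map (a list of rows).

    ignore_label is an assumed convention for unlabeled pixels.
    """
    counts = Counter()
    for row in label_map:
        for label in row:
            if label != ignore_label:
                counts[label] += 1
    return counts

def count_instances(instances):
    """Count instances per category from records shaped like {'category': id}."""
    return Counter(inst["category"] for inst in instances)

# Toy data: one 3x4 label map holding two objects (category 1 and category 2).
label_map = [
    [255, 1, 1, 255],
    [255, 1, 1, 2],
    [255, 255, 2, 2],
]
instances = [{"category": 1}, {"category": 2}]

pixel_hist = count_pixels(label_map)        # per-pixel basis (first row of Figure 4)
instance_hist = count_instances(instances)  # per-instance basis (second row of Figure 4)

print(dict(pixel_hist))     # {1: 4, 2: 3}
print(dict(instance_hist))  # {1: 1, 2: 1}
```

Summing such counters over every image of a subset would give one histogram per subset. The two views can differ sharply: a category with many tiny objects ranks high per instance but low per pixel.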

### The mask size distribution.

Figure 5: Statistics of the mask sizes in different subsets of the SAMRS database. (a) SOTA. (b) SIOR. (c) FAST.
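
A mask size distribution like the one in Figure 5 can be obtained by measuring the pixel area of every instance mask and binning the areas. The sketch below illustrates this under stated assumptions: masks are binary grids stored as lists of rows, and the bin edges are arbitrary example values. The helper names `mask_area` and `size_histogram` are hypothetical and do not come from the repository.

```python
from bisect import bisect_right

def mask_area(mask):
    """Number of foreground pixels in a binary mask (rows of 0/1 values)."""
    return sum(sum(row) for row in mask)

def size_histogram(areas, bin_edges):
    """Count areas per bin; bin i covers [bin_edges[i], bin_edges[i + 1]).

    Areas that fall outside the range of bin_edges are skipped.
    """
    hist = [0] * (len(bin_edges) - 1)
    for area in areas:
        idx = bisect_right(bin_edges, area) - 1
        if 0 <= idx < len(hist):
            hist[idx] += 1
    return hist

# Toy masks of three different sizes.
masks = [
    [[1, 1], [0, 1]],                  # area 3
    [[1] * 4 for _ in range(3)],       # area 12
    [[1] * 20 for _ in range(10)],     # area 200
]
areas = [mask_area(m) for m in masks]

# Example bin edges in pixels (an assumption, chosen only for this demo).
bin_edges = [0, 10, 100, 1000]
hist = size_histogram(areas, bin_edges)

print(areas)  # [3, 12, 200]
print(hist)   # [1, 1, 1]
```

Logarithmically spaced bin edges, as used here, are a common choice when object sizes span several orders of magnitude, which is typical for RS imagery that mixes small vehicles with large structures.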

# Statement

This project is for research purposes only. For any other questions, please contact [d_wang@whu.edu.cn](mailto:d_wang@whu.edu.cn).

## Citation

If you find SAMRS helpful, please consider giving this repo a ⭐ and citing:

```
@inproceedings{SAMRS,
  author = {Wang, Di and Zhang, Jing and Du, Bo and Xu, Minqiang and Liu, Lin and Tao, Dacheng and Zhang, Liangpei},
  booktitle = {Advances in Neural Information Processing Systems},
  pages = {8815--8827},
  title = {SAMRS: Scaling-up Remote Sensing Segmentation Dataset with Segment Anything Model},
  volume = {36},
  year = {2023}
}
```

## Relevant Projects

[1] An Empirical Study of Remote Sensing Pretraining, IEEE TGRS, 2022 | [Paper](https://ieeexplore.ieee.org/document/9782149) | [Github](https://github.com/ViTAE-Transformer/RSP)
     Di Wang, Jing Zhang, Bo Du, Gui-Song Xia and Dacheng Tao

[2] Advancing Plain Vision Transformer Towards Remote Sensing Foundation Model, IEEE TGRS, 2022 | [Paper](https://ieeexplore.ieee.org/document/9956816/) | [Github](https://github.com/ViTAE-Transformer/Remote-Sensing-RVSA)
     Di Wang, Qiming Zhang, Yufei Xu, Jing Zhang, Bo Du, Dacheng Tao and Liangpei Zhang