Junjue-Wang / EarthVQA

[AAAI 2024] EarthVQA: Towards Queryable Earth via Relational Reasoning-Based Remote Sensing Visual Question Answering

by Junjue Wang, Zhuo Zheng, Zihang Chen, Ailong Ma, and Yanfei Zhong

[Paper], [Video], [Dataset], [Leaderboard-SEG], [Leaderboard-VQA]

Requirements:

Install Ever + Segmentation Models PyTorch

pip install ever-beta
pip install git+https://github.com/qubvel/segmentation_models.pytorch
pip install albumentations==1.4.3 # This version is important for our repo.
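Because the albumentations pin matters for this repo, it can help to confirm the installed version before running any script. The snippet below is a minimal sketch using only the Python standard library; `check_pins` and `REQUIRED` are hypothetical names introduced here for illustration and are not part of EarthVQA.

```python
from importlib.metadata import PackageNotFoundError, version

# Pin taken from the install commands above; extend as needed.
REQUIRED = {"albumentations": "1.4.3"}


def check_pins(required=REQUIRED, get_version=version):
    """Return {package: problem} for every pin that is not satisfied."""
    problems = {}
    for name, wanted in required.items():
        try:
            found = get_version(name)
        except PackageNotFoundError:
            problems[name] = "not installed"
            continue
        if found != wanted:
            problems[name] = f"found {found}, expected {wanted}"
    return problems
```

Calling `check_pins()` returns an empty dict when the environment matches the pin, and otherwise a short description of each mismatch.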

Data preparation

# 1. Generate semantic masks using the pre-trained SFPN weights
sh ./scripts/generate_segfeats.sh
# 2. Generate answers using the pre-trained SOBA weights
sh ./scripts/predict_soba.sh

Train

# 1. Train a segmentation model
sh ./scripts/train_sfpnr50.sh
# 2. Generate segmentation features and pseudo-masks
sh ./scripts/generate_segfeats.sh
# 3. Train SOBA
sh ./scripts/train_soba.sh
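The three training steps depend on each other, so they should run in order and stop at the first failure. The following is a minimal sketch of a Python wrapper around the same scripts; `run_steps` and `TRAIN_STEPS` are hypothetical names, and the wrapper assumes it is launched from the repository root.

```python
import subprocess

# Script paths exactly as listed in the Train section above.
TRAIN_STEPS = [
    "./scripts/train_sfpnr50.sh",
    "./scripts/generate_segfeats.sh",
    "./scripts/train_soba.sh",
]


def run_steps(steps=TRAIN_STEPS, runner=subprocess.run):
    """Run each script with sh, in order, and return the completed steps.

    check=True makes subprocess.run raise CalledProcessError on a non-zero
    exit code, so later steps never start after a failed one.
    """
    done = []
    for script in steps:
        runner(["sh", script], check=True)
        done.append(script)
    return done
```

Calling `run_steps()` executes the full pipeline; passing a shorter list (for example only the last two scripts) resumes from a later step.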

Citation

If you use EarthVQA in your research, please cite the following papers.

    @article{wang2024earthvqa, 
        title={EarthVQA: Towards Queryable Earth via Relational Reasoning-Based Remote Sensing Visual Question Answering},
        url={https://ojs.aaai.org/index.php/AAAI/article/view/28357}, 
        DOI={10.1609/aaai.v38i6.28357}, 
        author={Junjue Wang and Zhuo Zheng and Zihang Chen and Ailong Ma and Yanfei Zhong}, 
        year={2024}, 
        month={Mar.},
        volume={38},
        pages={5481-5489}}
    @dataset{junjue_wang_2021_5706578,
        author={Junjue Wang and Zhuo Zheng and Ailong Ma and Xiaoyan Lu and Yanfei Zhong},
        title={Love{DA}: A Remote Sensing Land-Cover Dataset for Domain Adaptive Semantic Segmentation},
        month=oct,
        year=2021,
        publisher={Zenodo},
        doi={10.5281/zenodo.5706578},
        url={https://doi.org/10.5281/zenodo.5706578}}

Dataset and Contest

The EarthVQA dataset is released on Google Drive and Baidu Drive.

You can develop your models on the Train and Validation sets.

Semantic category labels: background – 1, building – 2, road – 3, water – 4, barren – 5, forest – 6, agriculture – 7, playground – 8. No-data regions are assigned 0 and should be ignored. The provided data loader will help you construct your pipeline.
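As an illustration of the label convention above, here is a minimal NumPy sketch that counts pixels per class while skipping no-data regions, and that remaps a mask to zero-based training labels. The names `CLASS_NAMES`, `class_pixel_counts`, and `to_train_labels` are hypothetical, and the zero-based shift with an ignore value of 255 is an assumption (a common convention for losses that accept an ignore index); the provided data loader may handle this differently.

```python
import numpy as np

# Label IDs as documented above; 0 marks no-data pixels.
CLASS_NAMES = {
    1: "background", 2: "building", 3: "road", 4: "water",
    5: "barren", 6: "forest", 7: "agriculture", 8: "playground",
}
NO_DATA = 0


def class_pixel_counts(mask):
    """Count pixels per class, leaving out no-data pixels."""
    mask = np.asarray(mask, dtype=np.int64)
    valid = mask[mask != NO_DATA]
    counts = np.bincount(valid, minlength=len(CLASS_NAMES) + 1)
    return {name: int(counts[idx]) for idx, name in CLASS_NAMES.items()}


def to_train_labels(mask, ignore_index=255):
    """Shift classes 1..8 to 0..7 and mark no-data pixels with ignore_index.

    Assumption: the training loss skips pixels equal to ignore_index.
    """
    mask = np.asarray(mask, dtype=np.int64)
    labels = mask - 1
    labels[mask == NO_DATA] = ignore_index
    return labels
```

For a mask `[[0, 1, 2], [8, 8, 3]]`, `to_train_labels` yields `[[255, 0, 1], [7, 7, 2]]`, so the no-data pixel no longer collides with any class index.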

Submit your test results to the EarthVQA Semantic Segmentation Challenge and the EarthVQA Visual Question Answering Challenge to obtain your test scores.

Feel free to design your own models, and we are looking forward to your exciting results!

License

The owners of the data and of the copyright on the data are RSIDEA, Wuhan University. Use of the Google Earth images must respect the "Google Earth" terms of use. All images and their associated annotations in EarthVQA may be used for academic purposes only; any commercial use is prohibited.

Creative Commons License