CVIU-CSU / HRDecoder

HRDecoder: High-Resolution Decoder Network for Fundus Image Lesion Segmentation

Environment

Our code is based on MMSegmentation.

In this work, we used:

The environment can be installed with:

conda create -n hrdecoder python=3.8.5
conda activate hrdecoder

pip install -r requirements.txt

Datasets

We carried out experiments on two public datasets: IDRiD and DDR. For IDRiD, please download the segmentation part from here. For DDR, please download the segmentation part from here.

Please first reorganize each dataset into the following structure:

IDRiD(segmentation part)
|——images
|  |——train
|  |  |——IDRiD_01.jpg
|  |  |——...
|  |  |——IDRiD_54.jpg
|  |——test
|——labels
|  |——train
|  |  |——EX
|  |  |  |——IDRiD_01_EX.tif
|  |  |  |——...
|  |  |  |——IDRiD_54_EX.tif
|  |  |——HE
|  |  |——SE
|  |  |——MA
|  |——test
|  |  |——EX
|  |  |——HE
|  |  |——SE
|  |  |——MA

DDR(segmentation part)
|——images
|  |——test
|  |  |——007-1789-100.jpg
|  |  |——...
|  |  |——20170627170651362.jpg
|  |——val
|  |——train
|——labels
|  |——test
|  |  |——EX
|  |  |  |——007-1789-100.tif
|  |  |  |——...
|  |  |  |——20170627170651362.tif
|  |  |——HE
|  |  |——MA
|  |  |——SE
|  |——val
|  |  |——EX
|  |  |——HE
|  |  |——MA
|  |  |——SE
|  |——train
|  |  |——EX
|  |  |——HE
|  |  |——MA
|  |  |——SE

Please note that the original ground-truth labels in .tif format cannot be used directly for training. Use the following commands to convert them to .png format:

python tools/convert_dataset/idrid.py
python tools/convert_dataset/ddr.py
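
For orientation, merging the four per-lesion masks into a single index map could look roughly like the sketch below. This is not the repository's conversion script: the class order (EX=1, HE=2, SE=3, MA=4), the rule that later classes overwrite earlier ones where lesions overlap, and the function name `merge_lesion_masks` are all assumptions made for illustration, and plain nested lists stand in for image arrays.

```python
# Hypothetical class order; the real scripts may use a different mapping.
LESION_CLASSES = ("EX", "HE", "SE", "MA")


def merge_lesion_masks(masks, height, width):
    """Merge per-lesion binary masks into one index map (0 = background).

    masks maps a lesion name to a height x width grid of 0 / non-zero values;
    a lesion with no annotation file is simply absent from the dict.
    """
    label = [[0] * width for _ in range(height)]
    for index, name in enumerate(LESION_CLASSES, start=1):
        mask = masks.get(name)
        if mask is None:
            continue
        for y in range(height):
            for x in range(width):
                if mask[y][x]:
                    # Assumption: later classes overwrite earlier ones on overlap.
                    label[y][x] = index
    return label


# Toy 2x2 example: EX and MA overlap at the top-left pixel.
ex = [[255, 0], [0, 0]]
ma = [[255, 0], [0, 255]]
merged = merge_lesion_masks({"EX": ex, "MA": ma}, height=2, width=2)
print(merged)
```

Check the provided scripts for the actual class indices before relying on any particular mapping.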

The dataset structure will then become:

IDRiD(segmentation part)
|——images
|  |——train
|  |  |——IDRiD_01.jpg
|  |  |——...
|  |  |——IDRiD_54.jpg
|  |——test
|——labels
|  |——train
|  |  |——IDRiD_01.png
|  |  |——...
|  |  |——IDRiD_54.png
|  |——test

DDR(segmentation part)
|——images
|  |——test
|  |  |——007-1789-100.jpg
|  |  |——...
|  |  |——20170627170651362.jpg
|  |——val
|  |——train
|——labels
|  |——test
|  |  |——007-1789-100.png
|  |  |——...
|  |  |——20170627170651362.png
|  |——val
|  |——train
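
A quick way to confirm the conversion produced one label per image is to compare file stems, as in the sketch below. The helper `find_unpaired_images` is hypothetical and not part of the repository; it assumes images end in .jpg and converted labels share the image's stem with a .png extension, as in the listing above.

```python
from pathlib import Path


def find_unpaired_images(root, split):
    """Return names of images in images/<split> with no labels/<split>/<stem>.png."""
    image_dir = Path(root) / "images" / split
    label_dir = Path(root) / "labels" / split
    missing = []
    for image in sorted(image_dir.glob("*.jpg")):
        if not (label_dir / f"{image.stem}.png").is_file():
            missing.append(image.name)
    return missing


if __name__ == "__main__":
    # Assumed location; adjust to wherever the dataset actually lives.
    root = Path("data") / "IDRiD(segmentation part)"
    if root.is_dir():
        for split in ("train", "test"):
            print(split, find_unpaired_images(root, split))
```

An empty list for every split means each image has a matching converted label.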

Finally, the project structure should look like this:

HRDecoder
|——configs
|——mmseg
|——tools
|——data
|  |——IDRiD(segmentation part)
|  |  |——images
|  |  |  |——train
|  |  |  |——test
|  |  |——labels
|  |  |  |——train
|  |  |  |——test
|  |——DDR(segmentation part)
|  |  |——images
|  |  |  |——train
|  |  |  |——val
|  |  |  |——test
|  |  |——labels
|  |  |  |——train
|  |  |  |——val
|  |  |  |——test
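
Before launching a run, it can help to check that the data directories are in place. The following is a minimal sketch, assuming the directory names are exactly those in the tree above; `EXPECTED_SPLITS` and `missing_dirs` are hypothetical names, not part of the repository.

```python
from pathlib import Path

# Directory names copied from the project layout; splits differ per dataset.
EXPECTED_SPLITS = {
    "IDRiD(segmentation part)": ("train", "test"),
    "DDR(segmentation part)": ("train", "val", "test"),
}


def missing_dirs(project_root):
    """List expected data directories that do not exist under project_root."""
    data = Path(project_root) / "data"
    missing = []
    for dataset, splits in EXPECTED_SPLITS.items():
        for kind in ("images", "labels"):
            for split in splits:
                path = data / dataset / kind / split
                if not path.is_dir():
                    missing.append(str(path.relative_to(project_root)))
    return missing


if __name__ == "__main__":
    # Run from the HRDecoder project root.
    for path in missing_dirs("."):
        print("missing:", path)
```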

Training and Testing

To train or test a model with the MMSegmentation framework, use commands like these:

# train
CUDA_VISIBLE_DEVICES=0,1,2,3 PORT=18940 tools/dist_train.sh configs/lesion/hrdecoder_fcn_hr48_idrid_2880x1920-slide.py 4
# or
CUDA_VISIBLE_DEVICES=0,1,2,3 PORT=18940 tools/dist_train.sh configs/lesion/efficient-hrdecoder_fcn_hr48_idrid_2880x1920-slide.py 4

# test
CUDA_VISIBLE_DEVICES=4,5,6,7 PORT=17940 tools/dist_test.sh configs/lesion/hrdecoder_fcn_hr48_idrid_2880x1920-slide.py work_dirs/hrdecoder_fcn_hr48_idrid_2880x1920-slide/latest.pth 4 --eval mIoU
# or
CUDA_VISIBLE_DEVICES=4,5,6,7 PORT=17940 tools/dist_test.sh configs/lesion/efficient-hrdecoder_fcn_hr48_idrid_2880x1920-slide.py work_dirs/efficient-hrdecoder_fcn_hr48_idrid_2880x1920-slide/latest.pth 4 --eval mIoU

Trained models are stored in work_dirs/.
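
In the test commands above, the checkpoint directory is named after the config file without its extension. The sketch below reproduces that mapping; `default_checkpoint` is a hypothetical helper, and it assumes the default work directory has not been overridden.

```python
from pathlib import Path


def default_checkpoint(config_path, checkpoint_name="latest.pth"):
    """Guess the checkpoint path for a config: work_dirs/<config stem>/<name>."""
    return Path("work_dirs") / Path(config_path).stem / checkpoint_name


cfg = "configs/lesion/hrdecoder_fcn_hr48_idrid_2880x1920-slide.py"
print(default_checkpoint(cfg).as_posix())
# work_dirs/hrdecoder_fcn_hr48_idrid_2880x1920-slide/latest.pth
```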

In this project, we provide two implementations: HRDecoder and Efficient-HRDecoder.

HRDecoder is the version used to produce the results in our paper. Efficient-HRDecoder is an improved version that further reduces computational overhead and memory usage: it simply compresses the channel dimension of the extracted features with a 1x1 convolutional layer right after the backbone, which saves a lot of memory.

This compression step can be replaced with various multi-scale feature fusion methods, such as FPN, CATNet, APPNet and CARAFE++. Our focus is not on this part but on the later HR representation learning module and HR fusion module, so we simply use upsampling and concatenation in HRDecoder, and a 1x1 convolution in Efficient-HRDecoder. Our design is orthogonal to these fusion methods, so it is easy to apply to other existing methods. HRDecoder can both reduce costs and improve performance.
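
To make the compression step concrete: a 1x1 convolution is a per-pixel linear map across channels, so it changes the channel count while leaving the spatial size untouched. The sketch below shows this in plain Python on a toy feature map. It is not the repository's layer; the sizes and weights are hypothetical, and the channel counts used in Efficient-HRDecoder are not stated here. In PyTorch the same operation is typically written as `torch.nn.Conv2d(in_channels, out_channels, kernel_size=1)`.

```python
def conv1x1(features, weights):
    """Apply a bias-free 1x1 convolution.

    features: nested list shaped [C_in][H][W]
    weights:  nested list shaped [C_out][C_in]
    returns:  nested list shaped [C_out][H][W]
    """
    c_in = len(features)
    height, width = len(features[0]), len(features[0][0])
    output = []
    for kernel in weights:
        if len(kernel) != c_in:
            raise ValueError("each kernel needs one weight per input channel")
        # Each output pixel is a weighted sum over the input channels at that pixel.
        channel = [
            [sum(kernel[c] * features[c][y][x] for c in range(c_in)) for x in range(width)]
            for y in range(height)
        ]
        output.append(channel)
    return output


# Toy sizes (hypothetical): compress 4 channels to 2 on a 2x2 map.
features = [[[float(c + 1)] * 2 for _ in range(2)] for c in range(4)]
weights = [[0.25, 0.25, 0.25, 0.25], [1.0, 0.0, 0.0, -1.0]]
compressed = conv1x1(features, weights)
print(len(features), "->", len(compressed), "channels")
```

Because every later module operates on the compressed tensor, activation memory downstream shrinks roughly in proportion to the channel reduction.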

Acknowledgements

Last, we thank these authors for sharing their ideas and source code:

This manuscript was supported in part by the National Key Research and Development Program of China under Grant 2021YFF1201202, the Natural Science Foundation of Hunan Province under Grants 2024JJ5444 and 2023JJ30699, the Key Research and Development Program of Hunan Province under Grant 2023SK2029, the Changsha Municipal Natural Science Foundation under Grant kq2402227, and the Research Council of Finland (formerly Academy of Finland for Academy Research Fellow project) under Grant 355095. The authors wish to acknowledge the High Performance Computing Center of Central South University for computational resources.