open-mmlab / mmsegmentation

OpenMMLab Semantic Segmentation Toolbox and Benchmark.
https://mmsegmentation.readthedocs.io/en/main/
Apache License 2.0

Dataset Converter script for ADE20k dataset #1930

Open aungpaing98 opened 1 year ago

aungpaing98 commented 1 year ago

For model training and testing, the accepted label format is a single-channel map of shape [H, W]. But for ADE20k, in both the old and new versions, the labels are stored as RGB images.

Also, I looked in tools/convert_datasets/* and there is no conversion script for ADE20k.

I found this issue and followed the same process mentioned there, where the author uses the last channel of the mask as the ground truth. To verify that, I tried running the test script with no modifications other than the one described in that issue.

The weight file was downloaded from the mmsegmentation model zoo.

python tools/test.py configs/swin/upernet_swin_tiny_patch4_window7_512x512_160k_ade20k_pretrain_224x224_1K.py pretrained/upernet_swin_tiny_patch4_window7_512x512_160k_ade20k_pretrain_224x224_1K_20210531_112542-e380ad3e.pth --eval mIoU --show

Below is the result.

aAcc mIoU mAcc
0.67 0.1 0.55

I am not sure whether this result is good. The predicted masks look correct, but I am not sure that the predictions and the ground truth are aligned. Thanks, all, for your time.

MeowZheng commented 1 year ago

Here is a tutorial for dataset preparation https://github.com/open-mmlab/mmsegmentation/blob/master/docs/en/dataset_prepare.md#ade20k

aungpaing98 commented 1 year ago

Thanks for the reply.

Yes, I looked at the documentation. The issue is that there is no conversion script for the ADE20k dataset, and the labels in the downloaded dataset are RGB images. The dataset was downloaded from opendatalab/ADE20k_2016.

Changes that I made to the original mmsegmentation repository:

  1. Change ann_dir in configs/_base_/datasets/ade20k.py from annotations/training to images/training.
  2. Change seg_map_suffix='.png' to seg_map_suffix='_seg.png' in mmseg/datasets/ade.py, since the dataset only contains *_seg.png files for labels.

And when I run the following command,

python tools/test.py configs/swin/upernet_swin_tiny_patch4_window7_512x512_160k_ade20k_pretrain_224x224_1K.py pretrained/upernet_swin_tiny_patch4_window7_512x512_160k_ade20k_pretrain_224x224_1K_20210531_112542-e380ad3e.pth --eval mIoU

the following error occurs:

  File "/home/aung/Documents/Github/mmsegmentation/mmseg/core/evaluation/metrics.py", line 81, in intersect_and_union
    pred_label = pred_label[mask]
IndexError: too many indices for tensor of dimension 2
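
For anyone hitting the same traceback, here is a minimal sketch of what I think is going on. It assumes the evaluation code builds a boolean mask from the label (label != ignore_index) and uses it to index the 2-D prediction. The function name mask_prediction is hypothetical, and NumPy is used instead of torch only to keep the example small, but the indexing rule is the same idea: a 3-D mask cannot index a 2-D array.

```python
import numpy as np


def mask_prediction(pred_label, label, ignore_index=255):
    """Keep only predictions at pixels whose label is not ignore_index.

    Hypothetical stand-in for the masking step in the evaluation code.
    """
    mask = label != ignore_index  # same shape as `label`
    return pred_label[mask]       # fails if mask has more dims than pred_label


pred = np.zeros((4, 5), dtype=np.int64)          # prediction: [H, W]
label_2d = np.zeros((4, 5), dtype=np.uint8)      # expected label: [H, W]
label_rgb = np.zeros((4, 5, 3), dtype=np.uint8)  # RGB label: [H, W, 3]

# A single-channel label works and returns one value per kept pixel.
print(mask_prediction(pred, label_2d).shape)

# An RGB label yields a 3-D mask, which cannot index the 2-D prediction.
try:
    mask_prediction(pred, label_rgb)
except IndexError as exc:
    print("IndexError:", exc)
```

So the error is consistent with the label being loaded as an [H, W, 3] array rather than [H, W].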

I suspect this is because the *_seg.png images are RGB. It would be great if there were a script to convert the original RGB labels to single-channel labels, or if you could suggest a way to do this conversion for the ADE20k dataset.
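
As a starting point, here is a rough sketch of the kind of conversion I mean. It assumes the *_seg.png files follow the encoding used by the full ADE20K release, where the class index is (R / 10) * 256 + G and the B channel separates object instances. I have not confirmed that the opendatalab copy uses this encoding. The function name decode_ade20k_seg is hypothetical and not part of mmsegmentation.

```python
import numpy as np


def decode_ade20k_seg(seg_rgb):
    """Decode an ADE20K-style RGB mask into class and instance maps.

    Assumes `seg_rgb` is a uint8 array of shape [H, W, 3] in RGB order
    (an image read with OpenCV is BGR and must be reordered first).
    """
    seg = np.asarray(seg_rgb)
    if seg.ndim != 3 or seg.shape[2] < 3:
        raise ValueError("expected an [H, W, 3] RGB array")

    r = seg[..., 0].astype(np.int32)
    g = seg[..., 1].astype(np.int32)
    b = seg[..., 2]

    # Assumed encoding: class index = (R / 10) * 256 + G.
    class_ids = (r // 10) * 256 + g

    # Assumed encoding: each distinct B value marks a separate instance.
    _, inverse = np.unique(b.ravel(), return_inverse=True)
    instance_ids = inverse.reshape(b.shape)

    return class_ids, instance_ids


# Tiny synthetic mask: two pixels share the same class and instance.
seg = np.array(
    [[[0, 0, 0], [10, 5, 20]],
     [[20, 3, 40], [10, 5, 20]]],
    dtype=np.uint8,
)
class_ids, instance_ids = decode_ade20k_seg(seg)
print(class_ids.shape, instance_ids.shape)
```

If this encoding applies, two caveats follow. Taking only the last channel as ground truth would give instance IDs rather than class IDs, which might explain the low mIoU above. Also, the decoded class indices would refer to the full ADE20K object list rather than the 150 classes the config is trained on, so an extra mapping step would still be needed.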

Thanks for your time.