ViTAE-Transformer / MTP

The official repo for [JSTARS'24] "MTP: Advancing Remote Sensing Foundation Model via Multi-Task Pretraining"
MIT License

Test Question #16

Closed chartgod closed 3 weeks ago

chartgod commented 4 weeks ago

Hello. I tested the pretrained model levir-rvsa-l-mae-mtp-epoch_150.pth on the test split of the LEVIR-CD dataset. However, the scores were very low, and I would like to understand what went wrong.

Because of tensor input errors, I resized the LEVIR-CD images to 256*256 before testing. The results are as follows:

06/19 13:54:35 - mmengine - INFO - per class results:
06/19 13:54:35 - mmengine - INFO -
+-----------+--------+-----------+--------+-------+-------+
|   Class   | Fscore | Precision | Recall |  IoU  |  Acc  |
+-----------+--------+-----------+--------+-------+-------+
| unchanged | 97.61  |   95.59   | 99.73  | 95.34 | 99.73 |
|  changed  | 23.47  |   73.48   | 13.97  |  13.3 | 13.97 |
+-----------+--------+-----------+--------+-------+-------+
06/19 13:54:35 - mmengine - INFO - Epoch(test) [128/128] aAcc: 95.3700 mFscore: 60.5400 mPrecision: 84.5300 mRecall: 56.8500 mIoU: 54.3200 mAcc: 56.8500 data_time: 0.0341 time: 0.1686

The performance for the "changed" class is very low. Why is this happening?
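For reference, the columns of the table are tied together by standard definitions, so the low F-score of the "changed" class follows directly from its low recall: only about 14% of the changed pixels are found. A minimal sketch, assuming the usual formulas F = 2PR / (P + R) and IoU = F / (2 - F); the function names are illustrative and not part of Open-CD or mmseg:

```python
def f_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both given in percent)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def iou_from_f(f: float) -> float:
    """IoU (in percent) implied by an F1 score in percent: IoU = F / (2 - F)."""
    f = f / 100.0
    return 100.0 * f / (2.0 - f)


# "changed" row from the log above: precision 73.48, recall 13.97.
f = f_score(73.48, 13.97)
print(round(f, 2), round(iou_from_f(f), 2))
```

Because the logged precision and recall are already rounded, the recomputed values can differ from the table in the last digit.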

chartgod commented 3 weeks ago

CUDA_VISIBLE_DEVICES=0,1,2,3 python -u /home/lsh/share/mmsatellite/test.py /home/lsh/share/MTP/1/rvsa-l-unet-256-mae-mtp_levir.py /home/lsh/share/mmsatellite/pretrained_model/levir-rvsa-l-mae-mtp-epoch_150.pth --work-dir=/home/lsh/share/mmsatellite/data/test/test/predict --show-dir=/home/lsh/share/mmsatellite/data/test/test/predict/1 --cfg-options val_cfg=None val_dataloader=None val_evaluator=None

These are the results obtained with the command above. Could you check whether there are any errors in it?

DotWang commented 3 weeks ago

@chartgod we use opencd for change detection

chartgod commented 3 weeks ago

The directory is named 'mmsatellite', but I used the Open-CD change detection code.

DotWang commented 3 weeks ago

@chartgod For the LEVIR dataset, we usually crop each image into non-overlapping 256*256 patches instead of resizing. If you want to process the original large images, I suggest changing the mode of test_cfg from whole to slide (I'm not sure whether Open-CD supports this).

You can refer to rvsa-l-upernet-512-mae-mtp-loveda.py.
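For readers unfamiliar with slide mode, here is a minimal sketch of what such a setting looks like in mmseg-style configs. It is an assumption that Open-CD's SiamEncoderDecoder honors it (the comment above leaves that open), and the crop_size and stride values are illustrative rather than taken from the MTP configs:

```python
# Hypothetical override for model.test_cfg (values are illustrative).
# mmseg-style sliding-window inference predicts on fixed-size windows and
# stitches the results; stride == crop_size means the windows do not overlap.
test_cfg = dict(mode='slide', crop_size=(256, 256), stride=(256, 256))

# Number of windows needed to cover one 1024*1024 LEVIR-CD image.
image_size = 1024
windows_per_side = (image_size - test_cfg['crop_size'][0]) // test_cfg['stride'][0] + 1
print(windows_per_side ** 2)
```

With slide mode the images would presumably stay at full size, so a resize step in the test pipeline would work against it.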

chartgod commented 3 weeks ago

My config file:

crop_size = (256, 256)
data_preprocessor = dict(
    bgr_to_rgb=True,
    mean=[123.675, 116.28, 103.53, 123.675, 116.28, 103.53],
    pad_val=0,
    seg_pad_val=255,
    size_divisor=32,
    std=[58.395, 57.12, 57.375, 58.395, 57.12, 57.375],
    test_cfg=dict(size_divisor=32),
    type='DualInputSegDataPreProcessor')
data_root = '/home/lsh/share/data/LEVIR_CD'
dataset_type = 'LEVIR_CD_Dataset'
default_hooks = dict(
    checkpoint=dict(
        by_epoch=True, interval=30, save_best='mIoU', type='CheckpointHook'),
    logger=dict(interval=50, log_metric_by_epoch=True, type='LoggerHook'),
    param_scheduler=dict(type='ParamSchedulerHook'),
    sampler_seed=dict(type='DistSamplerSeedHook'),
    timer=dict(type='IterTimerHook'),
    visualization=dict(
        draw=True,
        img_shape=(256, 256, 3),
        interval=1,
        type='CDVisualizationHook'))
default_scope = 'opencd'
env_cfg = dict(
    cudnn_benchmark=True,
    dist_cfg=dict(backend='nccl'),
    mp_cfg=dict(mp_start_method='fork', opencv_num_threads=0))
launcher = 'none'
load_from = '/home/lsh/share/mmsatellite/pretrained_model/levir-rvsa-l-mae-mtp-epoch_150.pth'
log_level = 'INFO'
log_processor = dict(by_epoch=True, type='LogProcessor', window_size=50)
model = dict(
    backbone=dict(
        attn_drop_rate=0.0,
        depth=24,
        drop_path_rate=0.3,
        drop_rate=0.0,
        embed_dim=1024,
        frozen_stages=-1,
        img_size=256,
        interval=6,
        mlp_ratio=4,
        num_heads=16,
        out_indices=[7, 11, 15, 23],
        patch_size=16,
        pretrained=
        '/home/lsh/share/mmsatellite/pretrained_model/levir-rvsa-l-mae-mtp-epoch_150.pth',
        qk_scale=None,
        qkv_bias=True,
        type='RVSA_MTP',
        use_abs_pos_emb=True,
        use_checkpoint=False),
    data_preprocessor=dict(
        bgr_to_rgb=True,
        mean=[123.675, 116.28, 103.53, 123.675, 116.28, 103.53],
        pad_val=0,
        seg_pad_val=255,
        size_divisor=32,
        std=[58.395, 57.12, 57.375, 58.395, 57.12, 57.375],
        test_cfg=dict(size_divisor=32),
        type='DualInputSegDataPreProcessor'),
    decode_head=dict(
        align_corners=False,
        attention_type=None,
        center=False,
        channels=64,
        decoder_channels=[512, 256, 128, 64],
        dropout_ratio=0.1,
        encoder_channels=[1024, 1024, 1024, 1024],
        ignore_index=255,
        in_channels=[1024, 1024, 1024, 1024],
        in_index=[0, 1, 2, 3],
        loss_decode=dict(
            loss_weight=1.0, type='mmseg.CrossEntropyLoss', use_sigmoid=False),
        n_blocks=4,
        norm_cfg=dict(requires_grad=True, type='SyncBN'),
        num_classes=2,
        type='UNetHead',
        use_batchnorm=True),
    neck=dict(
        out_indices=(0, 1, 2, 3),
        policy='abs_diff',
        type='FeatureFusionNeck'),
    test_cfg=dict(mode='whole'),
    train_cfg=dict(),
    type='SiamEncoderDecoder')
norm_cfg = dict(requires_grad=True, type='SyncBN')
optim_wrapper = None
param_scheduler = None
resume = False
test_cfg = dict(type='TestLoop')
test_dataloader = dict(
    batch_size=1,
    dataset=dict(
        data_prefix=dict(
            img_path_from='test/A',
            img_path_to='test/B',
            seg_map_path='test/label'),
        data_root='/home/lsh/share/data/LEVIR_CD',
        pipeline=[
            dict(type='MultiImgLoadImageFromFile'),
            dict(keep_ratio=True, scale=(256, 256), type='MultiImgResize'),
            dict(type='MultiImgLoadAnnotations'),
            dict(type='MultiImgPackSegInputs'),
        ],
        type='LEVIR_CD_Dataset'),
    num_workers=8,
    persistent_workers=True,
    sampler=dict(shuffle=False, type='DefaultSampler'))
test_evaluator = dict(
    iou_metrics=['mFscore', 'mIoU'], type='mmseg.IoUMetric')
test_pipeline = [
    dict(type='MultiImgLoadImageFromFile'),
    dict(keep_ratio=True, scale=(256, 256), type='MultiImgResize'),
    dict(type='MultiImgLoadAnnotations'),
    dict(type='MultiImgPackSegInputs'),
]
train_cfg = None
train_dataloader = None
val_cfg = None
val_dataloader = None
val_evaluator = None
val_pipeline = [
    dict(type='MultiImgLoadImageFromFile'),
    dict(keep_ratio=True, scale=(256, 256), type='MultiImgResize'),
    dict(type='MultiImgLoadAnnotations'),
    dict(type='MultiImgPackSegInputs'),
]
vis_backends = [
    dict(type='CDLocalVisBackend'),
]
visualizer = dict(
    alpha=1.0,
    name='visualizer',
    save_dir='/home/lsh/share/mmsatellite/data/test/test/predict/2',
    type='CDLocalVisualizer',
    vis_backends=[
        dict(type='CDLocalVisBackend'),
    ])
work_dir = '/home/lsh/share/mmsatellite/data/test/test/predict'

06/20 16:17:44 - mmengine - WARNING - The prefix is not set in metric class IoUMetric.
Loads checkpoint by local backend from path: /home/lsh/share/mmsatellite/pretrained_model/levir-rvsa-l-mae-mtp-epoch_150.pth
06/20 16:17:47 - mmengine - INFO - Load checkpoint from /home/lsh/share/mmsatellite/pretrained_model/levir-rvsa-l-mae-mtp-epoch_150.pth
06/20 16:18:00 - mmengine - INFO - Epoch(test) [ 50/128] eta: 0:00:20 time: 0.2683 data_time: 0.0519 memory: 4641
06/20 16:18:10 - mmengine - INFO - Epoch(test) [100/128] eta: 0:00:06 time: 0.1876 data_time: 0.0217 memory: 1545
06/20 16:18:15 - mmengine - INFO - per class results:
06/20 16:18:15 - mmengine - INFO -
+-----------+--------+-----------+--------+-------+-------+
|   Class   | Fscore | Precision | Recall |  IoU  |  Acc  |
+-----------+--------+-----------+--------+-------+-------+
| unchanged | 97.61  |   95.59   | 99.73  | 95.34 | 99.73 |
|  changed  | 23.47  |   73.48   | 13.97  |  13.3 | 13.97 |
+-----------+--------+-----------+--------+-------+-------+
06/20 16:18:15 - mmengine - INFO - Epoch(test) [128/128] aAcc: 95.3700 mFscore: 60.5400 mPrecision: 84.5300 mRecall: 56.8500 mIoU: 54.3200 mAcc: 56.8500 data_time: 0.0335 time: 0.2192

chartgod commented 3 weeks ago

CUDA_VISIBLE_DEVICES=0,1,2,3 python -u /home/lsh/share/mmsatellite/test.py /home/lsh/share/MTP/1/rvsa-l-unet-256-mae-mtp_levir.py /home/lsh/share/mmsatellite/pretrained_model/levir-rvsa-l-mae-mtp-epoch_150.pth --work-dir=/home/lsh/share/mmsatellite/data/test/test/predict --show-dir=/home/lsh/share/mmsatellite/data/test/test/predict/3 --cfg-options val_cfg=None val_dataloader=None val_evaluator=None train_cfg=None train_dataloader=None optim_wrapper=None param_scheduler=None

DotWang commented 3 weeks ago

@chartgod As I said, please manually crop the dataset into 256*256 patches, as mentioned in the paper.
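A minimal sketch of the cropping step, for illustration only. It assumes 1024*1024 source images (the size reported in the traceback later in this thread) and computes non-overlapping tile boxes; `tile_boxes` is a hypothetical helper, not a script from this repository:

```python
from typing import List, Tuple

# (left, upper, right, lower), the box format accepted by Pillow's Image.crop.
Box = Tuple[int, int, int, int]


def tile_boxes(width: int, height: int, tile: int = 256) -> List[Box]:
    """Return non-overlapping tile boxes in row-major order.

    Assumes width and height are exact multiples of `tile`.
    """
    if width % tile or height % tile:
        raise ValueError(f'{width}x{height} is not divisible by {tile}')
    return [(x, y, x + tile, y + tile)
            for y in range(0, height, tile)
            for x in range(0, width, tile)]


boxes = tile_boxes(1024, 1024)
print(len(boxes), boxes[0], boxes[-1])
```

Applying the same boxes to the A image, the B image, and the label keeps each triple aligned, for example with Pillow's `Image.open(path).crop(box)`.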

chartgod commented 3 weeks ago

Hello. I'm a beginner with OpenMMLab. The source code below is the cd_local_visualizer.py file, and I don't know why it keeps raising this error.

from typing import Optional, Sequence

import mmcv
import numpy as np
from mmengine.dist import master_only

from mmseg.structures import SegDataSample
from mmseg.visualization import SegLocalVisualizer
from opencd.registry import VISUALIZERS


@VISUALIZERS.register_module()
class CDLocalVisualizer(SegLocalVisualizer):
    """Change Detection Local Visualizer. """

    @master_only
    def add_datasample(
            self,
            name: str,
            image: np.ndarray,
            image_from_to: Sequence[np.array],
            data_sample: Optional[SegDataSample] = None,
            draw_gt: bool = True,
            draw_pred: bool = True,
            show: bool = False,
            wait_time: float = 0,
            # TODO: Supported in mmengine's Viusalizer.
            out_file: Optional[str] = None,
            step: int = 0,
            with_labels: Optional[bool] = False) -> None:
        """Draw datasample and save to all backends.

        - If GT and prediction are plotted at the same time, they are
        displayed in a stitched image where the left image is the
        ground truth and the right image is the prediction.
        - If ``show`` is True, all storage backends are ignored, and
        the images will be displayed in a local window.
        - If ``out_file`` is specified, the drawn image will be
        saved to ``out_file``. it is usually used when the display
        is not available.

        Args:
            name (str): The image identifier.
            image (np.ndarray): The image to draw.
            image_from_to (Sequence[np.array]): The image pairs to draw.
            gt_sample (:obj:`SegDataSample`, optional): GT SegDataSample.
                Defaults to None.
            pred_sample (:obj:`SegDataSample`, optional): Prediction
                SegDataSample. Defaults to None.
            draw_gt (bool): Whether to draw GT SegDataSample. Default to True.
            draw_pred (bool): Whether to draw Prediction SegDataSample.
                Defaults to True.
            show (bool): Whether to display the drawn image. Default to False.
            wait_time (float): The interval of show (s). Defaults to 0.
            out_file (str): Path to output file. Defaults to None.
            step (int): Global step value to record. Defaults to 0.
            with_labels(bool, optional): Add semantic labels in visualization
                result, Defaults to True.
        """
        exist_img_from_to = True if len(image_from_to) > 0 else False
        if exist_img_from_to:
            assert len(image_from_to) == 2, '`image_from_to` contains `from` ' \
                'and `to` images'

        classes = self.dataset_meta.get('classes', None)
        palette = self.dataset_meta.get('palette', None)
        semantic_classes = self.dataset_meta.get('semantic_classes', None)
        semantic_palette = self.dataset_meta.get('semant

chartgod commented 3 weeks ago

My test.py:

import argparse
import os
import os.path as osp

from mmengine.config import Config, DictAction
from mmengine.runner import Runner


def parse_args():
    parser = argparse.ArgumentParser(
        description='Open-CD test (and eval) a model')
    parser.add_argument('config', help='train config file path')
    parser.add_argument('checkpoint', help='checkpoint file')
    parser.add_argument(
        '--work-dir',
        help=('if specified, the evaluation metric results will be dumped'
              'into the directory as json'))
    parser.add_argument(
        '--show', action='store_true', help='show prediction results')
    parser.add_argument(
        '--show-dir',
        help='directory where painted images will be saved. '
        'If specified, it will be automatically saved '
        'to the work_dir/timestamp/show_dir')
    parser.add_argument(
        '--wait-time', type=float, default=2, help='the interval of show (s)')
    parser.add_argument(
        '--cfg-options',
        nargs='+',
        action=DictAction,
        help='override some settings in the used config, the key-value pair '
        'in xxx=yyy format will be merged into config file. If the value to '
        'be overwritten is a list, it should be like key="[a,b]" or key=a,b '
        'It also allows nested list/tuple values, e.g. key="[(a,b),(c,d)]" '
        'Note that the quotation marks are necessary and that no white space '
        'is allowed.')
    parser.add_argument(
        '--launcher',
        choices=['none', 'pytorch', 'slurm', 'mpi'],
        default='none',
        help='job launcher')
    parser.add_argument(
        '--tta', action='store_true', help='Test time augmentation')
    parser.add_argument('--local_rank', '--local-rank', type=int, default=0)
    args = parser.parse_args()
    if 'LOCAL_RANK' not in os.environ:
        os.environ['LOCAL_RANK'] = str(args.local_rank)

    return args


def trigger_visualization_hook(cfg, args):
    default_hooks = cfg.default_hooks
    if 'visualization' in default_hooks:
        visualization_hook = default_hooks['visualization']
        # Turn on visualization
        visualization_hook['draw'] = True
        if args.show:
            visualization_hook['show'] = True
            visualization_hook['wait_time'] = args.wait_time
        if args.show_dir:
            visulizer = cfg.visualizer
            visulizer['save_dir'] = args.show_dir
    else:
        raise RuntimeError(
            'VisualizationHook must be included in default_hooks.'
            'refer to usage '
            '"visualization=dict(type=\'VisualizationHook\')"')

    return cfg


def main():
    args = parse_args()

    # load config
    cfg = Config.fromfile(args.config)
    cfg.launcher = args.launcher
    if args.cfg_options is not None:
        cfg.merge_from_dict(args.cfg_options)

    if args.work_dir is not None:
        cfg.work_dir = args.work_dir
    elif cfg.get('work_dir', None) is None:
        cfg.work_dir = osp.join('./work_dirs',
                                osp.splitext(osp.basename(args.config))[0])

    cfg.load_from = args.checkpoint

    if args.show or args.show_dir:
        cfg = trigger_visualization_hook(cfg, args)

    if args.tta:
        cfg.test_dataloader.dataset.pipeline = cfg.tta_pipeline
        cfg.tta_model.module = cfg.model
        cfg.model = cfg.tta_model

    runner = Runner.from_cfg(cfg)

    runner.test()


if __name__ == '__main__':
    main()

chartgod commented 3 weeks ago

My error:

  File "/home/lsh/miniconda3/envs/mtp/lib/python3.8/site-packages/mmengine/runner/runner.py", line 1839, in call_hook
    getattr(hook, fn_name)(self, **kwargs)
  File "/home/lsh/miniconda3/envs/mtp/lib/python3.8/site-packages/mmengine/hooks/hook.py", line 277, in after_test_iter
    self._after_iter(
  File "/home/lsh/share/mmsatellite/opencd/engine/hooks/visualization_hook.py", line 106, in _after_iter
    self._visualizer.add_datasample(
  File "/home/lsh/miniconda3/envs/mtp/lib/python3.8/site-packages/mmengine/dist/utils.py", line 427, in wrapper
    return func(*args, **kwargs)
  File "/home/lsh/share/mmsatellite/opencd/visualization/cd_local_visualizer.py", line 114, in add_datasample
    pred_img_data = self._draw_sem_seg(pred_img_data,
  File "/home/lsh/miniconda3/envs/mtp/lib/python3.8/site-packages/mmseg/visualization/local_visualizer.py", line 140, in _draw_sem_seg
    mask[sem_seg[0] == label, :] = color
IndexError: boolean index did not match indexed array along dimension 0; dimension is 256 but corresponding boolean dimension is 1024

The error is raised inside mmseg, but I want to use change detection.
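The IndexError reports that a boolean mask built from a 1024-pixel array was used to index a 256-pixel array, so the drawing canvas and the segmentation map have different spatial sizes. A minimal sketch of an early shape check that makes such a mismatch explicit; `check_drawable` is a hypothetical helper (not part of mmseg or Open-CD), and the (H, W, 3) canvas and (1, H, W) map layouts are assumptions based on the `sem_seg[0]` indexing in the traceback:

```python
from typing import Sequence


def check_drawable(canvas_shape: Sequence[int], seg_shape: Sequence[int]) -> None:
    """Raise ValueError if an (H, W, 3) canvas and a (1, H, W) label map differ in H or W."""
    canvas_hw = tuple(canvas_shape[:2])
    seg_hw = tuple(seg_shape[-2:])
    if canvas_hw != seg_hw:
        raise ValueError(
            f'canvas is {canvas_hw} but segmentation map is {seg_hw}; '
            'crop or resize both to the same size before drawing')


check_drawable((1024, 1024, 3), (1, 1024, 1024))  # sizes agree, no error
try:
    check_drawable((256, 256, 3), (1, 1024, 1024))  # the sizes from the traceback
except ValueError as err:
    print(err)
```

Which of the two arrays ends up at 256 depends on the pipeline and hook settings, so the check only locates the mismatch and does not fix it.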

chartgod commented 3 weeks ago

"I have said, please manually clip the dataset to 256*256, as the paper mentioned."

Is there source code in this repository that crops the dataset to 256*256?

chartgod commented 3 weeks ago

Is it the same for training?