open-mmlab / mmsegmentation

OpenMMLab Semantic Segmentation Toolbox and Benchmark.
https://mmsegmentation.readthedocs.io/en/main/
Apache License 2.0

questions about "crop_size" and "resize" in dataset config #1983

Closed songhc8 closed 2 years ago

songhc8 commented 2 years ago

hi,

I am confused about the parameters "crop_size" and "resize" in the data config. "crop_size" controls the input size of the model, is that right? What are "img_scale" and "ratio_range" in "Resize" used for? The images in my dataset have various sizes, so how should I set "img_scale"?

crop_size = (512, 512)
train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='LoadAnnotations', reduce_zero_label=True),
    dict(type='Resize', img_scale=(2048, 512), ratio_range=(0.5, 2.0)),
    dict(type='RandomCrop', crop_size=crop_size, cat_max_ratio=0.75),
    dict(type='RandomFlip', prob=0.5),
    dict(type='PhotoMetricDistortion'),
    dict(type='Normalize', **img_norm_cfg),
    dict(type='Pad', size=crop_size, pad_val=0, seg_pad_val=255),
    dict(type='DefaultFormatBundle'),
    dict(type='Collect', keys=['img', 'gt_semantic_seg']),
]

songhc8 commented 2 years ago

If "img_scale" in "Resize" is used to convert the images to an identical size, what does "ratio_range" control? For example, with "img_scale=(2048, 512)", all images will first be resized to 512*2048 and then cropped to "crop_size". What does "ratio_range" do in that case?

MengzhangLI commented 2 years ago

(1) img_scale is the image scale used for resizing. If ratio_range is specified, a ratio is sampled from that range and multiplied with img_scale. Resize has 4 modes; you can find their detailed implementation here.
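To make point (1) concrete, here is a minimal pure-Python sketch of how a ratio sampled from ratio_range could turn img_scale into a target size. The helper names `sample_scale` and `rescaled_size` are hypothetical and are not part of the library's API. The sketch assumes uniform sampling and an aspect-ratio-preserving resize in which the long edge of the scale bounds the long edge of the image and the short edge bounds the short edge; check the Resize implementation for the exact behavior.

```python
import random


def sample_scale(img_scale, ratio_range, rng=random):
    """Sample one ratio from ratio_range and multiply it with img_scale.

    Hypothetical helper: assumes the ratio is drawn uniformly.
    """
    min_ratio, max_ratio = ratio_range
    ratio = rng.uniform(min_ratio, max_ratio)
    return int(img_scale[0] * ratio), int(img_scale[1] * ratio)


def rescaled_size(width, height, scale):
    """Return the (width, height) an image would get when fitted into scale.

    Assumption: the aspect ratio is kept, so the image is shrunk or enlarged
    by the largest factor that keeps it within both edges of the scale.
    """
    long_edge, short_edge = max(scale), min(scale)
    factor = min(long_edge / max(width, height),
                 short_edge / min(width, height))
    return int(width * factor + 0.5), int(height * factor + 0.5)


if __name__ == "__main__":
    # With img_scale=(2048, 512) and ratio_range=(0.5, 2.0), the sampled
    # scale lies between (1024, 256) and (4096, 1024).
    scale = sample_scale((2048, 512), (0.5, 2.0))
    print("sampled scale:", scale)
    print("resized 1024x768 image:", rescaled_size(1024, 768, scale))
```

Under these assumptions, images of different sizes do not all end up at exactly 512*2048: each image keeps its aspect ratio and is fitted into the sampled scale. For example, `rescaled_size(1024, 768, (2048, 512))` gives `(683, 512)`.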

(2) crop_size is used in the RandomCrop data transform, which is usually applied after the Resize transform.
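As a rough illustration of point (2), the sketch below tracks only the image shape through RandomCrop and Pad as configured in the pipeline from the question. The helper names `random_crop_box` and `padded_shape` are hypothetical, sizes are assumed to be given as (height, width), and the cat_max_ratio check of RandomCrop is left out.

```python
import random


def random_crop_box(height, width, crop_size, rng=random):
    """Pick a random crop window of at most crop_size inside the image.

    Hypothetical helper: returns (top, bottom, left, right). If the image is
    smaller than crop_size along an axis, the whole axis is kept.
    """
    crop_h, crop_w = crop_size
    margin_h = max(height - crop_h, 0)
    margin_w = max(width - crop_w, 0)
    top = rng.randint(0, margin_h)    # randint includes both endpoints
    left = rng.randint(0, margin_w)
    bottom = min(top + crop_h, height)
    right = min(left + crop_w, width)
    return top, bottom, left, right


def padded_shape(height, width, size):
    """Shape after padding the image up to at least size (height, width)."""
    return max(height, size[0]), max(width, size[1])


if __name__ == "__main__":
    crop_size = (512, 512)
    # A resized image that is shorter than the crop height.
    top, bottom, left, right = random_crop_box(256, 1024, crop_size)
    cropped = (bottom - top, right - left)
    print("after crop:", cropped)
    print("after pad:", padded_shape(*cropped, crop_size))
```

In this sketch, an image that is at least as large as crop_size after Resize is cropped to exactly crop_size, and a smaller one is padded back up to crop_size, so with this pipeline the training input ends up at crop_size either way.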