Closed: imemmul closed this issue 1 year ago.
Hi, for a binary segmentation task, this doc may help you.
I actually used this documentation when setting up my configs. I tried cross entropy with the sigmoid option, with num_classes = 2 and out_channels = 1. What's wrong?
OK, so what happens if you set num_classes=2, out_channels=2 and use_sigmoid=False in CrossEntropyLoss?
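For reference, a minimal sketch of that suggestion as a config fragment (only the three mentioned fields come from the suggestion; everything else is illustrative):

```python
# Fragment of a model config with the suggested head settings
# (fields other than num_classes, out_channels, use_sigmoid are illustrative).
model = dict(
    decode_head=dict(
        num_classes=2,   # two categories: background and foreground
        out_channels=2,  # one logit per class
        loss_decode=dict(
            type='CrossEntropyLoss',
            use_sigmoid=False,  # plain softmax cross entropy over the two classes
            loss_weight=1.0)))
```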
In my opinion, the phenomenon you describe (the loss decreasing dramatically while the model fails to learn) is caused by the imbalanced ratio of foreground and background.
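If imbalance is indeed the cause, one common mitigation in mmseg configs is a per-class weight on the cross entropy loss. A hedged sketch (the 0.5/2.0 weights are placeholders, not tuned values):

```python
# Fragment of a decode_head config: up-weight the rare foreground class.
loss_decode = dict(
    type='CrossEntropyLoss',
    use_sigmoid=False,
    class_weight=[0.5, 2.0],  # placeholder weights for [background, foreground]
    loss_weight=1.0)
```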
Hi, I face the same issue with Segmenter binary segmentation trained on a custom binary dataset, though my Segmenter config is not exactly the same.
After running inference I get an image entirely filled with pixels of value 1, 1 being my foreground class. Even using the validation set for inference I get the same result.
My custom dataset is quite balanced: around 30% foreground if I sum over all images. My annotation images are not RGB but grey-scale; I followed the corresponding note in the custom dataset documentation:
The annotations are images of shape (H, W), the value pixel should fall in range [0, num_classes - 1]. You may use 'P' mode of [pillow](https://pillow.readthedocs.io/en/stable/handbook/concepts.html#palette) to create your annotation image with color.
During training my loss is also decreasing dramatically (see my previously raised issue).
I will try a new training after changing my config according to the last comment of this post.
I think it is improper to use ignore_index = 0, as this option is for pixels whose prediction we don't care about: the loss function will not be computed at those pixels. However, you need to identify whether a pixel is background (0) or eddy (1).
In mmseg, the default value of ignore_index is 255, which is far from the category indices of common dataset annotations. The 0 index in your dataset is a category index that you cannot ignore, since, as mentioned above, you need to distinguish pixels that are 0 (background) from pixels that are 1 (eddies).
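To illustrate how ignore_index behaves, here is a toy sketch in plain PyTorch (not mmseg internals):

```python
import torch
import torch.nn.functional as F

logits = torch.randn(1, 2, 4, 4)         # (N, C, H, W), two classes
target = torch.randint(0, 2, (1, 4, 4))  # labels in {0, 1}
target[0, 0, 0] = 255                    # pretend this pixel is unlabeled

# Pixels whose label equals ignore_index contribute nothing to the loss.
loss = F.cross_entropy(logits, target, ignore_index=255)
```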
Hi, thank you for your help.
In my case I didn't use ignore_index = 0; my main error was to set reduce_zero_label=True for the binary segmentation task (I didn't change it from my copy of configs/segmenter/segmenter_vit-l_mask_8x1_640x640_160k_ade20k.py, which I used to create my custom config).
I am now using this config file (following MengzhangLI's last comment and correcting reduce_zero_label):
```python
_base_ = [
    '../_base_/models/segmenter_vit-b16_mask.py',
    '../_base_/datasets/BBLdataset.py', '../_base_/default_runtime.py',
    '../_base_/schedules/schedule_BBL_160k.py'
]
checkpoint = 'https://download.openmmlab.com/mmsegmentation/v0.5/pretrain/segmenter/vit_large_p16_384_20220308-d4efb41d.pth'  # noqa
model = dict(
    pretrained=checkpoint,
    backbone=dict(
        type='VisionTransformer',
        img_size=(640, 640),
        embed_dims=1024,
        num_layers=24,
        num_heads=16),
    decode_head=dict(
        type='SegmenterMaskTransformerHead',
        in_channels=1024,
        channels=1024,
        num_classes=2,
        out_channels=2,
        num_heads=16,
        dropout_ratio=0.0,
        embed_dims=1024,
        loss_decode=dict(
            type='CrossEntropyLoss',
            use_sigmoid=False,
            loss_weight=1.0,
            avg_non_ignore=True)),
    test_cfg=dict(mode='slide', crop_size=(640, 640), stride=(608, 608)))
optimizer = dict(lr=0.001, weight_decay=0.0)
# This img_norm_cfg is widely used because it is the mean and std of the
# ImageNet-1K pretrained model.
img_norm_cfg = dict(
    mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
crop_size = (640, 640)
train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='LoadAnnotations', reduce_zero_label=False),
    dict(type='Resize', img_scale=(2560, 640), ratio_range=(0.5, 2.0)),
    dict(type='RandomCrop', crop_size=crop_size, cat_max_ratio=0.75),
    dict(type='RandomFlip', prob=0.5),
    dict(type='PhotoMetricDistortion'),
    dict(type='Normalize', **img_norm_cfg),
    dict(type='Pad', size=crop_size, pad_val=0, seg_pad_val=255),
    dict(type='DefaultFormatBundle'),
    dict(type='Collect', keys=['img', 'gt_semantic_seg']),
]
val_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(
        type='MultiScaleFlipAug',
        img_scale=(2560, 640),
        flip=False,
        transforms=[
            dict(type='Resize', keep_ratio=True),
            dict(type='RandomFlip'),
            dict(type='Normalize', **img_norm_cfg),
            dict(type='ImageToTensor', keys=['img']),
            dict(type='Collect', keys=['img']),
        ])
]
test_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(
        type='MultiScaleFlipAug',
        img_scale=(2560, 640),
        flip=False,
        transforms=[
            dict(type='Resize', keep_ratio=True),
            dict(type='RandomFlip'),
            dict(type='Normalize', **img_norm_cfg),
            dict(type='ImageToTensor', keys=['img']),
            dict(type='Collect', keys=['img']),
        ]),
]
data = dict(
    # num_gpus: 8 -> batch_size: 8
    samples_per_gpu=1,
    train=dict(pipeline=train_pipeline),
    val=dict(pipeline=val_pipeline),
    test=dict(pipeline=test_pipeline))
```
Running inference on the validation dataset (I don't have a test dataset yet) from the checkpoint obtained during training, I end up with a binary segmentation as output!
The only oddity is that my predicted background class now has pixel value 1 (instead of 0 in the training dataset) and my predicted foreground class has pixel value 2 (instead of 1 in the training dataset). It is not a problem, I am just wondering what the reason is.
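In case anyone needs the predictions back in the original 0/1 convention, a trivial post-processing sketch (the file name is hypothetical, and I assume the saved map only contains the values 1 and 2 described above):

```python
import numpy as np
from PIL import Image

pred = np.array(Image.open('pred.png'))  # hypothetical prediction file with values {1, 2}
pred = pred - 1                          # shift back to {0: background, 1: foreground}
Image.fromarray(pred.astype(np.uint8)).save('pred_remapped.png')
```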
Maybe the Binary Segmentation and reduce_zero_label parts of the FAQ could be included in the main documentation?
I will run inference on a test dataset once I have it, and I will comment here if there is any resulting issue, but I don't expect one.
Thank you for your help
Hi @lucas-sancere, my reply was for the issue; I didn't know what your config was when I replied.
Moreover, whether to set reduce_zero_label=True depends on the dataset and the task. If the dataset has 3 categories (0, 1, 2), the task is binary segmentation, and the pixels with label 0 should be ignored, just set reduce_zero_label=True; the pixels with label 0 will then be ignored, and we only classify the pixels with labels 1 and 2.
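To make the remapping concrete, this is roughly what reduce_zero_label does to a label map (a numpy sketch of the behavior, not mmseg's actual implementation):

```python
import numpy as np

gt = np.array([[0, 1, 2],
               [2, 1, 0]])         # three raw categories

# reduce_zero_label=True: 0 becomes the ignore index, the rest shift down by one.
reduced = gt.astype(np.int64) - 1  # 0 -> -1, 1 -> 0, 2 -> 1
reduced[reduced == -1] = 255       # -1 -> 255, i.e. ignored by the loss
```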
Hi Lucas, how many classes are in your BBLdataset? For example, the binary segmentation dataset DRIVE has two types: background and vessel.
It is weird that your predicted foreground class is set with pixel value 2, because if you only have two categories, only the values 0 and 1 are defined. Can you show more details about your BBLdataset?
Best,
Hi @MeowZheng, I knew your response was about the issue itself and not my comment (that is why I wrote "In my case"). However, I think it is relevant to comment here even if my config file is slightly different, because I encountered the exact same issue as @imemmul.
Thank you for adding explanations about reduce_zero_label=True! In my case I had only 0 and 1 as values, so I was discarding the 0s, which is not correct.
Hi @MengzhangLI,
My dataset has 2 categories, tumor regions and normal tissue. The normal tissue class is my background class, filled with pixels of value 0, whereas the tumor regions class is my foreground class, filled with pixels of value 1. It is the exact same format as DRIVE in this regard.
My input data are RGB images in TIFF format, and my annotation data are one-channel grey-level PNG files filled only with 0s and 1s.
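For anyone who wants to double-check that their annotations match this format, a quick sanity-check sketch (the path is hypothetical):

```python
import numpy as np
from PIL import Image

ann = np.array(Image.open('annotations/sample.png'))  # hypothetical annotation file
assert ann.ndim == 2, 'annotation should be a single-channel (H, W) image'
print('unique label values:', np.unique(ann))         # expect [0 1] for binary segmentation
```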
@lucas-sancere Did you solve your issue? I am facing the same issue. My dataset has 2 classes ('background', 'building'); the images are RGB .jpg files, while the labels are .png files where 0 is background and 128 is the object.
Hi @aymanaboghonim, I am not sure you have the same issue as me. Is the dataset you are talking about your training dataset?
Following the documentation:
The annotations are images of shape (H, W), the value pixel should fall in range [0, num_classes - 1]
your training dataset should contain pixels of value 1 for your building class, not 128 (see the remapping sketch below).
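A minimal sketch of that remapping, assuming a hypothetical label file where the building class was saved as 128:

```python
import numpy as np
from PIL import Image

lbl = np.array(Image.open('labels/tile_0.png'))  # hypothetical label file
lbl[lbl == 128] = 1                              # remap building: 128 -> 1
Image.fromarray(lbl.astype(np.uint8)).save('labels/tile_0.png')
```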
If you are talking about the pixel values of the inference output: no, I didn't solve that issue. I still get different pixel values in the prediction (1 for background and 2 for foreground) than in my training set (0 for background and 1 for foreground), but of course it is not a real problem.
@lucas-sancere Thanks for your reply. Sorry, the labels are in range [0, 1], not [0, 128]. Could you please help me solve my issue? Here is the class I use to register my custom dataset:

```python
import os.path as osp

# Imports needed for registration (module paths as in mmsegmentation 0.x).
from mmseg.datasets.builder import DATASETS
from mmseg.datasets.custom import CustomDataset

classes = ('background', 'building')
palette = [[225, 228, 128], [50, 50, 50]]


@DATASETS.register_module()
class BuildingSegmentation(CustomDataset):
    CLASSES = classes
    PALETTE = palette

    def __init__(self, split, **kwargs):
        super().__init__(img_suffix='.jpg', seg_map_suffix='.png',
                         split=split, ignore_index=255, **kwargs)
        assert osp.exists(self.img_dir) and self.split is not None
```

I set num_class = 2 and reduce_zero_label = False. Is there anything else needed to handle my binary segmentation task? Here is a sample of my labels.
The background metrics are good, but the object (building) metrics are very poor.
Hi @aymanaboghonim,
Sorry, I cannot fully help, as I am just a new user of mmsegmentation and didn't take part in its development.
To me, "set num_class = 2 and reduce_zero_label = False" looks fine.
Your data sample looks fine too. I guess there is a specific LUT in the plot that makes the binary image red and black; if not, maybe the foreground pixel value is not 1.
Your training works and your issue is with the metrics, so I would say the problem probably comes from the training data (maybe imbalance, some images with issues, bad annotations...). I hope you have solved your problem since your last message.
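One quick way to test the imbalance hypothesis is to measure the foreground ratio over the annotation files (a sketch with a hypothetical directory layout):

```python
import glob

import numpy as np
from PIL import Image

fg, total = 0, 0
for path in glob.glob('annotations/*.png'):  # hypothetical annotation directory
    ann = np.array(Image.open(path))
    fg += int((ann == 1).sum())              # foreground pixels (label 1)
    total += ann.size
print(f'foreground ratio: {fg / total:.3f}')
```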
Hello everyone, I am using Segmenter on my custom dataset. My dataset consists of 10k eddy velocities in MATLAB files, with binary ground-truth PNG images. I loaded these files, converted them into shape 3x256x256, and fed them to Segmenter. I don't think there is a problem with the dataset itself, because it works with a U-Net, but with Segmenter I got no results. Even though the loss is decreasing dramatically, I can see that instead of learning, the model collapses to the label of the class to be learned: as a prediction I am getting a blank white image, with a loss of "decode.loss_ce": 0.00986, but acc_seg is around 4%. Below you can see an example from my dataset and my config.
In addition, I am using ignore_index = 0, so I have 0 for background and 1 for eddies. Is it okay to use ignore_index = 0? Does it matter whether it's 0 or 255?
Below you can see what it looks like when I load the MATLAB file and normalize it (the values can be negative, and I don't think the model should have a problem with negatives), together with its ground truth. This is Segmenter's config. I tried num_classes = 1 and num_classes = 2 and got no results; the other parameters are default.
Thank you for your help!!