PaddlePaddle / PaddleSeg

Easy-to-use image segmentation library with awesome pre-trained model zoo, supporting wide-range of practical tasks in Semantic Segmentation, Interactive Segmentation, Panoptic Segmentation, Image Matting, 3D Segmentation, etc.
https://arxiv.org/abs/2101.06175
Apache License 2.0
8.6k stars 1.68k forks

Multi-label semantic segmentation task #2174

Closed Wulx2050 closed 2 years ago

Wulx2050 commented 2 years ago

The tasks and models in PaddleSeg all seem to assume one label per pixel, but sometimes a single pixel has several labels, for example here: https://www.kaggle.com/competitions/uw-madison-gi-tract-image-segmentation/overview

This competition is a multi-label semantic segmentation task: some pixels carry two or more labels at the same time, and the output has shape Height×Width×Num_class, with Num_class >= 2.
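
As a minimal illustration of that output layout (a made-up 4×4 example, not competition data), a multi-label target can be stored as one binary channel per class, so that overlapping regions simply have more than one channel set. The array names and sizes below are assumptions chosen for illustration only.

```python
import numpy as np

# Hypothetical 4x4 image with two classes whose regions overlap.
height, width = 4, 4

class_a = np.zeros((height, width), dtype=np.uint8)
class_a[0:3, 0:3] = 1   # top-left 3x3 block

class_b = np.zeros((height, width), dtype=np.uint8)
class_b[2:4, 2:4] = 1   # bottom-right 2x2 block

# Multi-label target: one binary channel per class, shape (H, W, Num_class).
target = np.stack([class_a, class_b], axis=-1)

# Pixels that carry more than one label at once.
overlap = target.sum(axis=-1) > 1

print(target.shape)          # (4, 4, 2)
print(np.argwhere(overlap))  # [[2 2]]
```

A single-label index map of shape (H, W) cannot represent pixel (2, 2) here, because it would have to pick one of the two classes.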

I could not find any related tutorials or algorithms in PaddleSeg. How should I handle this kind of multi-label semantic segmentation task?

Wulx2050 commented 2 years ago

Also, many datasets now ship their annotation files in RLE (run-length encoding) format. For convenience, could you please add an encoding function and a decoding function to PaddleSeg?

import numpy as np

# ref: https://www.kaggle.com/paulorzp/run-length-encode-and-decode
# modified from: https://www.kaggle.com/inversion/run-length-decoding-quick-start
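def rle_decode(mask_rle, shape, color=1):
    """Decode a run-length-encoded string into a mask (row-major order).

    Args:
        mask_rle (str): space-separated "start length" pairs; starts are 1-indexed
        shape (tuple of ints): (height, width) or (height, width, depth) of array to return
        color (int or float): value written to the masked pixels (default 1)

    Returns:
        Mask (np.array, float32) of the given shape
            - color indicating mask
            - 0 indicating background

    """
    # Split the string by whitespace, then convert it into an integer array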
    s = np.array(mask_rle.split(), dtype=int)

    # Every even value is the start, every odd value is the "run" length
    starts = s[0::2] - 1
    lengths = s[1::2]
    ends = starts + lengths

    # The image is flattened to h * w pixels since RLE is a 1D "run"
    if len(shape)==3:
        h, w, d = shape
        img = np.zeros((h * w, d), dtype=np.float32)
    else:
        h, w = shape
        img = np.zeros((h * w,), dtype=np.float32)

    # The color here is actually just any integer you want!
    for lo, hi in zip(starts, ends):
        img[lo : hi] = color

    # Don't forget to change the image back to the original shape
    return img.reshape(shape)

# https://www.kaggle.com/namgalielei/which-reshape-is-used-in-rle
def rle_decode_top_to_bot_first(mask_rle, shape):
    """Decode a run-length-encoded string whose pixels are numbered
    top to bottom first, then left to right (column-major order).

    Args:
        mask_rle (str): space-separated "start length" pairs; starts are 1-indexed
        shape (tuple of ints): (height, width) of array to return

    Returns:
        Mask (np.array, uint8) of shape (height, width)
            - 1 indicating mask
            - 0 indicating background

    """
    s = mask_rle.split()
    starts, lengths = [np.asarray(x, dtype=int) for x in (s[0:][::2], s[1:][::2])]
    starts -= 1
    ends = starts + lengths
    img = np.zeros(shape[0]*shape[1], dtype=np.uint8)
    for lo, hi in zip(starts, ends):
        img[lo:hi] = 1
    # Fortran order fills each column top to bottom before moving to the next column
    return img.reshape((shape[0], shape[1]), order='F')

# ref.: https://www.kaggle.com/stainsby/fast-tested-rle
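def rle_encode(img):
    """Encode a binary mask as a run-length string (row-major order).

    Args:
        img (np.array):
            - 1 indicating mask
            - 0 indicating background

    Returns:
        run lengths as a space-separated string of "start length" pairs, with 1-indexed starts
    """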

    pixels = img.flatten()
    pixels = np.concatenate([[0], pixels, [0]])
    runs = np.where(pixels[1:] != pixels[:-1])[0] + 1
    runs[1::2] -= runs[::2]
    return ' '.join(str(x) for x in runs)
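
A quick round-trip check can make the pixel-ordering question concrete. The sketch below restates compact versions of the encoder and decoder so that it runs on its own; the names `encode_mask` and `decode_mask` and the `order` parameter are hypothetical, introduced only for this illustration.

```python
import numpy as np

def encode_mask(mask):
    # Row-major flatten, padded with zeros so every run has a start and an end.
    pixels = np.concatenate([[0], mask.flatten(), [0]])
    runs = np.where(pixels[1:] != pixels[:-1])[0] + 1
    runs[1::2] -= runs[::2]          # turn run ends into run lengths
    return ' '.join(str(x) for x in runs)

def decode_mask(mask_rle, shape, order='C'):
    # order='C' reads runs left to right first; order='F' top to bottom first.
    s = np.array(mask_rle.split(), dtype=int)
    starts, lengths = s[0::2] - 1, s[1::2]
    flat = np.zeros(shape[0] * shape[1], dtype=np.uint8)
    for lo, hi in zip(starts, starts + lengths):
        flat[lo:hi] = 1
    return flat.reshape(shape, order=order)

mask = np.array([[0, 1, 1],
                 [0, 0, 1]], dtype=np.uint8)

rle = encode_mask(mask)
print(rle)                                               # 2 2 6 1
print(np.array_equal(decode_mask(rle, mask.shape), mask))  # True
print(decode_mask(rle, mask.shape, order='F'))
# [[0 1 0]
#  [1 0 1]]
```

Decoding with the wrong order still returns a mask of the right shape, just a scrambled one, so it is worth checking a dataset's documentation for its pixel numbering before picking a decoder.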

juncaipeng commented 2 years ago

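Hello, PaddleSeg does not currently support multi-label semantic segmentation. To use it, you would need to manually modify the model structure, the loss computation, and the training and prediction procedures.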
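
To give a rough idea of what such changes usually involve, a common approach (an assumption here, not something PaddleSeg provides or this thread prescribes) is to replace the softmax/argmax over classes with an independent sigmoid per class channel, train with binary cross-entropy, and threshold each channel at prediction time. The sketch below uses plain NumPy rather than any PaddleSeg API, and all function names in it are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def multilabel_bce(logits, target, eps=1e-7):
    """Mean binary cross-entropy over every pixel and class channel."""
    prob = np.clip(sigmoid(logits), eps, 1.0 - eps)
    loss = -(target * np.log(prob) + (1.0 - target) * np.log(1.0 - prob))
    return loss.mean()

def predict_multilabel(logits, threshold=0.5):
    """Threshold each class independently; a pixel may keep several labels."""
    return (sigmoid(logits) > threshold).astype(np.uint8)

# Hypothetical logits for a 1x2 image with 3 classes, layout (H, W, Num_class).
logits = np.array([[[4.0, 3.0, -5.0],
                    [-2.0, -3.0, -4.0]]])
target = np.array([[[1.0, 1.0, 0.0],
                    [0.0, 0.0, 0.0]]])

pred = predict_multilabel(logits)
print(pred[0, 0])              # [1 1 0]  two labels on the first pixel
print(pred[0, 1])              # [0 0 0]  no label on the second pixel
print(logits.argmax(axis=-1))  # [[0 0]]  single-label argmax for comparison
loss = multilabel_bce(logits, target)
```

The argmax line shows what the single-label pipeline would output for the same logits: exactly one class per pixel, which loses the second label on the first pixel and forces a label onto the second.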

alexhmyang commented 1 year ago

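It has been a year and multi-label segmentation is still not supported? EISeg can annotate multiple labels for segmentation classes. I finished labeling, and then PaddleSeg raises an error saying it is not supported?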

MINGtoMING commented 1 year ago

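@alexhmyang Hello, I am currently working on multi-label semantic segmentation support for PaddleSeg. In a multi-label semantic segmentation task, a pixel in the image can belong to several classes at once (in ordinary semantic segmentation it can point to only one class), so different objects in the image may overlap. However, EISeg currently does not support overlap between different objects. Could you describe how you annotated the overlapping regions, and what format your finished annotations are in? With that information I can provide a more convenient data-loading interface. Thanks!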