SHI-Labs / OneFormer

OneFormer: One Transformer to Rule Universal Image Segmentation, arxiv 2022 / CVPR 2023
https://praeclarumjj3.github.io/oneformer
MIT License

Extracting the label/mask for a specific category #82

Closed. sparshgarg23 closed this issue 1 year ago

sparshgarg23 commented 1 year ago

I came across an earlier example of background removal that relies on semantic segmentation. However, instead of removing the background, I am more interested in removing the pixels that belong to a specific class. I would also like to use panoptic segmentation to approach this problem. For example, consider the image below.

Let's say we want to white out all pixels belonging to the class grass. How would I go about it? I am thinking of selecting the mask corresponding to the grass label and then subtracting it from the output result's mask.

As the ADE dataset has a larger vocabulary, it was my first choice, and I was glad to discover your panoptic model trained on ADE. I would appreciate any tips on how to extract the mask corresponding to a certain label. The background removal work uses the following code.

import cv2
import numpy as np

def decode_segmap(image, source, nc=21):

  label_colors = np.array([(0, 0, 0),  # 0=background
               # 1=aeroplane, 2=bicycle, 3=bird, 4=boat, 5=bottle
               (128, 0, 0), (0, 128, 0), (128, 128, 0), (0, 0, 128), (128, 0, 128),
               # 6=bus, 7=car, 8=cat, 9=chair, 10=cow
               (0, 128, 128), (128, 128, 128), (64, 0, 0), (192, 0, 0), (64, 128, 0),
               # 11=dining table, 12=dog, 13=horse, 14=motorbike, 15=person
               (192, 128, 0), (64, 0, 128), (192, 0, 128), (64, 128, 128), (192, 128, 128),
               # 16=potted plant, 17=sheep, 18=sofa, 19=train, 20=tv/monitor
               (0, 64, 0), (128, 64, 0), (0, 192, 0), (128, 192, 0), (0, 64, 128)])

  r = np.zeros_like(image).astype(np.uint8)
  g = np.zeros_like(image).astype(np.uint8)
  b = np.zeros_like(image).astype(np.uint8)

  for l in range(0, nc):
    idx = image == l
    r[idx] = label_colors[l, 0]
    g[idx] = label_colors[l, 1]
    b[idx] = label_colors[l, 2]

  rgb = np.stack([r, g, b], axis=2)

  # Load the foreground input image 
  foreground = cv2.imread(source)

  # Change the color of foreground image to RGB 
  # and resize image to match shape of R-band in RGB output map
  foreground = cv2.cvtColor(foreground, cv2.COLOR_BGR2RGB)
  foreground = cv2.resize(foreground,(r.shape[1],r.shape[0]))

  # Create a background array to hold white pixels
  # with the same size as RGB output map
  background = 255 * np.ones_like(rgb).astype(np.uint8)

  # Convert uint8 to float
  foreground = foreground.astype(float)
  background = background.astype(float)

  # Create a binary mask of the RGB output map using the threshold value 0
  th, alpha = cv2.threshold(np.array(rgb),0,255, cv2.THRESH_BINARY)

  # Apply a slight blur to the mask to soften edges
  alpha = cv2.GaussianBlur(alpha, (7,7),0)

  # Normalize the alpha mask to keep intensity between 0 and 1
  alpha = alpha.astype(float)/255

  # Multiply the foreground with the alpha matte
  foreground = cv2.multiply(alpha, foreground)  

  # Multiply the background with ( 1 - alpha )
  background = cv2.multiply(1.0 - alpha, background)  

  # Add the masked foreground and background
  outImage = cv2.add(foreground, background)

  # Return a normalized output image for display
  return outImage/255

The above code takes the segmentation output and uses its mask to replace the background of the foreground image with white. I would like to do the same thing, but instead of setting the entire background to white, I want to set only the pixels belonging to a specific category to white.

sparshgarg23 commented 1 year ago

I feel the above approach may not generalize well. Is there a way to convert the mask to polygons and then extract the polygons for a specific class? I know that in Detectron2 we can convert a mask to polygons, as shown in https://github.com/facebookresearch/detectron2/issues/2245. Note that issue 2245 only deals with extracting polygons for instance segmentation and isn't applicable to panoptic/semantic segmentation. Any help will be appreciated. Thanks.

praeclarumjj3 commented 1 year ago

Hi @sparshgarg23, to extract the binary mask corresponding to a specific category, you can directly apply an equality comparison to the semantic segmentation result: https://github.com/SHI-Labs/OneFormer/blob/4962ef6a96ffb76a76771bfa3e8b3587f209752b/oneformer/oneformer_model.py#L376

specific_category_mask = (sem_seg == category_id).float()
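
As a rough illustration of how that comparison could be used to white out one category, here is a minimal NumPy sketch. It assumes sem_seg is already a per-pixel map of class ids with shape (H, W) (for example after an argmax over the class dimension and a conversion to a NumPy array) and that the image has the same height and width. The helper name white_out_category and the id 9 are placeholders for illustration only; look up the real id for your class in the dataset metadata.

```python
import numpy as np

def white_out_category(image, sem_seg, category_id):
    """Return a copy of `image` with every pixel of `category_id` set to white.

    image:   (H, W, 3) uint8 array.
    sem_seg: (H, W) integer array of per-pixel class ids (assumed input format).
    """
    mask = sem_seg == category_id   # boolean (H, W) mask for the category
    out = image.copy()              # leave the input image untouched
    out[mask] = 255                 # sets all three channels of the masked pixels
    return out, mask

# Toy data: a black 2x3 image and a label map where id 9 is the target class.
image = np.zeros((2, 3, 3), dtype=np.uint8)
sem_seg = np.array([[9, 9, 1],
                    [1, 9, 1]])
result, mask = white_out_category(image, sem_seg, category_id=9)
```

Only the three pixels labelled 9 are changed; everything else keeps its original color.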

If you are working with panoptic results, you can loop through segments_info to collect the ids of all segments belonging to category_id and then obtain the corresponding masks from the panoptic_map output. https://github.com/SHI-Labs/OneFormer/blob/4962ef6a96ffb76a76771bfa3e8b3587f209752b/oneformer/oneformer_model.py#L434
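
The panoptic case could look roughly like the sketch below. It assumes the Detectron2-style output format, where the panoptic map is an (H, W) array of segment ids and each entry of segments_info is a dict with at least "id" and "category_id" keys. The helper name panoptic_category_mask and the toy values are made up for illustration, and the arrays are plain NumPy rather than tensors.

```python
import numpy as np

def panoptic_category_mask(panoptic_map, segments_info, category_id):
    """Boolean (H, W) mask covering every segment whose class is `category_id`."""
    # Collect the segment ids that belong to the requested category.
    ids = [seg["id"] for seg in segments_info if seg["category_id"] == category_id]
    # A pixel is selected if its segment id is one of the collected ids.
    return np.isin(panoptic_map, ids)

# Toy data: segment 1 is one class, segments 2 and 3 are two instances of another.
panoptic_map = np.array([[1, 1, 2],
                         [3, 2, 2]])
segments_info = [
    {"id": 1, "category_id": 9, "isthing": False},
    {"id": 2, "category_id": 12, "isthing": True},
    {"id": 3, "category_id": 12, "isthing": True},
]
mask_12 = panoptic_category_mask(panoptic_map, segments_info, category_id=12)
mask_9 = panoptic_category_mask(panoptic_map, segments_info, category_id=9)
```

The resulting mask merges all instances of the category; to keep instances separate, compare panoptic_map against each collected id individually instead.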

For instance segmentation, you can loop through result.pred_classes and collect the masks at the matching indices (the ones whose class equals the target category_id) from result.pred_masks. https://github.com/SHI-Labs/OneFormer/blob/4962ef6a96ffb76a76771bfa3e8b3587f209752b/oneformer/oneformer_model.py#L486
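
A possible sketch of the instance case follows. It assumes the predicted classes are available as an (N,) array and the predicted masks as an (N, H, W) array, both converted to NumPy beforehand. The helper name instance_category_masks and the toy values are hypothetical.

```python
import numpy as np

def instance_category_masks(pred_classes, pred_masks, category_id):
    """Select the instance masks of one category and their union.

    pred_classes: (N,) integer array of predicted class ids.
    pred_masks:   (N, H, W) array of binary masks, one per instance.
    """
    keep = np.flatnonzero(pred_classes == category_id)  # matching instance indices
    selected = pred_masks[keep].astype(bool)            # (K, H, W) per-instance masks
    union = selected.any(axis=0)                        # (H, W) mask over all matches
    return selected, union

# Toy data: three instances, two of which belong to class 3.
pred_classes = np.array([3, 7, 3])
pred_masks = np.array([
    [[1, 0], [0, 0]],
    [[0, 1], [0, 0]],
    [[0, 0], [0, 1]],
])
selected, union = instance_category_masks(pred_classes, pred_masks, category_id=3)
```

In practice you may also want to filter by prediction score before taking the union, since low-confidence instances can add stray pixels to the mask.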