HumanSignal / label-studio

Label Studio is a multi-type data labeling and annotation tool with standardized output format
https://labelstud.io
Apache License 2.0
18.96k stars, 2.37k forks

Importing pre annotated data with png masks #6553

Open Helpme-b opened 2 days ago

Helpme-b commented 2 days ago

I am new to Label Studio and I am having issues importing data. I received a pre-annotated dataset from a colleague who uses different licensed software that I do not have access to, so I am trying to use Label Studio instead. They sent the JPG images in an image folder and the masks as PNG files in a mask folder. There are only 2 classes, so each pixel is either the item I want to mark (white pixels) or background (black pixels); each mask PNG is a black image with white regions in it.

To Reproduce

  1. Have files in this path:

C:\Users\User\Project1\dataset\
├── image\
│   ├── 10.JPG
│   ├── 11.JPG
└── mask\
    ├── 10.png
    ├── 11.png

  2. Set the environment variables for Label Studio:

set LABEL_STUDIO_LOCAL_FILES_SERVING_ENABLED=true
set LABEL_STUDIO_LOCAL_FILES_DOCUMENT_ROOT="C:\\Users\\User\\Project1\\"
  3. Set the source location in the project settings:

Storage type: Local files
Local path: C:\Users\User\Project1\dataset
File filter regex: empty
Do not sync

  4. Set the labeling interface:

<View>
  <Image name="image" value="$image"/>

  <BrushLabels name="label" toName="image">
    <Label value="item"/>
    <Label value="background"/>
  </BrushLabels>
</View>
  5. Import the JSON file:
[
  {
    "data": {
      "image": "/data/local-files/?d=dataset/image/10.JPG"
    },
    "predictions": [
      {
        "model_version": "v1",
        "result": [
          {
            "from_name": "label",
            "to_name": "image",
            "type": "brushlabels",
            "value": {
              "format": "png",
              "brushlabels": ["item", "background"],
              "mask": "/data/local-files/?d=dataset/mask/10.png"  
            }
          }
        ]
      }
    ]
  },
  {
    "data": {
      "image": "/data/local-files/?d=dataset/image/11.JPG"
    },
    "predictions": [
      {
        "model_version": "v1",
        "result": [
          {
            "from_name": "label",
            "to_name": "image",
            "type": "brushlabels",
            "value": {
              "format": "png",
              "brushlabels": ["Class1", "Class2"],
              "mask": "/data/local-files/?d=dataset/mask/11.png"
            }
          }
        ]
      }
    ]
  }
]

Expected behavior

I expect to see the premade annotations on the predictions tab, but it is empty: no labels and no masks. I can see the images just fine, so I assume the local storage connection and the path names are fine. Label Studio also creates a prediction model called "v1", but when I look at the prediction for a task it is empty. I just want to upload the premade masks as predictions so that I can copy them to actual annotations and review the dataset my colleague made.

I have also tried converting the PNGs to RLE and to base64, hoping Label Studio would recognise those, but the result is the same: the predictions are empty masks with no labels or marks.

AbubakarSaad commented 1 day ago

Hello,

  1. Mask field in predictions

In your JSON file, the mask field in the prediction points to a file path: "mask": "/data/local-files/?d=dataset/mask/10.png"

However, Label Studio does not support using file paths directly in the mask field. Instead, the prediction value should contain the actual mask data in a supported format. For Run-Length Encoding (RLE), set "format": "rle" and provide the RLE-encoded mask in the rle field.

You can convert your PNG masks to RLE format using the label_studio_converter.brush utility.

Example:

from label_studio_converter import brush
import numpy as np
import cv2
import json
import os

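def mask_to_rle(mask_path):
    # Read the mask as a single-channel grayscale image.
    mask_image = cv2.imread(mask_path, cv2.IMREAD_GRAYSCALE)
    if mask_image is None:
        raise FileNotFoundError(f"Could not read mask: {mask_path}")
    # Binarize: any non-black pixel counts as foreground.
    # Assumption: mask2rle treats the array like a grayscale image,
    # so foreground is scaled to 255 rather than 1.
    mask = (mask_image > 0).astype(np.uint8) * 255
    rle = brush.mask2rle(mask)
    return rle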

data = []

image_dir = "C:\\Users\\User\\Project1\\dataset\\image"
mask_dir = "C:\\Users\\User\\Project1\\dataset\\mask"

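# List the images to build tasks for (the images in the issue are .JPG files).
image_filenames = sorted(
    f for f in os.listdir(image_dir) if f.lower().endswith((".jpg", ".jpeg"))
)

for image_filename in image_filenames: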
    mask_filename = os.path.splitext(image_filename)[0] + '.png'
    image_path = f"/data/local-files/?d=dataset/image/{image_filename}"
    mask_path = os.path.join(mask_dir, mask_filename)
    rle = mask_to_rle(mask_path)

    task = {
        "data": {
            "image": image_path
        },
        "predictions": [
            {
                "model_version": "v1",
                "result": [
                    {
                        "from_name": "label",
                        "to_name": "image",
                        "type": "brushlabels",
                        "value": {
                            "format": "rle",
                            "brushlabels": ["item"],
                            "rle": rle
                        }
                    }
                ]
            }
        ]
    }

    data.append(task)

with open('tasks.json', 'w') as outfile:
    json.dump(data, outfile)
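
Before importing, it may help to sanity-check the task list for the two problems visible in the JSON from the issue: a value that still uses "format": "png" with a mask path, and brushlabels that are not declared in the labeling config (the second task uses "Class1" and "Class2"). The sketch below is only a rough check based on what is described in this thread, not a full validation of Label Studio's import schema; find_prediction_problems and the sample task are hypothetical names made up for illustration.

```python
def find_prediction_problems(tasks, allowed_labels):
    """Return a list of human-readable problems found in brush predictions.

    Assumption (from this thread): a brush result needs "format": "rle"
    plus an "rle" field, and its labels must exist in the labeling config.
    """
    problems = []
    for i, task in enumerate(tasks):
        for prediction in task.get("predictions", []):
            for result in prediction.get("result", []):
                if result.get("type") != "brushlabels":
                    continue
                value = result.get("value", {})
                if value.get("format") != "rle" or "rle" not in value:
                    problems.append(
                        f"task {i}: value needs format 'rle' and an 'rle' field"
                    )
                unknown = [
                    label
                    for label in value.get("brushlabels", [])
                    if label not in allowed_labels
                ]
                if unknown:
                    problems.append(
                        f"task {i}: labels not in the labeling config: {unknown}"
                    )
    return problems


# Sample input mirroring the second task from the issue.
sample = [
    {
        "data": {"image": "/data/local-files/?d=dataset/image/11.JPG"},
        "predictions": [
            {
                "model_version": "v1",
                "result": [
                    {
                        "from_name": "label",
                        "to_name": "image",
                        "type": "brushlabels",
                        "value": {
                            "format": "png",
                            "brushlabels": ["Class1", "Class2"],
                            "mask": "/data/local-files/?d=dataset/mask/11.png",
                        },
                    }
                ],
            }
        ],
    }
]

for problem in find_prediction_problems(sample, {"item", "background"}):
    print(problem)
```

To check the generated file instead of the sample, load tasks.json with json.load and pass the resulting list to the same function, together with the label names from your own config.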