HumanSignal / label-studio-converter

Tools for converting Label Studio annotations into common dataset formats
https://labelstud.io/

Guidance needed on converting from Coco to LS #207

Closed opyate closed 1 year ago

opyate commented 1 year ago

Hello,

I've a typical Coco export:

./result.json
./images
./images/page_5.png
./images/page_9.png
./images/page_7.png
./images/page_10.png
./images/page_4.png
./images/page_2.png
./images/page_6.png
./images/page_8.png
./images/page_3.png

I'm not sure which command-line parameters make this work, though.

COCO=/path/to/coco/export
JSON_FILE=$COCO/result.json
IMAGES=$COCO/images

python label_studio_converter/cli.py \
    --input $JSON_FILE \
    --config config.xml \
    --image-dir=$IMAGES \
    --output /tmp/output.json \
    --format COCO

--config does not seem to be documented in the README. I tried the XML in the screenshot below as the config (as per this comment):

image

...and it outputs this

$ ...
Congratulations! Now check:
/tmp/output.json

...but it's a directory:

$ file /tmp/output.json
/tmp/output.json: directory

...with these contents:

$ find /tmp/output.json/
/tmp/output.json/
/tmp/output.json/result.json

...and the result seems to be a reduced version of my Coco result.json (without the images/annotations):

$ cat /tmp/output.json/result.json
{
  "images": [],
  "categories": [
    {
      "id": 0,
      "name": "Section"
    },
    {
      "id": 1,
      "name": "Sub-section"
    }
  ],
  "annotations": [],
  "info": {
    "year": 2023,
    "version": "1.0",
    "description": "",
    "contributor": "Label Studio",
    "url": "",
    "date_created": "2023-03-16 15:37:57.226663"
  }
}

I'm not sure what I'm doing wrong. I have a hunch it's the config.

Thanks!

makseq commented 1 year ago

@opyate

  1. --output should be a directory.
  2. Are you sure that the converter script sees your images? If it can't find the images, no annotations will be added to the output JSON file.
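
One way to rule out the missing-images case is to compare the file names listed in the COCO JSON against the image directory. The sketch below is only a sanity check under assumptions: `missing_images` is a hypothetical helper, not part of label-studio-converter, and it compares base names only, which may differ from how the converter itself resolves paths.

```python
import json
from pathlib import Path


def missing_images(coco_json, images_dir):
    """Return base names listed in the COCO JSON that are absent from images_dir."""
    with open(coco_json, encoding="utf-8") as f:
        coco = json.load(f)
    images_dir = Path(images_dir)
    missing = []
    for image in coco.get("images", []):
        # "file_name" may carry a relative path; only the base name is compared here.
        name = Path(image["file_name"]).name
        if not (images_dir / name).is_file():
            missing.append(name)
    return missing
```

For the layout in the first post, this would be called as `missing_images("/path/to/coco/export/result.json", "/path/to/coco/export/images")`; an empty list means every listed image was found.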

opyate commented 1 year ago

--input should be a directory also. I'm getting further, but the code raises an error on this line where it tries to get the result attribute from an annotation.

It doesn't seem as if result is part of annotation according to their schema?

Neither are completed_by, created_at, updated_at, lead_time.
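
The missing `result` attribute suggests the script expected a Label Studio export rather than a COCO file. As a rough illustration of the difference between the two shapes, here is a small sketch; `guess_format` is a hypothetical helper and the heuristics are assumptions based on the fields mentioned above (COCO has top-level `images`, `annotations`, `categories`; a Label Studio JSON export is a list of tasks whose annotations carry `result`).

```python
def guess_format(data):
    """Guess whether parsed JSON looks like COCO or a Label Studio export."""
    # COCO: one object with top-level "images", "annotations", "categories".
    if isinstance(data, dict) and {"images", "annotations", "categories"} <= set(data):
        return "coco"
    # Label Studio JSON export: a list of tasks whose annotations carry "result".
    if isinstance(data, list):
        for task in data:
            if not isinstance(task, dict):
                continue
            for annotation in task.get("annotations", []):
                if isinstance(annotation, dict) and "result" in annotation:
                    return "label-studio"
    return "unknown"
```

Running this on the parsed `result.json` from a COCO export should report `"coco"`, which is a hint that the file needs the COCO => LS import path rather than the LS => COCO export path.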

makseq commented 1 year ago

Sorry for the late answer. I've just noticed that you are converting from LS to COCO instead of COCO => LS.

Check this message: https://github.com/heartexlabs/label-studio/issues/2806#issuecomment-1210000572

and this YOLO => LS guide: https://github.com/heartexlabs/label-studio-converter#yolo-to-label-studio-converter

COCO should be similar.

opyate commented 1 year ago

For anyone who stumbles upon this in the future:

Given:

COCO=/path/to/coco/export
JSON_FILE=$COCO/result.json
IMAGES=$COCO/images

Conversion:

git clone https://github.com/heartexlabs/label-studio-converter.git
cd label-studio-converter
# Quote the arguments so the shell does not treat "?" as a glob character
label-studio-converter import coco \
        -i "$JSON_FILE" \
        --image-root-url="/data/local-files/?d=images" \
        -o output.json
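
To give a feel for what the import produces, here is a minimal sketch of the two transformations involved. It is an illustration under assumptions, not the converter's actual code: `coco_bbox_to_ls` and `image_url` are hypothetical names, the example assumes `file_name` is a bare file name, and it relies on the fact that COCO boxes are `[x, y, width, height]` in pixels while Label Studio rectangle regions use percentages of the image size.

```python
def coco_bbox_to_ls(bbox, image_width, image_height):
    """Convert a COCO pixel box [x, y, w, h] to percentage-based values."""
    x, y, w, h = bbox
    return {
        "x": 100.0 * x / image_width,
        "y": 100.0 * y / image_height,
        "width": 100.0 * w / image_width,
        "height": 100.0 * h / image_height,
    }


def image_url(root_url, file_name):
    """Join the image root URL and a file name with exactly one slash."""
    if not root_url.endswith("/"):
        root_url += "/"
    return root_url + file_name
```

With the root URL from the command above, `image_url("/data/local-files/?d=images", "page_5.png")` gives `/data/local-files/?d=images/page_5.png`, which is the kind of path Label Studio's local file serving expects.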

Run label studio:

LABEL_STUDIO_LOCAL_FILES_SERVING_ENABLED=true \
LABEL_STUDIO_LOCAL_FILES_DOCUMENT_ROOT=$COCO \
label-studio

Follow the steps spat out by the conversion IN ORDER (repeated here):

  1. Create a new project in Label Studio
  2. Use Labeling Config from "/path/to/your/local/git/clone/of/label-studio-converter/output.label_config.xml"
  3. Setup serving for images [e.g. you can use Local Storage (or others): https://labelstud.io/guide/storage.html#Local-storage]
  4. Import "/path/to/your/local/git/clone/of/label-studio-converter/output.json" to the project

Step 3 looks like this:

image

What is IMPORTANT is the relationship between LABEL_STUDIO_LOCAL_FILES_DOCUMENT_ROOT=$COCO and the "Absolute local path" field in the screenshot above: the latter has to be a sub-directory of the former, as per this cryptic bit of documentation:

image

"start from the first directory" is just the relationship I described a moment ago.

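The sub-directory rule can be checked before touching the UI. The sketch below is an assumption-laden illustration: `local_files_url` is a hypothetical helper, and it assumes POSIX-style paths and the `/data/local-files/?d=` prefix used in the conversion command earlier in this thread.

```python
from pathlib import PurePosixPath


def local_files_url(document_root, absolute_path):
    """Build the ?d= URL for a path that must sit below document_root."""
    root = PurePosixPath(document_root)
    path = PurePosixPath(absolute_path)
    # relative_to raises ValueError when path is not inside root.
    relative = path.relative_to(root)
    if relative == PurePosixPath("."):
        raise ValueError("path must be a sub-directory of the document root")
    return f"/data/local-files/?d={relative}"
```

For example, `local_files_url("/path/to/coco/export", "/path/to/coco/export/images")` returns `/data/local-files/?d=images`, matching the `--image-root-url` value above, while a path outside the document root raises `ValueError`.
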
makseq commented 1 year ago

Thank you for your great instructions, I think we will use it for our docs about coco import. BTW I've updated this part of Local storage docs, hope it's more clear now.

jiangtangaaaa commented 10 months ago

Thank you for your great instructions, I think we will use it for our docs about coco import. BTW I've updated this part of Local storage docs, hope it's more clear now.

Based on your post (https://github.com/HumanSignal/label-studio/issues/2806), I have done the following:

python -m venv env
source env/bin/activate
git clone https://github.com/heartexlabs/label-studio-converter.git
cd label-studio-converter
pip install -e .

label-studio-converter import coco -h # just print help

label-studio-converter import coco -i your-input-file.json -o output.json

I've used the generated XML and put my images under /home/wjt/data/medical_pic. The data source is also set, but the images still cannot be displayed. Could you also help explain how to do the following?

image

image

image