HumanSignal / label-studio-converter

Tools for converting Label Studio annotations into common dataset formats
https://labelstud.io/

How to convert Relation annotation output to Spacy Binary format for relational model training? #53

Open karndeepsingh opened 3 years ago

karndeepsingh commented 3 years ago

Hi, please help me convert the Label Studio output, produced after annotating documents for a relation model, into a format spaCy can train on. spaCy expects "token_start" and "token_end" keys alongside the "start" and "end" keys, but Label Studio only outputs "start" and "end"; there are no "token_start" or "token_end" keys. Please help me get "token_start" and "token_end" from the annotated dataset, or help me convert this Label Studio output into a spaCy binary file for training a relation model. Below is a sample of the output from Label Studio:

[
  {
    "id": 6,
    "annotations": [
      {
        "id": 3,
        "completed_by": {
          "id": 2,
          "email": "kdsinghsdl@gmail.com",
          "first_name": "",
          "last_name": ""
        },
        "result": [
          {
            "value": {
              "start": 9,
              "end": 63,
              "text": "Synergy One Lending, Inc. dba Mutual of Omaha Mortgage",
              "labels": [
                "PARTY NAME"
              ]
            },
            "id": "fkVqYYR_P7",
            "from_name": "label",
            "to_name": "text",
            "type": "labels"
          },
          {
            "value": {
              "start": 0,
              "end": 8,
              "text": "BORROWER",
              "labels": [
                "PARTY ROLE"
              ]
            },
            "id": "62m6jvJopr",
            "from_name": "label",
            "to_name": "text",
            "type": "labels"
          },
          {
            "value": {
              "start": 64,
              "end": 102,
              "text": "5716 Corsa Avenuew, Sulle 102 Westlake",
              "labels": [
                "PARTY ADDRESS"
              ]
            },
            "id": "ZMBV98QR7N",
            "from_name": "label",
            "to_name": "text",
            "type": "labels"
          },
          {
            "from_id": "62m6jvJopr",
            "to_id": "fkVqYYR_P7",
            "type": "relation",
            "direction": "right",
            "labels": [
              "ROLE"
            ]
          },
          {
            "from_id": "ZMBV98QR7N",
            "to_id": "fkVqYYR_P7",
            "type": "relation",
            "direction": "right",
            "labels": [
              "ADDRESS"
            ]
          }
        ],
        "was_cancelled": false,
        "ground_truth": false,
        "created_at": "2021-09-04T08:14:02.157201Z",
        "updated_at": "2021-09-04T08:14:02.157201Z",
        "lead_time": 80162.164,
        "prediction": {},
        "result_count": 0,
        "task": 6
      }
    ],
    "predictions": [],
    "file_upload": "New_Text_Document_2_E6Don9E.txt",
    "data": {
      "text": "BORROWER Synergy One Lending, Inc. dba Mutual of Omaha Mortgage 5716 Corsa Avenuew, Sulle 102 Westlake village."
    },
    "meta": {},
    "created_at": "2021-09-03T09:54:27.365638Z",
    "updated_at": "2021-09-03T09:54:27.365638Z",
    "project": 6
  }
]

khushal2405 commented 1 year ago

@karndeepsingh, any update on this? Did you find a Label Studio to spaCy converter?
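
For anyone landing here, the missing token indices can probably be derived from the character offsets that Label Studio already exports. The sketch below is only an illustration under stated assumptions: `token_offsets`, `char_to_token_span` and `convert_annotation` are hypothetical helper names, the regex tokenizer is a stand-in for spaCy's tokenizer (so the indices will not match a real spaCy `Doc`), and the output dictionaries with "head" and "child" keys are an assumed target shape, not a format confirmed anywhere in this thread. Here "token_end" is exclusive; subtract one if your training format expects the index of the last token instead.

```python
import re

# Hypothetical stand-in tokenizer: runs of word characters, or single
# punctuation marks. spaCy tokenizes differently, so these indices only
# illustrate the mapping.
TOKEN_RE = re.compile(r"\w+|[^\w\s]")


def token_offsets(text):
    """Return one (char_start, char_end) pair per token."""
    return [m.span() for m in TOKEN_RE.finditer(text)]


def char_to_token_span(offsets, start, end):
    """Map characters [start, end) to tokens [token_start, token_end).

    Every token that overlaps the character range is included.
    Returns None if no token overlaps.
    """
    hits = [i for i, (s, e) in enumerate(offsets) if s < end and e > start]
    return (hits[0], hits[-1] + 1) if hits else None


def convert_annotation(text, results):
    """Split one annotation's "result" list into entity spans and relations."""
    offsets = token_offsets(text)
    spans = {}
    for item in results:
        if item["type"] != "labels":
            continue
        value = item["value"]
        tokens = char_to_token_span(offsets, value["start"], value["end"])
        if tokens is None:
            continue
        spans[item["id"]] = {
            "start": value["start"],
            "end": value["end"],
            "token_start": tokens[0],
            "token_end": tokens[1],  # exclusive
            "label": value["labels"][0],
        }
    relations = []
    for item in results:
        if item["type"] != "relation":
            continue
        head, child = spans.get(item["from_id"]), spans.get(item["to_id"])
        if head is None or child is None:
            continue  # relation refers to a span that was skipped
        relations.append({
            "head": head["token_start"],    # assumed key name
            "child": child["token_start"],  # assumed key name
            "label": (item.get("labels") or [""])[0],
        })
    return list(spans.values()), relations


# Trimmed version of the task posted above.
text = "BORROWER Synergy One Lending, Inc. dba Mutual of Omaha Mortgage"
results = [
    {"id": "fkVqYYR_P7", "type": "labels",
     "value": {"start": 9, "end": 63, "labels": ["PARTY NAME"]}},
    {"id": "62m6jvJopr", "type": "labels",
     "value": {"start": 0, "end": 8, "labels": ["PARTY ROLE"]}},
    {"from_id": "62m6jvJopr", "to_id": "fkVqYYR_P7", "type": "relation",
     "direction": "right", "labels": ["ROLE"]},
]
spans, relations = convert_annotation(text, results)
print(spans)
print(relations)
```

With spaCy installed, `Doc.char_span(start, end, alignment_mode="expand")` should give the same kind of mapping against spaCy's own tokenization, with `Span.start` and `Span.end` as the token indices, and the resulting docs can then be collected in a `DocBin` and written to disk. I have not run that part against this export, so treat it as a direction to try.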