HumanSignal / label-studio-sdk

Label Studio SDK
https://api.labelstud.io
Apache License 2.0

Pre-annotations of TextArea #120

Closed michaelachmann closed 1 year ago

michaelachmann commented 1 year ago

I ran into trouble when adding pre-annotations for TextAreas using the SDK. Here's my initial bug report: https://github.com/heartexlabs/label-studio/issues/3941

I think there might still be a bug in the SDK that causes the pre-annotations of uploaded tasks to contain the following:

"value": {
  "textarea": [
    "... transcription ..."
  ]
},

instead of

"value": {
  "text": [
    "... transcription ..."
  ]
},

makseq commented 1 year ago

Could you please share your SDK code where you create pre-annotations?

michaelachmann commented 1 year ago

Sure, it's basically these two steps:

  1. Create a new project and Interface:
    
    interface = """
    <View>
    <Audio name="audio" value="$audio" zoom="true" hotkey="ctrl+enter" />
    <Header value="Provide Transcription" />
    <TextArea name="transcript" toName="audio"
            rows="5" editable="true" maxSubmissions="1" />
    </View>
    """

    from label_studio_sdk import Client

    project_name = "Audio Project"
    # labelstudio_url and labelstudio_key are defined elsewhere
    ls = Client(url=labelstudio_url, api_key=labelstudio_key)
    project = ls.start_project(
        title=project_name,
        label_config=interface,
    )

  2. Upload tasks with pre-annotations:

```python
project.import_tasks(
    df_tasks.to_dict('records'),
    preannotated_from_fields=['transcript'])
```

makseq commented 1 year ago

It won't work this way, because `preannotated_from_fields` works for choices only. Check this video https://labelstud.io/guide/predictions.html#Prepare-pre-annotations-for-Label-Studio and prepare your annotation JSON following the video tutorial.

Also check this example: https://github.com/heartexlabs/label-studio-sdk/blob/master/examples/import_preannotations/import_preannotations.py#L29
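
For the TextArea config from this thread, a minimal sketch of preparing such tasks might look like the following. It assumes the usual Label Studio prediction layout (`from_name`, `to_name`, `type`, and a `value` holding `"text"` as a list of strings). The helper `build_textarea_task`, the `model_version` string, and the sample record are hypothetical and only serve as illustration; they are not part of the SDK.

```python
from typing import Any, Dict


def build_textarea_task(record: Dict[str, Any],
                        text_field: str = "transcript",
                        from_name: str = "transcript",
                        to_name: str = "audio",
                        model_version: str = "pre-annotation") -> Dict[str, Any]:
    """Wrap one flat record into a task dict carrying a TextArea prediction.

    Hypothetical helper: from_name/to_name must match the `name` and
    `toName` attributes of the <TextArea> tag in the labeling config.
    """
    # Keep every field except the transcription itself as task data.
    data = {k: v for k, v in record.items() if k != text_field}
    result = {
        "from_name": from_name,
        "to_name": to_name,
        "type": "textarea",
        # TextArea results carry their strings under "text".
        "value": {"text": [record[text_field]]},
    }
    return {
        "data": data,
        "predictions": [{"model_version": model_version, "result": [result]}],
    }


# Sample record (made up) shaped like one row of df_tasks.to_dict('records').
records = [{"audio": "https://example.com/a.wav", "transcript": "hello world"}]
tasks = [build_textarea_task(r) for r in records]
```

With a live project, the resulting list could then be passed to `project.import_tasks(tasks)` without `preannotated_from_fields`, since each task already contains its own `predictions` entry.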