HumanSignal / label-studio-converter

Tools for converting Label Studio annotations into common dataset formats
https://labelstud.io/

Export to CSV doesn't work for multiline values #185

Open igolant opened 1 year ago

igolant commented 1 year ago

If a data field contains a multiline value, the export produces an invalid CSV file.

pip freeze | grep label-studio-converter
label-studio-converter==0.0.48

from label_studio_converter import Converter
import json

rows = [
 {
     'id': 684199,
     'annotations': [],
     'drafts': [],
     'predictions': [],
     'data': {'html': '<table id="main_table" class="main_table">\n </table>'},
     'meta': {},
     'created_at': '2022-12-28T19:48:50.071036Z',
     'updated_at': '2023-01-12T09:17:48.484287Z',
     'inner_id': 0,
     'total_annotations': 0,
     'cancelled_annotations': 0,
     'total_predictions': 0,
     'comment_count': 0,
     'unresolved_comment_count': 0,
     'last_comment_updated_at': None,
     'project': 104,
     'updated_by': 33,
     'comment_authors': [],
 }
]
# Write the tasks to disk; the context manager closes the file before conversion.
with open("/tmp/rows.json", "w") as f:
    json.dump(rows, f)

c = Converter({}, project_dir=None)
c.convert("/tmp/rows.json", "/tmp", "CSV", is_dir=False)

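The produced file is not valid CSV: the multiline html value is written without quoting, so the single record is split across two lines.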

cat /tmp/result.csv

annotator,annotation_id,updated_at,lead_time,html,created_at,id
,,,,<table id="main_table" class="main_table">
 </table>,,684199
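
For comparison, a valid CSV would wrap the multiline value in quotes and double the embedded quote characters. The following is a minimal sketch using Python's standard csv module, not the converter's own code; the column order is copied from the output above, and the names header, row, and expected_csv are only illustrative.

```python
import csv
import io

# Column order copied from the broken output above (illustrative only).
header = ["annotator", "annotation_id", "updated_at", "lead_time",
          "html", "created_at", "id"]
row = ["", "", "", "",
       '<table id="main_table" class="main_table">\n </table>',
       "", 684199]

buf = io.StringIO()
# The default QUOTE_MINIMAL mode quotes any field that contains a
# newline, a delimiter, or a quote character.
writer = csv.writer(buf)
writer.writerow(header)
writer.writerow(row)

expected_csv = buf.getvalue()
print(expected_csv)
```

With this quoting, the html field stays inside a single record even though it spans two physical lines.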

makseq commented 1 year ago

Can you try the converter from commit 7695383bd6a84cd4b762af4d7f7f5d359050468c? Does it work correctly?
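
One way to check the exported file after switching versions is to parse it and confirm that every row has as many fields as the header. This is a rough sketch using only the standard csv module; rows_match_header is a hypothetical helper, not part of label-studio-converter.

```python
import csv


def rows_match_header(path):
    """Return True if the file is non-empty and every parsed row
    has the same number of fields as the header row."""
    # newline="" lets the csv module handle line endings itself,
    # including newlines inside quoted fields.
    with open(path, newline="") as f:
        rows = list(csv.reader(f))
    return bool(rows) and all(len(r) == len(rows[0]) for r in rows)
```

Calling rows_match_header("/tmp/result.csv") on the file shown in the report should return False, because the unquoted newline splits the record into two short rows. This is only a field-count check, so it would not catch every possible quoting problem.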