Open tomasohara opened 2 years ago
@tomasohara Originally we implemented the CSV export for Choices, not for NER labels. The CSV with labels is produced automatically without any preprocessing (unlike Choices). Yes, maybe it's better to disable CSV export altogether for everything that is not Choices. Or we should add preprocessing for labels too.
OK, thanks for the clarification. The changes are minimal, as shown in the following comparison of the existing convert_to_csv vs. my convert_to_flat_csv: _convert_to_csv_flat-diff-8oct21.
Here's the original and revised functions: _convert_to_csv.txt and _convert_to_csv_flat.txt.
Should I make a pull request? I would implement both in the same function, with the new behavior governed by an environment variable (e.g., FLATTENED_CSV_ANNOTATIONS).
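As a rough illustration of the environment-variable switch, something along these lines could work. The helper name `use_flattened_csv` and the set of accepted values are only assumptions for the sketch, not existing converter code:

```python
import os


def use_flattened_csv(environ=os.environ):
    """Return True when the (proposed) FLATTENED_CSV_ANNOTATIONS flag is set.

    Hypothetical helper: the accepted truthy spellings are an assumption.
    """
    value = environ.get("FLATTENED_CSV_ANNOTATIONS", "")
    return value.strip().lower() in {"1", "true", "yes"}


# The export function could then branch on the flag, keeping the
# current behavior as the default when the variable is unset.
mode = "flat" if use_flattened_csv() else "original"
```

Defaulting to the original behavior when the variable is unset would keep existing exports unchanged.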
Sorry, I closed it by accident when adding the diff listing. Therefore, I re-opened it.
Yep, a pull request would be great!
Unfortunately, the CSV output exported by Label Studio uses JSON for the label. See dog-example-project-35-at-2021-10-07-18-48-22cb3c67.csv. This makes it hard to review the data in spreadsheets.
Instead, the label should be extracted as a simple string value, as with the other converters (e.g., CONLL). In addition, each annotation should be on a separate line. For example, 15 distinct annotations are packed into a single line in the above example!
For the expected output see the attached desired-dog-example-project-35-at-2021-10-07-18-49-22cb3c67.csv.
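To make the idea concrete, here is a minimal sketch of the flattening, not the actual converter code. It assumes the `label` column holds a JSON list of spans with `start`, `end`, `text` and `labels` keys; the function name `flatten_rows`, the sample row, and the output column names are made up for illustration:

```python
import csv
import io
import json


def flatten_rows(rows, label_column="label"):
    """Yield one flat dict per annotation span (hypothetical helper).

    Assumes row[label_column] is a JSON list of span objects with
    "start", "end", "text" and "labels" keys.
    """
    for row in rows:
        spans = json.loads(row[label_column])
        for span in spans:
            # Copy the other columns, then replace the JSON blob
            # with plain scalar values for this one span.
            flat = {k: v for k, v in row.items() if k != label_column}
            flat["start"] = span["start"]
            flat["end"] = span["end"]
            flat["span_text"] = span["text"]
            flat["label"] = ",".join(span["labels"])
            yield flat


# Made-up sample row with two annotations packed into one cell.
packed = [{
    "id": "1",
    "text": "Rex chased Tom",
    "label": json.dumps([
        {"start": 0, "end": 3, "text": "Rex", "labels": ["DOG"]},
        {"start": 11, "end": 14, "text": "Tom", "labels": ["CAT"]},
    ]),
}]

flat_rows = list(flatten_rows(packed))
out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=list(flat_rows[0]))
writer.writeheader()
writer.writerows(flat_rows)
csv_text = out.getvalue()
```

Each span ends up on its own line with the label as a plain string, so the file can be sorted and filtered directly in a spreadsheet.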
Note that this is not a feature request: I was baffled when I found out about this behavior. For example, why bother having a CSV format if the important part must be processed with a JSON utility?!