Closed by moss-xyz 5 months ago
Hi @moss-xyz, this may be related to another issue. We're going to try and take a look into it along with the other.
Confirmed, same issue with YOLO. Here is my config for .../api/projects/55/exports:
{
  "title": "My YOLO Export",
  "task_filter_options": {
    "finished": "only",
    "annotated": "only"
  },
  "annotation_filter_options": {
    "usual": true
  },
  "serialization_options": {
    "drafts": {
      "only_id": false
    }
  },
  "export_type": "YOLO"
}
Try using this SDK script:
pip install label-studio-sdk==0.0.34
import time

from label_studio_sdk import Client


class SnapshotExporter:
    def __init__(self, host, api_key):
        self.ls = Client(url=host, api_key=api_key)

    def export_json_snapshot(self, project_id):
        """ Export JSON snapshot """
        project = self.ls.get_project(project_id)
        export_result = project.export_snapshot_create(title='Export SDK Snapshot')
        export_id = export_result['id']
        # Wait until the snapshot is ready
        while project.export_snapshot_status(export_id).is_in_progress():
            time.sleep(1.0)
        return export_id

    def convert_snapshot(self, project_id, export_id, export_type):
        """ Convert JSON snapshot to specific format (YOLO, VOC, COCO, CSV, TSV, BRUSH_PNG, etc.) """
        response = self.ls.make_request(
            method='POST',
            url=f'/api/projects/{project_id}/exports/{export_id}/convert',
            json={'export_type': export_type}
        )
        return response.json()['converted_format']  # return conversion id

    def wait_for_conversion(self, project_id, export_id, conversion_id):
        """ Wait until the conversion is completed """
        project = self.ls.get_project(project_id)
        while True:
            exports = project.export_snapshot_list()
            for export in exports:
                if export['id'] == export_id:
                    for converted_format in export['converted_formats']:
                        if converted_format['id'] == conversion_id:
                            if converted_format['status'] == 'completed':
                                return
                            elif converted_format['status'] == 'failed':
                                raise Exception("Conversion failed")
            # Not finished yet: wait before polling again
            time.sleep(1.0)

    def download_snapshot(self, project_id, export_id, export_type):
        """ Download the converted snapshot """
        project = self.ls.get_project(project_id)
        status, file_name = project.export_snapshot_download(export_id, export_type=export_type)
        if status == 200:
            return file_name
        else:
            raise Exception("Failed to download the snapshot")


# Usage
host = LABEL_STUDIO_URL
api_key = API_KEY
project_id = PROJECT_ID
export_type = 'VOC'

exporter = SnapshotExporter(host, api_key)

# Step 1: Export JSON snapshot
export_id = exporter.export_json_snapshot(project_id)
print(f"Exported JSON snapshot with ID: {export_id}")

# Step 2: Convert JSON snapshot to format
conversion_id = exporter.convert_snapshot(project_id, export_id, export_type)
print(f"Started conversion to {export_type} with ID: {conversion_id}")

# Step 3: Wait for conversion to complete
exporter.wait_for_conversion(project_id, export_id, conversion_id)
print("Conversion completed")

# Step 4: Download the converted snapshot
file_name = exporter.download_snapshot(project_id, export_id, export_type=export_type)
print(f"Downloaded {export_type} snapshot as: {file_name}")
The export_json_snapshot method creates a new export snapshot in JSON format and waits until the export is completed.
The convert_snapshot method uses the make_request method to manually call the conversion API endpoint.
The wait_for_conversion method polls the export status until the conversion is completed or failed.
The download_snapshot method downloads the converted snapshot in Pascal VOC format (or other).
Make sure to replace host, api_key, and project_id with your actual Label Studio host, API key, and project ID.
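The polling loops in the script above have no upper bound, so a stuck export or conversion would block forever. Here is a minimal sketch of a bounded wait. The names wait_until, check, timeout, and interval are illustrative and not part of label-studio-sdk; check is assumed to be any zero-argument callable you supply.

```python
import time


def wait_until(check, timeout=300.0, interval=1.0):
    """Call check() until it returns a truthy value or the timeout expires.

    check is a hypothetical caller-supplied callable, not an SDK function.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = check()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Gave up after {timeout} seconds")
        # Not ready yet: pause before the next poll
        time.sleep(interval)
```

For example, the loop in export_json_snapshot could probably be written as wait_until(lambda: not project.export_snapshot_status(export_id).is_in_progress()).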
This is nice but fails to download the actual images. Is there a way to also get the images?
Describe the bug
I cannot export tasks using the Label Studio API endpoint (/api/projects/{number}/exports); several issues come up. I am specifically interested in exporting to the VOC format.
To Reproduce
Note that I am exporting this using the API endpoint as the typical "Export" workflow in the GUI hangs my computer (due to its very limited CPU/RAM), and I was hoping the API would be more reliable.
Step 1: Create Export Snapshot
I used this endpoint to configure an export snapshot. This did seem to work successfully, but two issues cropped up.
For reference, the JSON I passed to the data field was the following:
The first issue was that, despite using the converted_formats field, the export snapshot did not list any actual formats in the response I received, which just said 'converted_formats': [].
Second, despite setting the task_filter_options as listed above, the snapshot included all 4000+ images in my dataset, instead of the ~1000 annotated images.
Step 2: Converting Export Snapshot
Since the snapshot did not have my desired export type, I used this endpoint to convert my snapshot from the default to VOC.
Here is the JSON I passed to the data field:
This seemed to work! When I re-queried the endpoint, it did indeed say 'export_type': 'VOC'.
Step 3: Downloading the Export
Finally, I used this endpoint to save the actual export.
This "works" in the sense that my GET request goes through, but "fails" in the sense that it doesn't provide the export in the format I requested (VOC); instead, I get the default JSON-formatted file, which I've attached here.
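One possible explanation, which is an assumption and not confirmed anywhere in this thread, is that the download request did not carry an exportType query parameter, so the server fell back to JSON. The sketch below only builds the download URL and sends nothing. snapshot_download_url is a hypothetical helper, and the host and export ID in the usage note are placeholders.

```python
from urllib.parse import urlencode


def snapshot_download_url(host, project_id, export_id, export_type=None):
    """Build the snapshot download URL, optionally requesting a converted format.

    Assumes the download endpoint accepts an `exportType` query parameter.
    """
    base = f"{host.rstrip('/')}/api/projects/{project_id}/exports/{export_id}/download"
    if export_type is None:
        # No format requested: presumably the default JSON is returned
        return base
    return f"{base}?{urlencode({'exportType': export_type})}"
```

With a placeholder host, snapshot_download_url("http://localhost:8080", 55, 7, "VOC") gives a URL ending in /exports/7/download?exportType=VOC, which can then be fetched with the usual Authorization header.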
Expected behavior
I actually don't know what "format" the VOC is supposed to export in, but I presume that it is a zip file containing XML files and the images? If so, that didn't happen. Instead, I got a JSON-formatted file, which I've attached here (same as above).
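To tell quickly whether a download came back as an archive or as plain JSON, something like the following could be run on the saved file. It is a rough sketch: describe_export_file is a made-up helper name, and whether a VOC export is actually delivered as a zip is my presumption from above, not something confirmed here.

```python
import json
import zipfile


def describe_export_file(path):
    """Return 'zip', 'json', or 'unknown' for a downloaded export file."""
    if zipfile.is_zipfile(path):
        return "zip"
    try:
        with open(path, "r", encoding="utf-8") as fh:
            json.load(fh)
        return "json"
    except (UnicodeDecodeError, json.JSONDecodeError):
        # Neither a zip archive nor parseable JSON text
        return "unknown"
```

If this reports "json" for a file that was requested as VOC, the conversion was not applied to the download.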
Environment (please complete the following information):
Additional context
I was also requested to post my config XML, but I just remembered that I can't since I am away from the computer with Label-Studio loaded on to it, and will be for the rest of the month. I do remember that it was a slight modification of the Object Detection with Bounding Boxes template, just with different labels relevant to my project.