OspofeDeveloper / MusicAlignApp


Output format for the annotations #1

Open ptorras opened 1 month ago

ptorras commented 1 month ago

The annotations produced by the application could use a slightly modified version of the COCO format, so that converting them to a fully COCO-compliant spec is straightforward.

Output File Structure

A single JSON file per project (aligned page). Each cropped line is treated as a separate image. The COCO JSON file should contain:

{
    "info": info,               
    "images": [image],            // List of images per line
    "annotations": [annotation],  // List of annotations for the full project
    "licenses": [license],     
    "categories": [category]      // List of object classes
}

info{
    "year": int,               
    "version": str,            // 0.1.0 for now
    "description": str,        // Fixed description for the dataset
    "contributor": str,        // Email of the annotator/username
    "url": str,                // Empty for now
    "date_created": datetime,  // Date of creation of the project
}

image{ // Represents a single line crop of music
    "id": int,                    // Crop number as image identifier
    "width": int,                 
    "height": int,
    "file_name": str,             // Path to the image
    "license": int,               // Image license
    "flickr_url": str,            // Ignore
    "coco_url": str,              // Ignore
    "date_captured": datetime,    // Ignore
    // Create these two: representing the origin of the cropped image in the page image
    "origin_x": int,
    "origin_y": int,
}
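
To illustrate what `origin_x` and `origin_y` are for, here is a minimal Python sketch that maps points from crop coordinates back to page coordinates. It assumes the origin is the top-left corner of the crop, in page pixels; the helper name `crop_to_page` and the sample values are hypothetical.

```python
def crop_to_page(points, origin_x, origin_y):
    """Shift (x, y) points from crop coordinates into page coordinates.

    Assumes (origin_x, origin_y) is the top-left corner of the crop
    within the page image, measured in page pixels.
    """
    return [(x + origin_x, y + origin_y) for x, y in points]


# Hypothetical image entry for one cropped line.
image = {
    "id": 3,
    "width": 1800,
    "height": 220,
    "file_name": "lines/3.png",
    "origin_x": 120,
    "origin_y": 940,
}

page_points = crop_to_page(
    [(0, 0), (10, 25)], image["origin_x"], image["origin_y"]
)
```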

license{
    "id": int,                    // All images should have the same license, we can set one for now
    "name": str,
    "url": str,
}

annotation{
    "id": int, // str                 // Identifier for the annotation. COCO expects an int; we use a str.
                                      // We can switch it to a str and use the id from the SVG file
    "image_id": int,                  // Identifier of the annotated image (the line id)
    "category_id": int,               // Identifier for the class of object
    "segmentation": RLE or [polygon], // The polygon as generated by the application
    "area": float,                    // Area of the bounding box
    "bbox": [x,y,width,height],       // Bounding box
    "iscrowd": 0 or 1,                // 0 always
}
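
As a rough sketch of how one annotation entry could be filled in, the Python below derives `bbox` and `area` from a polygon. It assumes the polygon is a flat COCO-style list `[x1, y1, x2, y2, ...]` and, following the comment above, takes `area` as the bounding-box area rather than the polygon area. The function names are hypothetical.

```python
def bbox_from_polygon(polygon):
    """Return [x, y, width, height] for a flat [x1, y1, x2, y2, ...] polygon."""
    xs = polygon[0::2]
    ys = polygon[1::2]
    x_min, y_min = min(xs), min(ys)
    return [x_min, y_min, max(xs) - x_min, max(ys) - y_min]


def make_annotation(ann_id, image_id, category_id, polygon):
    """Build one annotation dict with the fields listed in the spec."""
    bbox = bbox_from_polygon(polygon)
    return {
        "id": ann_id,
        "image_id": image_id,
        "category_id": category_id,
        "segmentation": [polygon],         # list of polygons, COCO style
        "area": float(bbox[2] * bbox[3]),  # bounding-box area, per the spec
        "bbox": bbox,
        "iscrowd": 0,                      # always 0
    }
```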

categories[{
    "id": int,                        // Unique id for each class
    "name": str,                      // Underlying name for the class
    "supercategory": str,             // Not relevant - same for everything ("music_obj")
}]
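
Putting the pieces together, a project file could be assembled roughly as follows. This is only a sketch in Python: the description text, the license entry and the function name `build_project` are placeholders, not values agreed on in this issue.

```python
import json
from datetime import datetime, timezone


def build_project(images, annotations, categories, contributor):
    """Assemble the top-level dict for one project (aligned page)."""
    now = datetime.now(timezone.utc)
    return {
        "info": {
            "year": now.year,
            "version": "0.1.0",
            "description": "MusicAlignApp annotations",  # placeholder text
            "contributor": contributor,
            "url": "",                                   # empty for now
            "date_created": now.isoformat(),
        },
        "images": images,
        "annotations": annotations,
        # Single placeholder license shared by all images.
        "licenses": [{"id": 1, "name": "placeholder", "url": ""}],
        "categories": categories,
    }


def dump_project(project):
    """Serialize the project dict to a JSON string."""
    return json.dumps(project, indent=4)
```

The `//` comments and trailing commas in the spec above are not valid JSON, so the written file would contain only the keys and values.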

ptorras commented 1 month ago

Note that each annotation ID must be prefixed with the ID of its image (the line); otherwise annotation IDs will collide within a project.
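
One possible way to build such IDs, sketched in Python under the assumption that string IDs are used: join the image ID and a per-line index with a separator. The `_` separator and the name `annotation_id` are assumptions. A separator is needed because plain concatenation is ambiguous (image 1 with index 12 and image 11 with index 2 would both give `112`).

```python
def annotation_id(image_id, local_index):
    """Compose a project-wide annotation id from the line id and a per-line index."""
    return f"{image_id}_{local_index}"


# Two lines with three annotations each never share an id.
ids = [annotation_id(img, idx) for img in (1, 11) for idx in range(3)]
```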