Open benjelloun opened 2 months ago
Make annotation a first-class property, so that we can clearly represent the fact that some contents of a RecordSet are annotations. You can think of an annotation as a special kind of field that annotates its container.
Here is an example of what a field-level annotation looks like:
{
  "@type": "cr:RecordSet", "@id": "images",
  "field": [
    { "@type": "cr:Field", "@id": "images/image", ... ,
      "annotation": {
        "@type": "cr:Field", "@id": "images/label",
        "dataType": ["sc:Text", "cr:Label"]
      }
    }
  ]
}
In this example, the annotation "images/label" applies to the field "images/image".
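To make the containment relationship concrete, here is a minimal Python sketch that walks a RecordSet held as a plain dictionary and reports which field each annotation applies to. It does not use any Croissant library. The helper names `as_list` and `field_annotations` are hypothetical, the elided properties (`...`) from the example are simply left out, and the code assumes `annotation` may hold either a single object or a list, which the proposal does not specify.

```python
import json

# The field-level example from above, with the elided properties omitted.
RECORD_SET = json.loads("""
{
  "@type": "cr:RecordSet", "@id": "images",
  "field": [
    {
      "@type": "cr:Field", "@id": "images/image",
      "annotation": {
        "@type": "cr:Field", "@id": "images/label",
        "dataType": ["sc:Text", "cr:Label"]
      }
    }
  ]
}
""")


def as_list(value):
    """Normalize a value that may be absent, a single object, or a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def field_annotations(record_set):
    """Return (annotation_id, annotated_field_id) pairs for field-level annotations."""
    pairs = []
    for field in as_list(record_set.get("field")):
        # Assumption: an annotation nested in a field annotates that field.
        for annotation in as_list(field.get("annotation")):
            pairs.append((annotation["@id"], field["@id"]))
    return pairs


print(field_annotations(RECORD_SET))
```

The annotated target is never stored explicitly here: it is recovered purely from nesting, which is the point of treating an annotation as a field that annotates its container.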
Annotations can also appear at the RecordSet level. A RecordSet-level annotation applies to the entire record. For example:
{
  "@type": "cr:RecordSet",
  "@id": "movies",
  "field": [
    { "@type": "cr:Field", "@id": "movies/movie_id", ...},
    { "@type": "cr:Field", "@id": "movies/title", ...},
    { "@type": "cr:Field", "@id": "movies/genre", ...}
  ],
  "annotation": {
    "@type": "cr:Field", "@id": "movies/ratings",
    "subField": [
      { "@type": "cr:Field", "@id": "movies/ratings/user_id", ...},
      { "@type": "cr:Field", "@id": "movies/ratings/rating", ...}
    ]
  }
}
In this example, ratings is a structured annotation that contains a user_id and a rating.
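As a rough illustration of how a consumer might read a structured RecordSet-level annotation, the Python sketch below flattens each annotation into the ids of its leaf sub-fields. It works on plain dictionaries rather than any Croissant library. The helper names `as_list`, `leaf_ids`, and `record_set_annotations` are hypothetical, the elided properties (`...`) are omitted, and accepting either a single `annotation` object or a list is an assumption rather than something the proposal states.

```python
# The RecordSet-level example from above, with the elided properties omitted.
MOVIES = {
    "@type": "cr:RecordSet",
    "@id": "movies",
    "field": [
        {"@type": "cr:Field", "@id": "movies/movie_id"},
        {"@type": "cr:Field", "@id": "movies/title"},
        {"@type": "cr:Field", "@id": "movies/genre"},
    ],
    "annotation": {
        "@type": "cr:Field",
        "@id": "movies/ratings",
        "subField": [
            {"@type": "cr:Field", "@id": "movies/ratings/user_id"},
            {"@type": "cr:Field", "@id": "movies/ratings/rating"},
        ],
    },
}


def as_list(value):
    """Normalize a value that may be absent, a single object, or a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def leaf_ids(field):
    """Collect the ids of the innermost fields below `field` (or `field` itself)."""
    sub_fields = as_list(field.get("subField"))
    if not sub_fields:
        return [field["@id"]]
    leaves = []
    for sub_field in sub_fields:
        leaves.extend(leaf_ids(sub_field))
    return leaves


def record_set_annotations(record_set):
    """Map each RecordSet-level annotation id to the ids of its leaf sub-fields."""
    return {
        annotation["@id"]: leaf_ids(annotation)
        for annotation in as_list(record_set.get("annotation"))
    }


print(record_set_annotations(MOVIES))
```

Because the annotation sits next to `field` rather than inside one of the fields, the sketch treats it as applying to the record as a whole, in line with the description above.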
Some examples of NetCDF files for hierarchical data annotation -
Add a mechanism to Croissant to define data-level annotations. Annotations are a general mechanism to attach additional information to other pieces of data. We plan to use annotations for a number of use cases, including: