Open francois-rd opened 1 month ago
I have the same issue.
Indeed, the Croissant editor does not support nested fields yet.
You can export the JSON-LD for your dataset and add the nested fields manually.
The mlcroissant Python library can be used to validate your Croissant file.
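To give a rough idea of what adding them manually could look like, here is a minimal sketch that patches an exported JSON-LD document using only the Python standard library. It assumes, based on my reading of the spec, that nested fields are listed under a "subField" property and that JSON sources are addressed with "extract"/"jsonPath". The `nested_field` helper, the record set id `records`, the file object id `data-jsonl`, and the `author` field are all hypothetical placeholders, and the `exported` dict stands in for the file exported from the editor.

```python
import json

# Stand-in for the JSON-LD exported from the editor (all names are hypothetical).
exported = {
    "@type": "sc:Dataset",
    "name": "my_dataset",
    "recordSet": [
        {"@type": "cr:RecordSet", "@id": "records", "name": "records", "field": []}
    ],
}


def nested_field(record_set_id, name, file_object_id, children):
    """Build a parent field whose children are listed under "subField".

    `children` maps a sub-field name to a (dataType, JSONPath) pair.
    The "subField" and "jsonPath" keys are assumptions taken from the spec.
    """
    parent_id = f"{record_set_id}/{name}"
    return {
        "@type": "cr:Field",
        "@id": parent_id,
        "name": name,
        "subField": [
            {
                "@type": "cr:Field",
                "@id": f"{parent_id}/{child}",
                "name": child,
                "dataType": data_type,
                "source": {
                    "fileObject": {"@id": file_object_id},
                    "extract": {"jsonPath": json_path},
                },
            }
            for child, (data_type, json_path) in children.items()
        ],
    }


# Describe a nested "author" object found in each JSON Lines record.
author = nested_field(
    "records",
    "author",
    "data-jsonl",
    {
        "name": ("sc:Text", "$.author.name"),
        "age": ("sc:Integer", "$.author.age"),
    },
)

# Attach the nested field to the first record set and print the patched JSON-LD.
exported["recordSet"][0]["field"].append(author)
print(json.dumps(exported, indent=2))
```

With a real export, you would load the file with `json.load`, apply the same kind of patch, write it back with `json.dump`, and then run the result through mlcroissant to see whether it accepts the structure.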
Are there any plans to expand the capabilities of the editor? My dataset has a fairly complex structure and the prospect of having to manually create a file with several hundred lines of esoteric machine-friendly metadata is daunting to say the least...
Using the editor app hosted on HuggingFace (https://huggingface.co/spaces/MLCommons/croissant-editor), I'm trying to add a RecordSet to represent a nested JSON structure.
The format specification (https://docs.mlcommons.org/croissant/docs/croissant-spec.html#recordsets) seems to suggest that nested fields are possible, but the editor does not seem to support a nested data type (see image).
I was thinking about using a 'join' to another record set to build nested data, but my understanding is that 'join' is meant to cross-link files. My dataset contains standalone (not cross-linked) files, each containing a series of nested JSON structures (one per instance) in JSON Lines format.