OpenDataServices / flatten-tool

Tools for generating CSV and other flat versions of the structured data
http://flatten-tool.readthedocs.io/en/latest/
MIT License
105 stars, 15 forks

unflatten: arrays of strings and numbers are 'doubled' (CSV and XLSX) #426

Open duncandewhurst opened 1 year ago

duncandewhurst commented 1 year ago

The unflatten command adds an extra set of square brackets around the value of string and numeric arrays, so the resulting data fails validation against the schema with an invalid type error.

I'm pretty sure this is a new issue because I don't recall it being a problem when unflattening OCDS releases, in which tag is an array of strings.

I've provided a minimal example below to reproduce the issue.

Input:

array
"a,b,c"

Schema:

{
  "properties": {
    "array": {
      "type": "array",
      "items": {
        "type": "string"
      }
    }
  }
}

Command:

flatten-tool unflatten -f csv -s schema.json input

Expected output:

{
    "main": [
        {
            "array": [
                "a",
                "b",
                "c"
            ]
        }
    ]
}

Actual output:

{
    "main": [
        {
            "array": [
                [
                    "a",
                    "b",
                    "c"
                ]
            ]
        }
    ]
}
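
To illustrate why the doubled array fails validation, here is a minimal sketch in Python. It is not full JSON Schema validation: `is_string_array` is a hypothetical helper that checks only the "array of strings" constraint from the schema above, and the two JSON documents are the expected and actual outputs copied from this report.

```python
import json

# Hypothetical helper, not part of flatten-tool: it checks only the
# "type": "array" / "items": {"type": "string"} constraint from the schema.
def is_string_array(value):
    return isinstance(value, list) and all(isinstance(item, str) for item in value)

# Expected and actual outputs as reported above.
expected = json.loads('{"main": [{"array": ["a", "b", "c"]}]}')
actual = json.loads('{"main": [{"array": [["a", "b", "c"]]}]}')

print(is_string_array(expected["main"][0]["array"]))  # True
print(is_string_array(actual["main"][0]["array"]))    # False: the only item is a list
```

In the actual output, the single item of `array` is itself a list rather than a string, which is what a validator would report as an invalid type.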

Bjwebb commented 1 year ago

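For single-level arrays, flatten-tool uses a semicolon (;) as the separator, not a comma (,). A comma-separated value is read as a nested array, which is why the output gained an extra set of brackets.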

This has the results you want:

array
"a;b;c"
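
The separator behaviour described above can be sketched in a few lines of Python. This is an illustration that reproduces the outputs seen in this thread, not flatten-tool's actual code; `split_cell` is a hypothetical name, and the assumption is that a comma anywhere in the cell switches the parse to an array of arrays.

```python
def split_cell(value):
    """Split a cell for an array-typed column (illustrative only)."""
    # Assumption: a comma anywhere in the cell means "array of arrays",
    # with ";" separating the outer items and "," the inner ones.
    if "," in value:
        return [part.split(",") for part in value.split(";")]
    # No comma: a plain single-level array separated by ";".
    return value.split(";")

print(split_cell("a;b;c"))  # ['a', 'b', 'c']
print(split_cell("a,b,c"))  # [['a', 'b', 'c']]
```

Under that assumption, "a,b,c" is one outer item containing three inner items, which matches the doubled brackets in the original report.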

Bjwebb commented 1 year ago

Re-opening this because we should check whether this behaviour is documented correctly.

duncandewhurst commented 1 year ago

Ah, my mistake, then. It is documented, but somewhat buried under a heading that suggests it is unsupported: https://flatten-tool.readthedocs.io/en/latest/unflatten/#plain-lists-unsupported