datacontract / datacontract-cli

CLI to manage your datacontract.yaml files
https://cli.datacontract.com

Fix BigQuery Array Type #431

Open ffernandez92 opened 7 hours ago

ffernandez92 commented 7 hours ago

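As discussed in #421, there is a bug in how a data contract schema is converted to a BigQuery schema, specifically when handling arrays: an array of a primitive type is wrapped in a RECORD with a synthetic child field instead of becoming a repeated field of that primitive type.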

Example of the current behavior:

models:
  app: # matches with bigquery table name
    description: App dimensions
    type: table
    fields:
      (..others..)
      test_array:
        type: array
        items:
          type: string
        description: none.

Resulting BQ schema:

{
        "name": "test_array",
        "type": "RECORD",
        "mode": "REPEATED",
        "description": "none.",
        "fields": [
          {
            "name": "test_array_1",
            "type": "STRING",
            "mode": "NULLABLE",
            "description": "",
            "maxLength": null
          }
        ]
      }

The correct BQ schema for the example above should be:

{
        "mode": "REPEATED",
        "name": "test_array",
        "type": "STRING",
        "description": "none."
      }
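
To make the expected mapping concrete, here is a minimal Python sketch of the conversion rule. It is not the datacontract-cli implementation: the function name `to_bigquery_field`, the reduced `TYPE_MAP`, and the handling of `required` are assumptions made for illustration only. The idea is that an array takes the BigQuery type of its `items` and sets `mode` to `REPEATED`, and only arrays of objects become a repeated `RECORD`.

```python
# Hypothetical sketch of the expected conversion; not the actual datacontract-cli code.

# Assumed, deliberately incomplete mapping from data contract types to BigQuery types.
TYPE_MAP = {
    "string": "STRING",
    "integer": "INTEGER",
    "boolean": "BOOLEAN",
}


def to_bigquery_field(name, field):
    """Convert one data contract field (a dict) into a BigQuery schema field (a dict)."""
    field_type = field.get("type")
    description = field.get("description", "")

    if field_type == "array":
        # Convert the item type under the array's own name, then mark it as repeated.
        # No wrapper RECORD and no synthetic "<name>_1" child field is created.
        bq_field = to_bigquery_field(name, field.get("items", {}))
        bq_field["mode"] = "REPEATED"
        bq_field["description"] = description
        return bq_field

    if field_type in ("object", "record", "struct"):
        # Nested structures map to RECORD with recursively converted child fields.
        return {
            "name": name,
            "type": "RECORD",
            "mode": "REQUIRED" if field.get("required") else "NULLABLE",
            "description": description,
            "fields": [
                to_bigquery_field(child_name, child)
                for child_name, child in field.get("fields", {}).items()
            ],
        }

    # Primitive types; a type missing from TYPE_MAP raises KeyError in this sketch.
    return {
        "name": name,
        "type": TYPE_MAP[field_type],
        "mode": "REQUIRED" if field.get("required") else "NULLABLE",
        "description": description,
    }
```

Calling `to_bigquery_field("test_array", {"type": "array", "items": {"type": "string"}, "description": "none."})` yields a field with `"type": "STRING"` and `"mode": "REPEATED"`, matching the expected schema above. An array whose `items` are of type `object` would still produce a repeated `RECORD`, which is the one case where the nested `fields` list is appropriate.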

ffernandez92 commented 7 hours ago

Created #432, which fixes the issue.