GoogleCloudPlatform / datashare-toolkit

DIY commercial datasets on Google Cloud Platform
Apache License 2.0
88 stars 25 forks

Destination tables not respecting nullable/required modes specified in schema.json #44

Open salsferrazza opened 5 years ago

salsferrazza commented 5 years ago

While interim tables reflect the column modes specified in a supplied schema.json, these modes do not make their way to the final table definition.

Steps to reproduce the behavior:

  1. Deploy the function to your project.
  2. Copy the last_sale config files from examples to <bucket>/bqds.
  3. Copy last_sale.csv to the bucket as marketdata.last_sale.csv.
  4. Observe that, although last_sale.schema.json specifies columns as REQUIRED, the final destination table (marketdata.last_sale) shows every column as NULLABLE.

The desired behavior is for every column that is mapped directly to the destination table to inherit the mode specified in its schema definition.
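To make the expected outcome concrete, here is a minimal sketch in Python (not the toolkit's own code) of what "inheriting the mode" could look like. It assumes schema.json follows the usual BigQuery JSON schema layout, an array of field objects with `name`, `type`, and an optional `mode`. The field names, the sample schema, and the helper `destination_fields` are all hypothetical.

```python
import json

# Hypothetical schema.json content; the real last_sale.schema.json may differ.
SCHEMA_JSON = """
[
  {"name": "symbol", "type": "STRING", "mode": "REQUIRED"},
  {"name": "price", "type": "NUMERIC", "mode": "REQUIRED"},
  {"name": "note", "type": "STRING"}
]
"""


def destination_fields(schema_fields, mapped_columns):
    """Build destination field definitions for directly mapped columns.

    Each destination field copies the mode from its schema definition.
    A field with no explicit mode is treated as NULLABLE, which is
    BigQuery's default.
    """
    by_name = {field["name"]: field for field in schema_fields}
    result = []
    for column in mapped_columns:
        source = by_name[column]
        result.append({
            "name": source["name"],
            "type": source["type"],
            "mode": source.get("mode", "NULLABLE"),
        })
    return result


fields = destination_fields(json.loads(SCHEMA_JSON), ["symbol", "price", "note"])
for field in fields:
    print(field["name"], field["mode"])
```

With this input, `symbol` and `price` stay REQUIRED in the destination definition and only `note` falls back to NULLABLE, which is the behavior the report asks for.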

Thanks for the report @michaelwsherman

mservidio commented 5 years ago

The changes are in https://github.com/GoogleCloudPlatform/bq-datashare-toolkit/pull/53. However, there appears to be a bug with WriteDisposition: when the WRITE_TRUNCATE option is used, column modes and descriptions are dropped from the destination table.
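One way to confirm this symptom is to compare the schema that was requested with the schema the table reports after the load. The sketch below is plain Python over field dictionaries and makes no BigQuery API calls. The helper `dropped_metadata` and the sample fields are hypothetical, and the "actual" schema is hard-coded here to mimic what the issue describes.

```python
def dropped_metadata(expected_fields, actual_fields):
    """Return (column, problem) pairs where mode or description was lost.

    A missing mode is treated as NULLABLE, BigQuery's default.
    """
    actual_by_name = {field["name"]: field for field in actual_fields}
    problems = []
    for expected in expected_fields:
        name = expected["name"]
        actual = actual_by_name.get(name)
        if actual is None:
            problems.append((name, "missing column"))
            continue
        expected_mode = expected.get("mode", "NULLABLE")
        actual_mode = actual.get("mode", "NULLABLE")
        if expected_mode != actual_mode:
            problems.append((name, f"mode {expected_mode} -> {actual_mode}"))
        if expected.get("description") and not actual.get("description"):
            problems.append((name, "description dropped"))
    return problems


# Hypothetical schemas: what schema.json asked for vs. what the table shows.
expected = [
    {"name": "symbol", "type": "STRING", "mode": "REQUIRED",
     "description": "Ticker symbol"},
    {"name": "price", "type": "NUMERIC", "mode": "REQUIRED"},
]
actual = [
    {"name": "symbol", "type": "STRING", "mode": "NULLABLE"},
    {"name": "price", "type": "NUMERIC", "mode": "NULLABLE"},
]

problems = dropped_metadata(expected, actual)
for column, problem in problems:
    print(column, problem)
```

An empty result would indicate the destination table kept the modes and descriptions from the schema definition.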