dbt-labs / dbt-external-tables

dbt macros to stage external sources
https://hub.getdbt.com/dbt-labs/dbt_external_tables/latest/
Apache License 2.0
297 stars 119 forks source link

Delimiter options? #72

Closed sgoley closed 3 years ago

sgoley commented 3 years ago

Similar to snowflake's "create external table" features, is it possible to utilize input parameters for:

Further documentation: https://docs.snowflake.com/en/sql-reference/sql/create-external-table.html#type-csv

Current functionality the yml spec: format: csv

-Benefit- Anyone who uses the csv feature of the dbt-external-tables module

jtcohen6 commented 3 years ago

Hey @sgoley! The file_format property, used as the input to create external table statements, accepts a string of any length. So today, you could pass it:

sources:
  - name: my_external_source
    tables:
      - name: my_external_tbl
        external:
          location: "@my_stage"
          file_format: "( type = csv field_delimiter = 'aa' record_delimiter = 'aabb' )"

The package macros will template this out as:

create or replace external table my_external_source.my_external_tbl
  with location = '@my_stage'
  file_format = ( type = csv field_delimiter = 'aa' record_delimiter = 'aabb' )

I know that's not the prettiest specification, so maybe we could consider adding support for dict-style file_format as well:

sources:
  - name: my_external_source
    tables:
      - name: my_external_source
        external:
          location: "@my_stage"
          file_format:
            type: csv
            field_delimiter: aa
            record_delimiter: aabb

If file_format is a dictionary, the package macros will handle templating it out as a key=value list, the way Snowflake wants it:

file_format = ( type = csv field_delimiter = 'aa' record_delimiter = 'aabb' )

If it's a string, the package macros would continue to template it exactly as it is.

What do you think?

sgoley commented 3 years ago

That's fantastic. Just didn't see it documented anywhere! Is there any pages here or in the docs I could start contributing these options to?

jtcohen6 commented 3 years ago

That's fair. This package's documentation is a combination of: