dbt-labs / dbt-external-tables

dbt macros to stage external sources
https://hub.getdbt.com/dbt-labs/dbt_external_tables/latest/
Apache License 2.0

Feature/add ignore case option to snowflake #289

Open robby-rob-slalom opened 2 months ago

robby-rob-slalom commented 2 months ago

Description & motivation

PR for Add ignore_case option to snowflake infer schema #288

Checklist

TODO:

robby-rob-slalom commented 2 months ago

The change seems straightforward enough. However, we'll need at least one test case to prove this out. Perhaps even a new seed table and an external Parquet file to test this against?

I'm not sure where this file would go, but this is what I used to generate the example mixed-case Parquet (edit: modified to match the existing schema in public_data):

-- partition with UPPERCASE column format
COPY INTO @stage/parquet_with_inferred_schema_and_mixed_column_case
FROM (
    SELECT
        1 AS "ID",
        'FOO' AS "NAME",
        'a' AS "SECTION"
)
PARTITION BY ('section=' || "SECTION")
FILE_FORMAT = (TYPE = PARQUET)
HEADER = TRUE
;

-- partition with lowercase column format
COPY INTO @stage/parquet_with_inferred_schema_and_mixed_column_case
FROM (
    SELECT
        2 AS "id",
        'bar' AS "name",
        'b' AS "section"
)
PARTITION BY ('section=' || "section")
FILE_FORMAT = (TYPE = PARQUET)
HEADER = TRUE
;

-- partition with PascalCase column format
COPY INTO @stage/parquet_with_inferred_schema_and_mixed_column_case
FROM (
    SELECT
        3 AS "Id",
        'FooBar' AS "Name",
        'c' AS "Section"
)
PARTITION BY ('section=' || "Section")
FILE_FORMAT = (TYPE = PARQUET)
HEADER = TRUE
;
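
To check what the option changes against files like these, the schema can be inferred twice, once without and once with IGNORE_CASE. This is only a sketch: the file format name my_parquet_format is hypothetical and not part of the PR, and the stage path is the one used in the statements above.

```sql
-- Hypothetical named file format, introduced only for this illustration.
CREATE OR REPLACE FILE FORMAT my_parquet_format TYPE = PARQUET;

-- Default behaviour: column names are matched case-sensitively, so
-- "ID", "id" and "Id" are expected to be reported as distinct columns.
SELECT column_name, type, filenames
FROM TABLE(
    INFER_SCHEMA(
        LOCATION => '@stage/parquet_with_inferred_schema_and_mixed_column_case',
        FILE_FORMAT => 'my_parquet_format'
    )
);

-- With IGNORE_CASE => TRUE, detected column names are treated as
-- case-insensitive and returned in uppercase.
SELECT column_name, type, filenames
FROM TABLE(
    INFER_SCHEMA(
        LOCATION => '@stage/parquet_with_inferred_schema_and_mixed_column_case',
        FILE_FORMAT => 'my_parquet_format',
        IGNORE_CASE => TRUE
    )
);
```

Comparing the two result sets is one way an integration test could assert that the mixed-case files collapse to a single set of columns.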

dataders commented 2 months ago

@robby-rob-slalom is this still a draft? or do you think it's ready to be "formally" reviewed?

robby-rob-slalom commented 2 months ago

The code change is ready. I may need some assistance with the integration test. Looking at the /public_data folder, there could be another folder like /json_mixed_case where one of the section files has uppercase keys and another has title case keys.
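
A rough sketch of how such JSON files could be generated, in the same style as the Parquet statements above. The file prefixes section_a and section_b are hypothetical, the keys simply reuse the id and name columns from the Parquet example, and the real /public_data layout may differ.

```sql
-- File with UPPERCASE keys (hypothetical prefix)
COPY INTO @stage/json_mixed_case/section_a
FROM (
    SELECT OBJECT_CONSTRUCT('ID', 1, 'NAME', 'FOO')
)
FILE_FORMAT = (TYPE = JSON)
;

-- File with Title Case keys (hypothetical prefix)
COPY INTO @stage/json_mixed_case/section_b
FROM (
    SELECT OBJECT_CONSTRUCT('Id', 2, 'Name', 'Bar')
)
FILE_FORMAT = (TYPE = JSON)
;
```

Each unload writes a single-column object per row, so the key casing in the output files follows the literals passed to OBJECT_CONSTRUCT.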