dbt-labs / dbt-athena

The athena adapter plugin for dbt (https://getdbt.com)
https://dbt-athena.github.io
Apache License 2.0

feat: Update `AthenaColumn` to parse array types #672

Closed: jeancochrane closed this 5 months ago

jeancochrane commented 5 months ago

Description

AthenaColumn is used to parse column metadata returned by Glue, particularly in AthenaAdapter.get_columns_in_relation(). The AthenaColumn.data_type property converts a number of DDL data types to their DML equivalents (e.g. string ➡️ varchar), but it does not yet convert array types. As a result, any code that consumes column definitions from get_columns_in_relation() can raise errors like this one:

TYPE_MISMATCH: Unknown type: array<string>
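
To illustrate the kind of conversion described above, here is a minimal sketch, not the adapter's actual implementation. The function name `ddl_to_dml_type` and the mapping table are hypothetical; only the string ➡️ varchar pair comes from this description, and the other scalar pairs are assumptions added for illustration. Struct and map types are deliberately left out of scope.

```python
import re

# Hypothetical mapping from DDL scalar type names to DML type names.
# Only "string" -> "varchar" is taken from the PR description; the
# remaining entries are assumptions for the sake of the example.
DDL_TO_DML_SCALARS = {
    "string": "varchar",
    "int": "integer",
    "float": "real",
}

# Matches a DDL array type such as array<string> and captures the
# element type between the outermost angle brackets.
_ARRAY_RE = re.compile(r"^array<(.*)>$", re.DOTALL)


def ddl_to_dml_type(ddl_type: str) -> str:
    """Convert a DDL type name to a DML type name, recursing into arrays."""
    normalized = ddl_type.strip().lower()
    match = _ARRAY_RE.match(normalized)
    if match:
        # Convert the element type first, then wrap it in DML array syntax.
        inner = ddl_to_dml_type(match.group(1))
        return f"array({inner})"
    # Scalars without a known mapping are passed through unchanged.
    return DDL_TO_DML_SCALARS.get(normalized, normalized)


print(ddl_to_dml_type("array<string>"))
```

Under these assumptions, array<string> becomes array(varchar), and nested arrays are handled by the recursive call on the element type.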

This is causing problems for us as we begin adopting unit tests, since unit tests call get_columns_in_relation() to pull the schema metadata used to populate fixture data.

Models used to test - Optional

In addition to the unit tests I added as part of this PR, I manually tested this change against my team's dbt unit test suite to confirm that it resolves the TYPE_MISMATCH error. I'd be happy to pull together a minimal reproducible example if it would be useful.

nicor88 commented 5 months ago

Overall this looks good to me, and the unit test suite seems quite complete.

@Jrmyy or @svdimchenko would you like to have a look?