Closed dm03514 closed 3 years ago
Thanks for the detailed bug report @dm03514! I'll have a look at what's gone wrong here
OK, it appears there are two issues here. First, if you define your partitions like so:
vals:
- 00
- 01
- ...
- 23
There is a risk of YAML interpreting those values as integers instead of strings. It's safer to do:
vals:
- '00'
- '01'
- '...'
- '23'
Even then, there is a bug with the new way this package handles Redshift partitions that coerces strings like '00' to numeric 0 when returned from the database and loaded by agate. I'll open a PR with a proposed fix.
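To make the coercion concrete, here is a minimal Python sketch. It is not the package's or agate's actual code: `infer_number` is a hypothetical stand-in that assumes numeric type inference parses text into a `Decimal`, and `hour_partition_values` is just an illustrative helper for building the quoted values.

```python
from decimal import Decimal, InvalidOperation


def hour_partition_values():
    """Return the 24 hourly partition values as zero-padded strings."""
    return [f"{hour:02d}" for hour in range(24)]


def infer_number(text):
    """Hypothetical stand-in for numeric type inference (assumption, not agate's code).

    Anything that parses as a number becomes a Decimal; other text is kept as-is.
    """
    try:
        return Decimal(text)
    except InvalidOperation:
        return text


vals = hour_partition_values()

# Round-tripping through the numeric type loses the zero padding.
coerced = [str(infer_number(v)) for v in vals]

print(vals[:3])
print(coerced[:3])
```

Under these assumptions the single-digit hours lose their padding while values such as '10' or '23' come back unchanged, which would explain why only some partitions stop matching.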
Describe the bug
We are using Redshift Spectrum external tables and we have partitions of the form:
/YYYYmmdd/HH
In v0.5.0 we generate the hourly partitions using:
We then execute:
In v0.5.0 this will create 24 partitions per day which include leading 0s, i.e. 00, 01
After upgrading to v0.6.0 the same command generates partitions and strips the leading 0s. This means we are generating partitions that don't map to actual directories, i.e. our partition is
But dbt is generating the partition without the leading 0.
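The mismatch can be illustrated with a short Python sketch based on the /YYYYmmdd/HH layout described above. The function names and the sample timestamp are hypothetical, chosen only for illustration; this is not how dbt builds the location.

```python
from datetime import datetime


def partition_path(ts):
    """Build the directory suffix in the /YYYYmmdd/HH layout (hour zero-padded)."""
    return ts.strftime("/%Y%m%d/%H")


def stripped_partition_path(ts):
    """Same layout, but with the hour rendered as a plain integer (no padding)."""
    return f"/{ts:%Y%m%d}/{ts.hour}"


# Arbitrary sample timestamp, used only for illustration.
sample = datetime(2020, 1, 1, 0)

expected = partition_path(sample)
generated = stripped_partition_path(sample)

print(expected)
print(generated)
print(expected == generated)
```

For any hour below 10 the two strings differ, so a partition registered with the unpadded value points at a directory that does not exist.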
Steps to reproduce
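1. Using v0.5.0, define hourly partitions 00 -> 23.
2. Run stage_external_sources.
3. Query the partitions: select * from SVV_EXTERNAL_PARTITIONS where tablename = '$TABLENAME' order by values desc;
4. Upgrade to v0.6.0 and run stage_external_sources again.
5. Query SVV_EXTERNAL_PARTITIONS and verify that leading 0s were stripped.
Expected results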
Leading 0s are preserved in 0.6.0, or there is a config option to allow them to be preserved.
Actual results
Leading 0s are stripped from partitions when using 0.6.0.
Screenshots and log output
This shows the leading 0s being stripped as soon as we upgraded to version 0.6.0:
System information
The contents of your packages.yml file:
Which database are you using dbt with?
The output of dbt --version:
The operating system you're using:
The output of python --version:
Thank you!