datamill-co / target-redshift

A Singer.io Target for Redshift
MIT License
23 stars 17 forks

Automatic varchar widening #11

Open awm33 opened 5 years ago

awm33 commented 5 years ago

For a better user experience and more efficient handling of text data, target-redshift should implement automatic varchar widening: each varchar column starts at a default max length (say 255) and is automatically widened as longer text values are observed in incoming batches.

Steps:
1) Start with a default max length for each new varchar column.
2) On each new batch, for each table:
   a) Get the current max length of each varchar column in the target table.
   b) At some point before updating the schema, find the max length across all record values for each varchar column.
   c) If that max is past the limit (64k), warn or throw an exception.
   d) If it's under the limit (64k) but larger than the column's current max length, widen the column during the schema update.
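
The per-batch part of these steps could be sketched roughly as below. This is only an illustration, not target-redshift's actual code: `observed_max_lengths`, `plan_widening`, and both constants are hypothetical names. It assumes the current column lengths were already read from the target table (step 2a), that lengths are measured in UTF-8 bytes (Redshift sizes varchar in bytes, with a maximum of 65535), and that widening is done with `ALTER TABLE ... ALTER COLUMN ... TYPE VARCHAR(n)`.

```python
from typing import Dict, Iterable, List, Mapping, Optional

DEFAULT_VARCHAR_LENGTH = 255   # hypothetical default, the "say 255" from the issue
REDSHIFT_MAX_VARCHAR = 65535   # Redshift's varchar limit, in bytes


def observed_max_lengths(
    records: Iterable[Mapping[str, Optional[str]]],
    varchar_columns: Iterable[str],
) -> Dict[str, int]:
    """Step 2b: longest UTF-8 byte length seen per varchar column in a batch."""
    columns = list(varchar_columns)
    maxima = {col: 0 for col in columns}
    for record in records:
        for col in columns:
            value = record.get(col)
            if isinstance(value, str):
                maxima[col] = max(maxima[col], len(value.encode("utf-8")))
    return maxima


def plan_widening(
    table: str,
    current_lengths: Mapping[str, int],
    records: Iterable[Mapping[str, Optional[str]]],
) -> List[str]:
    """Steps 2c and 2d: return ALTER statements for columns that must grow.

    `current_lengths` maps column name to its current varchar length and is
    assumed to come from the target table's metadata (step 2a).
    """
    observed = observed_max_lengths(records, current_lengths.keys())
    statements = []
    for col, current in current_lengths.items():
        needed = observed[col]
        if needed > REDSHIFT_MAX_VARCHAR:
            # Step 2c: the value cannot fit in any varchar column.
            raise ValueError(
                f"{table}.{col}: value of {needed} bytes exceeds "
                f"the varchar limit of {REDSHIFT_MAX_VARCHAR}"
            )
        if needed > current:
            # Step 2d: widen only when the batch actually needs more room.
            # Naive identifier quoting, for illustration only.
            statements.append(
                f'ALTER TABLE "{table}" ALTER COLUMN "{col}" '
                f"TYPE VARCHAR({needed});"
            )
    return statements


if __name__ == "__main__":
    batch = [{"name": "a", "bio": "x" * 300}, {"name": "b", "bio": None}]
    lengths = {"name": DEFAULT_VARCHAR_LENGTH, "bio": DEFAULT_VARCHAR_LENGTH}
    for statement in plan_widening("users", lengths, batch):
        print(statement)
```

Widening to exactly the observed length keeps columns tight but may trigger an ALTER on many batches; rounding up to the next power of two (capped at the limit) would be one way to reduce that, at the cost of looser declared lengths.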

gbachRM commented 3 years ago

Agreed! Please implement.