Open omervk opened 5 years ago
I think what you're trying to accomplish would be done a little differently. I understand the term "partitioning function" to mean the partition transformations that are part of a partition spec.
That's not the right place to do this because we don't need to add extra representations of a date to the manifest files. Instead, a process importing files from an external source should parse the strings and produce the right data value for the date (the day ordinal, where 1970-01-01 = 0). Then Iceberg would use the same partition code for these files.
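To make the conversion concrete, here is a minimal sketch of what such an import step could do with each date string. The helper name `date_string_to_ordinal` is hypothetical and not part of Iceberg; it only assumes the strings are ISO-8601 dates like the ones in this issue.

```python
from datetime import date

# Day 0 of the ordinal scale described above.
EPOCH = date(1970, 1, 1)


def date_string_to_ordinal(value: str) -> int:
    """Parse an ISO-8601 date string and return the number of days since 1970-01-01.

    Hypothetical helper for illustration; raises ValueError on malformed input.
    """
    return (date.fromisoformat(value) - EPOCH).days


print(date_string_to_ordinal("1970-01-01"))  # 0
print(date_string_to_ordinal("2018-11-12"))  # 17847
```

An importer would run this once per file while registering it, so the stored partition value is an integer day rather than a second, string-typed representation of the same date.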
(this is dependent upon the completion of #71 and #72)
The partition function for external mappings is derived by parsing the paths of data files, à la Hive's format.
For instance, the structure:
Would create a new column `date` with string values `2018-11-12` and `2018-11-13` and assume the partitioning function is `identity(date)` instead of being able to derive it from another field (i.e. a function of the date part of a `timestamp` column).

Iceberg should let users specify their own partitioning function, based on existing columns.
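The difference between the two behaviours can be sketched roughly as follows. Everything here is an assumption made for illustration: the example path, the `parse_hive_partitions` helper, and the standalone `identity` and `day` functions are hypothetical and do not represent Iceberg's actual code.

```python
from datetime import date, datetime, timezone


def parse_hive_partitions(path: str) -> dict:
    """Collect key=value directory segments from a Hive-style file path."""
    partitions = {}
    for segment in path.split("/")[:-1]:  # the last segment is the file name
        key, sep, value = segment.partition("=")
        if sep and key:
            partitions[key] = value
    return partitions


def identity(value):
    """Partition by the value exactly as it was found in the path."""
    return value


def day(ts: datetime) -> int:
    """Partition by days since 1970-01-01 (UTC); expects a timezone-aware datetime."""
    return (ts.astimezone(timezone.utc).date() - date(1970, 1, 1)).days


# Path-derived behaviour: a string column partitioned by identity.
path = "warehouse/events/date=2018-11-12/part-00000.parquet"  # hypothetical path
parts = parse_hive_partitions(path)
print(identity(parts["date"]))  # 2018-11-12

# Requested behaviour: partition value computed from an existing timestamp column.
row_ts = datetime(2018, 11, 12, 8, 30, tzinfo=timezone.utc)
print(day(row_ts))  # 17847
```

With the second form, the partition value is a function of a column that already exists in the data, so no extra string column has to be invented from the directory name.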