Closed Jefffrey closed 3 months ago
The writer timezone seems to be encoded at the stripe level. That is problematic if it means the timezone for a column can vary between stripes: Arrow encodes the timezone in the datatype, so it would need to be consistent for all data. Need to investigate this more.
It seems I misunderstood. According to:
The writer timezone in the stripe is used for regular Timestamp, as Timestamp instants are in UTC timezone.
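Separate note: encoding as Timestamp(Nanoseconds) severely limits the range representable in Arrow, since a signed 64-bit count of nanoseconds since the Unix epoch only spans roughly the years 1677 to 2262. Need to keep this in mind.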
See "Timestamp with local time zone" here: https://orc.apache.org/docs/types.html
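On the Timestamp(Nanoseconds) range note: a rough sketch of where the limit comes from, assuming the value is a signed 64-bit count of nanoseconds since the Unix epoch. It uses only the Python standard library; the helper name `ns_to_datetime` is made up for illustration and is not part of Arrow or ORC.

```python
from datetime import datetime, timedelta, timezone

# Assumption: nanosecond timestamps are stored as a signed 64-bit integer
# counting nanoseconds since 1970-01-01T00:00:00Z.
I64_MIN = -(2**63)
I64_MAX = 2**63 - 1
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def ns_to_datetime(ns: int) -> datetime:
    """Hypothetical helper: convert nanoseconds since the epoch to a datetime.

    Python datetimes only have microsecond resolution, so the value is
    floor-divided down to microseconds first.
    """
    return EPOCH + timedelta(microseconds=ns // 1000)


earliest = ns_to_datetime(I64_MIN)
latest = ns_to_datetime(I64_MAX)

print("earliest representable:", earliest.isoformat())
print("latest representable:  ", latest.isoformat())
```

Anything outside this window (for example a date in the year 1500 or 3000) cannot be stored at nanosecond precision in 64 bits, so such values would need a coarser unit or some explicit out-of-range handling.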