datafusion-contrib / datafusion-orc

Implementation of the Apache ORC file format using the Apache Arrow in-memory format
Apache License 2.0
28 stars 8 forks

Add support for decoding Timestamp as Decimal128 #96

progval closed 1 week ago

progval commented 1 week ago

This adds support for decoding the full range of ORC timestamps, at full precision.
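To illustrate why a 128-bit representation helps, here is a minimal sketch, not the code from this PR. It assumes the decimal holds nanoseconds since the Unix epoch as an unscaled `i128` (that is, a scale of 9); the function names are hypothetical.

```rust
/// Nanoseconds per second.
const NANOS_PER_SEC: i128 = 1_000_000_000;

/// Hypothetical helper: combine whole seconds and a nanosecond fraction
/// into one i128 count of nanoseconds. Read with a decimal scale of 9,
/// the result represents seconds with nanosecond precision.
fn timestamp_to_decimal128(seconds: i64, nanos: u32) -> i128 {
    seconds as i128 * NANOS_PER_SEC + nanos as i128
}

/// The same computation in i64, returning None when it overflows.
/// i64 nanoseconds only cover roughly the years 1677 to 2262.
fn timestamp_to_i64_nanos(seconds: i64, nanos: u32) -> Option<i64> {
    seconds
        .checked_mul(1_000_000_000)?
        .checked_add(nanos as i64)
}
```

For example, 9999-12-31T23:59:59.999999999 UTC is 253_402_300_799 seconds plus 999_999_999 nanoseconds: `timestamp_to_i64_nanos` returns `None` for it, while `timestamp_to_decimal128` keeps every digit.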

progval commented 1 week ago

I needed to add a `Some(UTC) => Box::new(iter),` branch to avoid overflows in the main `Some(writer_tz) =>` branch. Unfortunately, this means the `Some(writer_tz) =>` branch is no longer covered by automated tests (though the tests did run when I removed the first branch).
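The shape of that match can be sketched as follows. This is only an illustration under stated assumptions: `WriterTz`, `Nanos`, and `apply_writer_tz` are hypothetical stand-ins, the non-UTC case is simplified to a fixed offset, and the sketch does not reproduce the conversion that overflows in the real branch.

```rust
/// Hypothetical value type: nanoseconds since the epoch.
type Nanos = i128;

/// Hypothetical stand-in for the writer timezone.
#[derive(Clone, Copy, Debug, PartialEq)]
enum WriterTz {
    Utc,
    /// Fixed offset east of UTC, in seconds (simplification: no DST rules).
    FixedOffset(i32),
}

/// Adjust decoded values according to the writer timezone.
fn apply_writer_tz<'a>(
    iter: impl Iterator<Item = Nanos> + 'a,
    writer_tz: Option<WriterTz>,
) -> Box<dyn Iterator<Item = Nanos> + 'a> {
    match writer_tz {
        // UTC shortcut: no conversion, so nothing here can overflow.
        Some(WriterTz::Utc) => Box::new(iter),
        // General branch: shift writer-local values back to UTC.
        Some(WriterTz::FixedOffset(secs)) => {
            let shift = secs as i128 * 1_000_000_000;
            Box::new(iter.map(move |local| local - shift))
        }
        // No writer timezone recorded: pass values through unchanged.
        None => Box::new(iter),
    }
}
```

With the shortcut in place, UTC files never reach the general branch, which is why test files written in UTC no longer exercise it.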

I don't think pyorc supports writing non-UTC timezones, so I don't see how to produce a test file for that case. Should I leave it untested, or just return an error on a non-UTC writer timezone?