Open HawaiianSpork opened 1 week ago
ACTION NEEDED
delta-rs follows the Conventional Commits specification for release automation.
The PR title and description are used as the merge commit message. Please update your PR title and description to match the specification.
@HawaiianSpork can you add a test where we have a delta table that contains parquets with nanosecond timestamps in the files. Maybe just create a parquet table and then use convert to delta?
@ion-elgreco, I'd be happy to add more tests, but I want to make sure I create the correct ones. The data in ../test/tests/data/table_with_edge_timestamps has parquet files with nanosecond timestamp precision, and you can see how this change leads to datafusion only seeing microsecond precision. Do you think this is fine?
FYI @wjones127 and @roeap, since you made the original commit that read the schema from the parquet files: #1266.
This looks promising, but if you don't mind I would like to update the title so the changelog reads correctly in the future. Schema evolution is typically understood in the Delta context as changes to the Delta schema (i.e. a transaction commit occurs).
If I am understanding this correctly, it's more about schema adaptation on read.
Description
By casting each record batch that is read to the Delta schema, datafusion can read tables whose underlying parquet files can be cast to the desired schema. Fixes:
This is possible now because datafusion exposes a SchemaAdapter that can be overridden.
Note that this makes every timestamp read by delta-rs have microsecond precision, to match the Delta protocol.
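To make the precision note concrete, here is a minimal, std-only sketch of what casting a nanosecond timestamp column to the table's microsecond unit implies. `Unit`, `TimestampColumn`, and `cast_to_micros` are hypothetical names for illustration only; they are not the arrow or datafusion types, and the real change goes through the arrow cast kernel via the SchemaAdapter. Floor division is an assumption here, so the actual kernel may treat values before the epoch differently.

```rust
/// Time units relevant to this sketch (hypothetical, not arrow's `TimeUnit`).
#[derive(Debug, Clone, Copy, PartialEq)]
enum Unit {
    Micros,
    Nanos,
}

/// A nullable timestamp column: raw i64 ticks since the epoch plus their unit.
#[derive(Debug, Clone, PartialEq)]
struct TimestampColumn {
    unit: Unit,
    values: Vec<Option<i64>>,
}

/// Cast a column read from a file to the microsecond unit of the table schema.
/// Nanosecond values lose their last three digits; nulls stay null.
fn cast_to_micros(col: &TimestampColumn) -> TimestampColumn {
    let values = match col.unit {
        // Already in the target unit: nothing to convert.
        Unit::Micros => col.values.clone(),
        // Assumption: floor division, so sub-microsecond detail is dropped.
        Unit::Nanos => col
            .values
            .iter()
            .map(|v| v.map(|ns| ns.div_euclid(1_000)))
            .collect(),
    };
    TimestampColumn {
        unit: Unit::Micros,
        values,
    }
}
```

With this sketch, a file value of `1_700_000_000_123_456_789` ns is read back as `1_700_000_000_123_456` µs, which is the loss of precision described above.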
Related Issue(s)