deephaven / deephaven-core

Iceberg column rename handling #6118

Open devinrsmith opened 1 week ago

devinrsmith commented 1 week ago

Deephaven does not currently handle Iceberg column renames.

Here is an example that produces a set of data illustrating the issue: https://gist.github.com/devinrsmith/a6d537e5dce3ff0b25ba4b9c62d80ada. It first creates the table, adds some data, renames a column, and then adds some more data.

When read into Deephaven, only the latest data files have their data represented:

(Image: Screenshot from 2024-09-24 10-02-30)

devinrsmith commented 6 days ago

https://github.com/apache/iceberg/blob/main/format/spec.md#schema-evolution is the general reference for how schema evolution and tracking are supposed to work.

https://www.mail-archive.com/dev@iceberg.apache.org/msg02236.html also talks a bit about Iceberg field-id and Parquet field_id.
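To make the field-id point concrete, here is a minimal, self-contained sketch of why resolving columns by name drops the pre-rename data while resolving by field id does not. It is not Deephaven or Iceberg code: `FileColumn`, `DataFile`, `read_by_name`, `read_by_field_id`, and the column names `foo`/`bar` are all hypothetical stand-ins, and `None` only stands in for "data not represented".

```python
from dataclasses import dataclass
from typing import Any, List


@dataclass
class FileColumn:
    """One column as stored in a data file (hypothetical model)."""
    field_id: int      # stable id, unchanged by a rename
    name: str          # column name at the time the file was written
    values: List[Any]


@dataclass
class DataFile:
    """A data file with its columns and row count (hypothetical model)."""
    columns: List[FileColumn]
    num_rows: int


def read_by_name(files: List[DataFile], name: str) -> List[Any]:
    """Resolve the column by its current name; files written under the old name miss."""
    out: List[Any] = []
    for f in files:
        match = next((c for c in f.columns if c.name == name), None)
        out.extend(match.values if match else [None] * f.num_rows)
    return out


def read_by_field_id(files: List[DataFile], field_id: int) -> List[Any]:
    """Resolve the column by field id, which survives renames."""
    out: List[Any] = []
    for f in files:
        match = next((c for c in f.columns if c.field_id == field_id), None)
        out.extend(match.values if match else [None] * f.num_rows)
    return out


# Field id 1 was written as "foo", then renamed to "bar" before the second file.
before_rename = DataFile([FileColumn(1, "foo", [1, 2])], num_rows=2)
after_rename = DataFile([FileColumn(1, "bar", [3, 4])], num_rows=2)
files = [before_rename, after_rename]

by_name = read_by_name(files, "bar")   # old file has no column named "bar"
by_id = read_by_field_id(files, 1)     # both files carry field id 1
```

In this toy model, `by_name` only picks up values from the file written after the rename, while `by_id` picks up values from both files, which matches the symptom described above.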

devinrsmith commented 1 day ago

Potentially related to #6124