apache / incubator-xtable

Apache XTable (incubating) is a cross-table converter for lakehouse table formats that facilitates interoperability across data processing systems and query engines.
https://xtable.apache.org/
Apache License 2.0
768 stars 119 forks

Handle nullable record key field when translating to Iceberg #366

Closed by the-other-tim-brown 4 months ago

the-other-tim-brown commented 4 months ago

Depending on how a Hudi table is created, the record key field can be declared as nullable. This causes problems when we sync to Iceberg with that field marked as an identifier field, because Iceberg requires identifier fields to be required (non-nullable). There are two options we can consider and test:

1) Only set the Iceberg identifier field if the field is non-nullable, and log a warning otherwise.
2) Treat the Hudi record key field as non-nullable. Note: this may cause issues, since the schema of the Parquet files will still mark the field as nullable.

the-other-tim-brown commented 4 months ago

Closing since this is already handled in the code. Option 1 was the approach taken.
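
For readers who want a concrete picture of Option 1, here is a minimal sketch. It is not the XTable implementation: `IdentifierFieldSketch`, `Field`, and `identifierFieldIds` are hypothetical names that model a schema with plain Java types instead of the real XTable or Iceberg classes. It also assumes that when any record key field is nullable or missing, no identifier fields are set at all, since a partial key would not identify a row.

```java
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.logging.Logger;

/** Hypothetical stand-in types; not the XTable or Iceberg API. */
final class IdentifierFieldSketch {
  private static final Logger LOG =
      Logger.getLogger(IdentifierFieldSketch.class.getName());

  /** Minimal model of a schema field: an ID, a name, and whether it may hold nulls. */
  static final class Field {
    final int id;
    final String name;
    final boolean nullable;

    Field(int id, String name, boolean nullable) {
      this.id = id;
      this.name = name;
      this.nullable = nullable;
    }
  }

  /**
   * Returns the IDs of the record key fields if every one of them exists and is
   * non-nullable. Otherwise logs a warning and returns an empty set, so the
   * caller sets no identifier fields on the target table (Option 1).
   */
  static Set<Integer> identifierFieldIds(
      List<Field> schemaFields, List<String> recordKeyNames) {
    Set<Integer> ids = new LinkedHashSet<>();
    for (String keyName : recordKeyNames) {
      Field match = null;
      for (Field field : schemaFields) {
        if (field.name.equals(keyName)) {
          match = field;
          break;
        }
      }
      if (match == null) {
        LOG.warning("Record key field '" + keyName
            + "' not found in schema; skipping identifier fields");
        return Collections.emptySet();
      }
      if (match.nullable) {
        LOG.warning("Record key field '" + keyName
            + "' is nullable; skipping identifier fields");
        return Collections.emptySet();
      }
      ids.add(match.id);
    }
    return ids;
  }
}
```

With this sketch, a schema whose `id` field is non-nullable yields that field's ID, while the same schema with a nullable `id` yields an empty set plus a warning, leaving the synced table without identifier fields rather than failing the sync.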