tabular-io / iceberg-kafka-connect

Apache License 2.0

Docs for setting format-version on auto-created table #142

Closed: davidbzhao closed this issue 8 months ago

davidbzhao commented 8 months ago

I was struggling to figure out how to set `format-version` to 2 so that I could use upsert mode. It took some sleuthing through the tests and source code before it clicked that I needed to set `iceberg.tables.auto-create-props.format-version`.
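
For anyone landing here with the same problem, a minimal sketch of a connector config that combines auto-creation, format version 2, and upsert mode might look like the following. Only `iceberg.tables.auto-create-props.format-version`, `iceberg.tables`, and `iceberg.tables.upsert-mode-enabled` come from this thread. The connector name, topic, table, and id column are hypothetical placeholders, and the remaining property names are assumptions recalled from the connector README that should be checked against the version in use. Catalog settings (`iceberg.catalog.*`) are omitted.

```properties
# Hypothetical connector name, topic, and table; replace with your own.
name=events-sink
connector.class=io.tabular.iceberg.connect.IcebergSinkConnector
topics=events
iceberg.tables=default.events

# Assumed property: let the sink create the table if it does not exist.
iceberg.tables.auto-create-enabled=true

# From this issue: anything under auto-create-props is applied as a table
# property at creation time, so this creates the table as Iceberg V2.
iceberg.tables.auto-create-props.format-version=2

# Upsert mode needs a V2 table with identity fields defined.
iceberg.tables.upsert-mode-enabled=true
# Assumed property and hypothetical column: identity (key) column(s) for rows.
iceberg.tables.default-id-columns=id
```

The `auto-create-props` setting only affects tables the sink creates. A table that already exists keeps whatever format version it was created with.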

Thoughts on calling this out explicitly in the README, maybe with something like the change below?

Before

If `iceberg.tables.dynamic-enabled` is `false` (the default) then you must specify `iceberg.tables`. If
`iceberg.tables.dynamic-enabled` is `true` then you must specify `iceberg.tables.route-field` which will
contain the name of the table. Enabling `iceberg.tables.upsert-mode-enabled` will cause all appends to be
preceded by an equality delete. Both CDC and upsert mode require an Iceberg V2 table with identity fields
defined.

After

If `iceberg.tables.dynamic-enabled` is `false` (the default) then you must specify `iceberg.tables`. If
`iceberg.tables.dynamic-enabled` is `true` then you must specify `iceberg.tables.route-field` which will
contain the name of the table.

Enabling `iceberg.tables.upsert-mode-enabled` will cause all appends to be
preceded by an equality delete. Both CDC and upsert mode require an Iceberg V2 table with identity fields
defined. To set the format version of an auto-created table, set `iceberg.tables.auto-create-props.format-version`
to `2`.

bryanck commented 8 months ago

We'll be updating the sink to Iceberg 1.4 soon so this won't be needed (V2 is the default in 1.4). We had to roll back to 1.3 temporarily as there was a critical issue in 1.4.0, so we're waiting on 1.4.2. I do think having some docs on setting properties more generally would be useful though.

davidbzhao commented 8 months ago

Gotcha, sounds good on 1.4.2. Will circle back about docs as needed after that upgrade. Thanks!