Open oneonestar opened 1 month ago
This is a good idea, but needs to be coordinated across all applications that currently use locks. is this related to @pvary's https://github.com/apache/iceberg/pull/6648 ?
https://github.com/apache/iceberg/pull/6648 is only the refactoring, which makes https://github.com/apache/iceberg/pull/6570 possible. The later PR is the one which enables the lock-free commit.
If you enable the lock-free commit on table level, then you have to make sure, that every writer of the table uses Iceberg 1.3.0 version or later, so they will use the appropriate locking mechanism. For more details check the end of this paragraph: https://iceberg.apache.org/docs/nightly/configuration/#hadoop-configuration
Edit: Don't forget that you need the correct HMS version too.
For HMS with HIVE-26882, we can avoid using table lock during commit to Iceberg table. This improves performance of concurrent write to iceberg table and reduce the chance of having an unreleased lock stuck in HMS.
https://github.com/trinodb/trino/blob/db64b8857cdbb8d8f1bfcecf48f0a83c50dce836/plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/catalog/hms/HiveMetastoreTableOperations.java#L116-L127
https://github.com/apache/iceberg/pull/6570 implemented
iceberg.engine.hive.lock-enabled = false
. All writers including Trino, Spark and other engines should honor this setting to avoid using different locking mechanism, which could result to data corruption.An unreleased lock could result in the following error: