Open oneonestar opened 5 months ago
This is a good idea, but it needs to be coordinated across all applications that currently use locks. Is this related to @pvary's https://github.com/apache/iceberg/pull/6648 ?
https://github.com/apache/iceberg/pull/6648 is only the refactoring, which makes https://github.com/apache/iceberg/pull/6570 possible. The latter PR is the one that enables the lock-free commit.
If you enable the lock-free commit at the table level, you have to make sure that every writer of the table uses Iceberg 1.3.0 or later, so that they all use the appropriate locking mechanism. For more details, check the end of this paragraph: https://iceberg.apache.org/docs/nightly/configuration/#hadoop-configuration
Edit: Don't forget that you need the correct HMS version too.
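To make the table-level vs. catalog-level setting a bit more concrete, here is a rough sketch of how a writer could decide whether to take the HMS lock. It assumes the table property `engine.hive.lock-enabled` takes precedence over the Hadoop configuration `iceberg.engine.hive.lock-enabled`, and that both default to `true`, which is how I read the linked docs. The `HiveLockSetting` class and `useHiveLock` helper are hypothetical names for illustration, not Iceberg's actual code.

```java
import java.util.Map;

// Illustrative sketch only; the class and method names are hypothetical.
final class HiveLockSetting {
    // Property names as I read them in the Iceberg configuration docs (assumption).
    static final String TABLE_PROP = "engine.hive.lock-enabled";
    static final String CATALOG_CONF = "iceberg.engine.hive.lock-enabled";

    // Returns true when the writer should acquire an HMS lock before committing.
    static boolean useHiveLock(Map<String, String> tableProps, Map<String, String> conf) {
        // A value set on the table wins over the catalog-wide configuration.
        String tableValue = tableProps.get(TABLE_PROP);
        if (tableValue != null) {
            return Boolean.parseBoolean(tableValue);
        }
        // Otherwise fall back to the configuration, locking by default.
        return Boolean.parseBoolean(conf.getOrDefault(CATALOG_CONF, "true"));
    }

    public static void main(String[] args) {
        Map<String, String> lockFreeConf = Map.of(CATALOG_CONF, "false");
        // Catalog disables locks, table says nothing: commit without a lock.
        System.out.println(useHiveLock(Map.of(), lockFreeConf));
        // Catalog disables locks, but this table opts back in for older writers.
        System.out.println(useHiveLock(Map.of(TABLE_PROP, "true"), lockFreeConf));
    }
}
```

The per-table opt-in is the useful part if some writers of a table cannot be upgraded yet: those tables keep locking while the rest commit lock-free.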
@findepi any update on this? How can we make lock-free commits using Trino? Also, if an Iceberg table is locked permanently, how can we unlock it?
https://github.com/apache/iceberg/pull/6570 is available now but I'm not sure if any Hive version actually includes those changes yet.
So this will need to be revisited once some OSS HMS ships with the required changes.
For HMS with HIVE-26882, we can avoid using a table lock during commits to an Iceberg table. This improves the performance of concurrent writes to Iceberg tables and reduces the chance of having an unreleased lock stuck in HMS.
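For anyone wondering why no lock is needed: as far as I understand it, HIVE-26882 lets the metastore reject an alter-table call when a table parameter no longer has the value the writer expected, so the commit becomes a compare-and-swap on the metadata pointer. Below is a minimal sketch of that idea with an in-memory map standing in for HMS. `LockFreeCommitSketch` and its methods are made up for illustration and are not the HMS, Iceberg, or Trino API.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical in-memory stand-in for the metastore; illustration only.
final class LockFreeCommitSketch {
    // Table name -> location of the current metadata file.
    private final ConcurrentMap<String, String> metadataLocations = new ConcurrentHashMap<>();

    void createTable(String table, String initialLocation) {
        metadataLocations.put(table, initialLocation);
    }

    String currentLocation(String table) {
        return metadataLocations.get(table);
    }

    // Swaps the pointer only if it still equals what the writer read earlier.
    // Returns false when another writer committed first; the caller must then
    // refresh and retry instead of overwriting the other commit.
    boolean commit(String table, String expectedLocation, String newLocation) {
        return metadataLocations.replace(table, expectedLocation, newLocation);
    }

    public static void main(String[] args) {
        LockFreeCommitSketch hms = new LockFreeCommitSketch();
        hms.createTable("db.t", "v1.metadata.json");

        // Two writers start from the same base version.
        String base = hms.currentLocation("db.t");
        boolean first = hms.commit("db.t", base, "v2-writer-a.metadata.json");
        boolean second = hms.commit("db.t", base, "v2-writer-b.metadata.json");

        System.out.println("first commit succeeded: " + first);
        System.out.println("second commit succeeded: " + second);
    }
}
```

Since nothing is held between reading the base version and committing, a crashed writer cannot leave a lock behind; the cost is that the losing writer has to retry.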
https://github.com/trinodb/trino/blob/db64b8857cdbb8d8f1bfcecf48f0a83c50dce836/plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/catalog/hms/HiveMetastoreTableOperations.java#L116-L127
https://github.com/apache/iceberg/pull/6570 implemented `iceberg.engine.hive.lock-enabled = false`. All writers, including Trino, Spark, and other engines, should honor this setting to avoid using different locking mechanisms, which could result in data corruption. An unreleased lock could result in the following error: