trinodb / trino

Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)
https://trino.io
Apache License 2.0
9.83k stars 2.85k forks source link

Support lock-free commit for Iceberg using HMS #22182

Open oneonestar opened 1 month ago

oneonestar commented 1 month ago

For HMS with HIVE-26882, we can avoid using table lock during commit to Iceberg table. This improves performance of concurrent write to iceberg table and reduce the chance of having an unreleased lock stuck in HMS.

https://github.com/trinodb/trino/blob/db64b8857cdbb8d8f1bfcecf48f0a83c50dce836/plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/catalog/hms/HiveMetastoreTableOperations.java#L116-L127

https://github.com/apache/iceberg/pull/6570 implemented iceberg.engine.hive.lock-enabled = false. All writers including Trino, Spark and other engines should honor this setting to avoid using different locking mechanism, which could result to data corruption.

An unreleased lock could result in the following error:

Query 20240528_062551_35616_6hrf3 failed: Timed out waiting for lock 46108 for query 20240528_062551_35616_6hrf3
io.trino.spi.TrinoException: Timed out waiting for lock 46108 for query 20240528_062551_35616_6hrf3
    at io.trino.plugin.hive.metastore.thrift.ThriftHiveMetastore.acquireLock(ThriftHiveMetastore.java:1784)
    at io.trino.plugin.hive.metastore.thrift.ThriftHiveMetastore.acquireTableExclusiveLock(ThriftHiveMetastore.java:1765)
    at io.trino.plugin.iceberg.catalog.hms.HiveMetastoreTableOperations.commitToExistingTable(HiveMetastoreTableOperations.java:66)
    at io.trino.plugin.iceberg.catalog.AbstractIcebergTableOperations.commit(AbstractIcebergTableOperations.java:171)
    at org.apache.iceberg.BaseTransaction.lambda$commitSimpleTransaction$3(BaseTransaction.java:417)
    at org.apache.iceberg.util.Tasks$Builder.runTaskWithRetry(Tasks.java:413)
    at org.apache.iceberg.util.Tasks$Builder.runSingleThreaded(Tasks.java:219)
    at org.apache.iceberg.util.Tasks$Builder.run(Tasks.java:203)
    at org.apache.iceberg.util.Tasks$Builder.run(Tasks.java:196)
    at org.apache.iceberg.BaseTransaction.commitSimpleTransaction(BaseTransaction.java:413)
    at org.apache.iceberg.BaseTransaction.commitTransaction(BaseTransaction.java:308)
    at io.trino.plugin.iceberg.IcebergMetadata.finishInsert(IcebergMetadata.java:1016)
...
findepi commented 1 month ago

This is a good idea, but needs to be coordinated across all applications that currently use locks. is this related to @pvary's https://github.com/apache/iceberg/pull/6648 ?

pvary commented 1 month ago

https://github.com/apache/iceberg/pull/6648 is only the refactoring, which makes https://github.com/apache/iceberg/pull/6570 possible. The later PR is the one which enables the lock-free commit.

If you enable the lock-free commit on table level, then you have to make sure, that every writer of the table uses Iceberg 1.3.0 version or later, so they will use the appropriate locking mechanism. For more details check the end of this paragraph: https://iceberg.apache.org/docs/nightly/configuration/#hadoop-configuration

Edit: Don't forget that you need the correct HMS version too.