trinodb / trino

Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)
https://trino.io
Apache License 2.0
10.49k stars 3.02k forks source link

Support lock-free commit for Iceberg using HMS #22182

Open oneonestar opened 5 months ago

oneonestar commented 5 months ago

For HMS with HIVE-26882, we can avoid using table lock during commit to Iceberg table. This improves performance of concurrent write to iceberg table and reduce the chance of having an unreleased lock stuck in HMS.

https://github.com/trinodb/trino/blob/db64b8857cdbb8d8f1bfcecf48f0a83c50dce836/plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/catalog/hms/HiveMetastoreTableOperations.java#L116-L127

https://github.com/apache/iceberg/pull/6570 implemented iceberg.engine.hive.lock-enabled = false. All writers including Trino, Spark and other engines should honor this setting to avoid using different locking mechanism, which could result to data corruption.

An unreleased lock could result in the following error:

Query 20240528_062551_35616_6hrf3 failed: Timed out waiting for lock 46108 for query 20240528_062551_35616_6hrf3
io.trino.spi.TrinoException: Timed out waiting for lock 46108 for query 20240528_062551_35616_6hrf3
    at io.trino.plugin.hive.metastore.thrift.ThriftHiveMetastore.acquireLock(ThriftHiveMetastore.java:1784)
    at io.trino.plugin.hive.metastore.thrift.ThriftHiveMetastore.acquireTableExclusiveLock(ThriftHiveMetastore.java:1765)
    at io.trino.plugin.iceberg.catalog.hms.HiveMetastoreTableOperations.commitToExistingTable(HiveMetastoreTableOperations.java:66)
    at io.trino.plugin.iceberg.catalog.AbstractIcebergTableOperations.commit(AbstractIcebergTableOperations.java:171)
    at org.apache.iceberg.BaseTransaction.lambda$commitSimpleTransaction$3(BaseTransaction.java:417)
    at org.apache.iceberg.util.Tasks$Builder.runTaskWithRetry(Tasks.java:413)
    at org.apache.iceberg.util.Tasks$Builder.runSingleThreaded(Tasks.java:219)
    at org.apache.iceberg.util.Tasks$Builder.run(Tasks.java:203)
    at org.apache.iceberg.util.Tasks$Builder.run(Tasks.java:196)
    at org.apache.iceberg.BaseTransaction.commitSimpleTransaction(BaseTransaction.java:413)
    at org.apache.iceberg.BaseTransaction.commitTransaction(BaseTransaction.java:308)
    at io.trino.plugin.iceberg.IcebergMetadata.finishInsert(IcebergMetadata.java:1016)
...
findepi commented 5 months ago

This is a good idea, but needs to be coordinated across all applications that currently use locks. is this related to @pvary's https://github.com/apache/iceberg/pull/6648 ?

pvary commented 5 months ago

https://github.com/apache/iceberg/pull/6648 is only the refactoring, which makes https://github.com/apache/iceberg/pull/6570 possible. The later PR is the one which enables the lock-free commit.

If you enable the lock-free commit on table level, then you have to make sure, that every writer of the table uses Iceberg 1.3.0 version or later, so they will use the appropriate locking mechanism. For more details check the end of this paragraph: https://iceberg.apache.org/docs/nightly/configuration/#hadoop-configuration

Edit: Don't forget that you need the correct HMS version too.

arghya18 commented 1 week ago

@findepi any update on this? how we can make lock free commit using Trino? Also if a iceberg table is locked permanently, how can we unlock it?

hashhar commented 1 day ago

https://github.com/apache/iceberg/pull/6570 is available now but I'm not sure if any Hive version actually includes those changes yet.

So this will need to be revisited once some OSS HMS ships with the required changes.