torvalds-dev-testbot[bot] opened 10 months ago
torvalds.dev is analyzing the ticket
This appears to be a problem with Apache Hudi's multi-writer capability: one of the writes fails with the errors reported. It may be a configuration issue or a concurrency conflict.
Plan:
From the error logs, we can see that the HoodieDataSourceInternalBatchWrite transaction failed. The logs report "Cannot find instant 20231219010014383 in the timeline, for rollback", which means the instant the rollback is looking for no longer exists in the timeline.
This suggests a concurrency issue: two write operations running at roughly the same time may be conflicting with each other.
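To see what the timeline actually contains, you can list the active instants with Hudi's `HoodieTableMetaClient`. This is a sketch, not a verified snippet: it assumes a Spark shell with Hudi 0.14 on the classpath, and the base path is a placeholder you must replace.

```scala
// Sketch: inspect the table's timeline to see which instants exist.
// Assumption: run inside spark-shell with the Hudi bundle on the classpath.
import org.apache.hudi.common.table.HoodieTableMetaClient

val metaClient = HoodieTableMetaClient.builder()
  .setConf(spark.sparkContext.hadoopConfiguration)
  .setBasePath("xxxxxxxxxxxx") // same base path passed to save(...)
  .build()

// Each instant carries a timestamp, action, and state; check whether
// 20231219010014383 still appears here or has already been removed.
metaClient.getActiveTimeline.getInstants.forEach(println)
```

If the instant is missing from the active timeline while a writer still references it, that is consistent with another writer having rolled it back or rewritten the table underneath it.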
The configuration is set up for multiple writers with different settings, and one setting may conflict with another. For example, mismatched "hoodie.write.concurrency.mode" and "hoodie.write.lock.provider" values across the two writers could cause a problem.
`HoodieLockConfig.newBuilder().withLockProvider(lockProvider)` and `option("hoodie.write.concurrency.mode", "OPTIMISTIC_CONCURRENCY_CONTROL")`
These settings control how multiple write requests are handled. Optimistic concurrency control assumes conflicts will not happen and only checks at commit time, so when a conflict does occur, one of the writers fails.
It would also be worth checking whether enabling or disabling the metadata table via "hoodie.metadata.enable" changes the behavior.
Since this looks like a concurrency issue, try changing the lock provider and write concurrency mode.
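If the two writers run as separate Spark applications, note that `InProcessLockProvider` only coordinates threads inside a single JVM, so a cross-process lock provider is worth trying. The following is a sketch under that assumption: `FileSystemBasedLockProvider` ships with Hudi (ZooKeeper- or DynamoDB-based providers are alternatives), and the field names and paths are placeholders mirroring the configs below.

```scala
// Sketch: the same write, but with a lock provider that works across JVMs.
// FileSystemBasedLockProvider keeps its lock file under the table's .hoodie
// directory, so both writers must point at the same base path.
load_df_1.write.format("org.apache.hudi").
  option("hoodie.datasource.write.operation", "bulk_insert").
  option("hoodie.cleaner.policy.failed.writes", "LAZY").
  option("hoodie.write.concurrency.mode", "OPTIMISTIC_CONCURRENCY_CONTROL").
  option("hoodie.write.lock.provider",
    "org.apache.hudi.client.transaction.lock.FileSystemBasedLockProvider").
  // ...remaining record key / partition / table options as in the original job...
  mode("Append").
  save("xxxxxxxxxxxx")
```

Note also that both jobs in the report use `mode("Overwrite")`, which re-initializes the table on each run; with two concurrent writers, using `mode("Append")` once the table exists is usually the safer choice.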
Additionally, here are some specific suggestions:
Remember to always have a backup of your data before making changes.
Describe the problem you faced
Hi, I am trying a use case with multi-writer on version 0.14 to write data into different partitions. I found this Medium article https://medium.com/@simpsons/can-you-concurrently-write-data-to-apache-hudi-w-o-any-lock-provider-51ea55bf2dd6 which says I can do multi-writing with writer 1 holding an in-process lock (which allows it to perform table services) and writer 2 just writing data with services turned off. I tried the configs given, and one of the writes always fails with the error below:

```
23/12/19 01:02:06 ERROR AppendDataExec: Data source write support org.apache.hudi.spark3.internal.HoodieDataSourceInternalBatchWrite@6db6a766 is aborting.
23/12/19 01:02:06 ERROR DataSourceInternalWriterHelper: Commit 20231219010014383 aborted
23/12/19 01:02:07 WARN BaseHoodieWriteClient: Cannot find instant 20231219010014383 in the timeline, for rollback
23/12/19 01:02:07 ERROR AppendDataExec: Data source write support org.apache.hudi.spark3.internal.HoodieDataSourceInternalBatchWrite@6db6a766 aborted.
```
Configs Used:

```scala
load_df_1.write.format("org.apache.hudi").
  option("hoodie.datasource.write.recordkey.field", "xxxxxxxxxxxx").
  option("hoodie.datasource.write.partitionpath.field", "xxxxxxxxxxxx").
  option("hoodie.datasource.write.precombine.field", "xxxxxxxxxxxx").
  option("hoodie.datasource.write.operation", "bulk_insert").
  option("hoodie.datasource.write.table.type", "COPY_ON_WRITE").
  option("hoodie.datasource.query.type", "snapshot").
  option("spark.serializer", "org.apache.spark.serializer.KryoSerializer").
  option("hoodie.datasource.write.hive_style_partitioning", "true").
  option("hoodie.cleaner.policy.failed.writes", "LAZY").
  option("hoodie.write.concurrency.mode", "OPTIMISTIC_CONCURRENCY_CONTROL").
  option("hoodie.write.lock.provider", "org.apache.hudi.client.transaction.lock.InProcessLockProvider").
  option("hoodie.metadata.enable", "false").
  option(HoodieWriteConfig.TABLE_NAME, "xxxxxxxxxxxx").
  mode("Overwrite").
  save("xxxxxxxxxxxx")
```
```scala
load_df_2.write.format("org.apache.hudi").
  option("hoodie.datasource.write.recordkey.field", "xxxxxxxxxxxx").
  option("hoodie.datasource.write.partitionpath.field", "xxxxxxxxxxxx").
  option("hoodie.datasource.write.precombine.field", "xxxxxxxxxxxx").
  option("hoodie.datasource.write.operation", "bulk_insert").
  option("hoodie.datasource.write.table.type", "COPY_ON_WRITE").
  option("hoodie.datasource.query.type", "snapshot").
  option("spark.serializer", "org.apache.spark.serializer.KryoSerializer").
  option("hoodie.datasource.write.hive_style_partitioning", "true").
  option("hoodie.cleaner.policy.failed.writes", "LAZY").
  option("hoodie.metadata.enable", "false").
  option("hoodie.table.services.enabled", "false").
  option(HoodieWriteConfig.TABLE_NAME, "xxxxxxxxxxxx").
  mode("Overwrite").
  save("xxxxxxxxxxxx")
```
Can someone help? Can this be done without locks, as the article suggests, or should I definitely use a recommended lock provider?