torvalds-dev / hudi

Upserts, Deletes And Incremental Processing on Big Data.
https://hudi.apache.org/
Apache License 2.0

[SUPPORT] Hi, I am trying a use case to use multi-writer to write data into different partitions with version 0.14 #52

Open torvalds-dev[bot] opened 9 months ago

torvalds-dev[bot] commented 9 months ago

Describe the problem you faced

Hi, I am trying a use case: using the multi-writer feature in version 0.14 to write data into different partitions. I found this Medium article https://medium.com/@simpsons/can-you-concurrently-write-data-to-apache-hudi-w-o-any-lock-provider-51ea55bf2dd6 which says I can do multi-writing with writer 1 holding an in-process lock (which allows it to run table services) and writer 2 just writing the data with services turned off. I tried with the configs given, and one of the writes always fails with the error below:

23/12/19 01:02:06 ERROR AppendDataExec: Data source write support org.apache.hudi.spark3.internal.HoodieDataSourceInternalBatchWrite@6db6a766 is aborting.
23/12/19 01:02:06 ERROR DataSourceInternalWriterHelper: Commit 20231219010014383 aborted
23/12/19 01:02:07 WARN BaseHoodieWriteClient: Cannot find instant 20231219010014383 in the timeline, for rollback
23/12/19 01:02:07 ERROR AppendDataExec: Data source write support org.apache.hudi.spark3.internal.HoodieDataSourceInternalBatchWrite@6db6a766 aborted.

Configs Used:

load_df_1.write.format("org.apache.hudi").
  option("hoodie.datasource.write.recordkey.field", "xxxxxxxxxxxx").
  option("hoodie.datasource.write.partitionpath.field", "xxxxxxxxxxxx").
  option("hoodie.datasource.write.precombine.field", "xxxxxxxxxxxx").
  option("hoodie.datasource.write.operation", "bulk_insert").
  option("hoodie.datasource.write.table.type", "COPY_ON_WRITE").
  option("hoodie.datasource.query.type", "snapshot").
  option("spark.serializer", "org.apache.spark.serializer.KryoSerializer").
  option("hoodie.datasource.write.hive_style_partitioning", "true").
  option("hoodie.cleaner.policy.failed.writes", "LAZY").
  option("hoodie.write.concurrency.mode", "OPTIMISTIC_CONCURRENCY_CONTROL").
  option("hoodie.write.lock.provider", "org.apache.hudi.client.transaction.lock.InProcessLockProvider").
  option("hoodie.metadata.enable", "false").
  option(HoodieWriteConfig.TABLE_NAME, "xxxxxxxxxxxx").
  mode("Overwrite").
  save("xxxxxxxxxxxx")

load_df_2.write.format("org.apache.hudi").
  option("hoodie.datasource.write.recordkey.field", "xxxxxxxxxxxx").
  option("hoodie.datasource.write.partitionpath.field", "xxxxxxxxxxxx").
  option("hoodie.datasource.write.precombine.field", "xxxxxxxxxxxx").
  option("hoodie.datasource.write.operation", "bulk_insert").
  option("hoodie.datasource.write.table.type", "COPY_ON_WRITE").
  option("hoodie.datasource.query.type", "snapshot").
  option("spark.serializer", "org.apache.spark.serializer.KryoSerializer").
  option("hoodie.datasource.write.hive_style_partitioning", "true").
  option("hoodie.cleaner.policy.failed.writes", "LAZY").
  option("hoodie.metadata.enable", "false").
  option("hoodie.table.services.enabled", "false").
  option(HoodieWriteConfig.TABLE_NAME, "xxxxxxxxxxxx").
  mode("Overwrite").
  save("xxxxxxxxxxxx")


torvalds-dev[bot] commented 9 months ago

torvalds.dev is analyzing the ticket

torvalds-dev[bot] commented 9 months ago

It seems like you're trying to write data concurrently to different partitions in Apache Hudi using multi-writer with version 0.14, but one of the writes always fails. The error messages indicate that a commit was aborted and the system couldn't find the instant in the timeline for rollback.

Here's a plan to troubleshoot the issue:

  1. Check the Apache Hudi version: Ensure that you're using a version of Apache Hudi that supports multi-writer. Optimistic concurrency control for multiple writers has been available since version 0.8.0, so 0.14 supports it.

  2. Review the configurations: Make sure that the configurations for both writers are correct and consistent. The error messages suggest a problem during the commit, so pay special attention to the commit-related configs such as hoodie.write.concurrency.mode, hoodie.write.lock.provider, and hoodie.auto.commit (see the first sketch after this list).

  3. Inspect the data: Check the data that you're trying to write for anything unusual that might cause the write to fail, for example null values in the record key, partition path, or precombine fields, incorrect data types, or data that doesn't match the table schema (see the second sketch after this list).

  4. Check the system resources: Ensure that your system has enough resources (CPU, memory, disk space) to handle the write operations. If the system is running out of resources, it might cause the write operations to fail.

  5. Look for additional error messages: The error messages you provided are quite generic and don't reveal the root cause. Look for additional messages in the driver and executor logs, and check the table's timeline for the aborted instant (the third sketch after this list shows one way to list it).

  6. Try a simpler use case: If you're still having trouble, try a simpler use case with just one writer and see if that works. If it does, gradually add complexity until you can reproduce the issue. This might help you identify the specific condition that causes the write to fail.

  7. Reach out to the Apache Hudi community: If you're still stuck after trying the above steps, consider reaching out to the Apache Hudi community for help. You can post your question on the Apache Hudi mailing list or on their Slack channel. Be sure to provide as much detail as possible about your use case and the issue you're facing.
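
To make step 2 concrete, here is a minimal sketch (not a verified fix) that pulls the concurrency-related options into a single map and applies them to one of the writers. The table name, base path, and column names are hypothetical placeholders; the option values simply mirror the configs quoted in this issue.

// Shared multi-writer options, kept in one place so both writers stay consistent
val concurrencyOpts = Map(
  "hoodie.write.concurrency.mode"       -> "OPTIMISTIC_CONCURRENCY_CONTROL",
  "hoodie.cleaner.policy.failed.writes" -> "LAZY",
  "hoodie.write.lock.provider"          -> "org.apache.hudi.client.transaction.lock.InProcessLockProvider"
)

load_df_1.write.format("org.apache.hudi")
  .options(concurrencyOpts)
  .option("hoodie.table.name", "my_table")                                // hypothetical table name
  .option("hoodie.datasource.write.recordkey.field", "record_key_col")    // hypothetical columns
  .option("hoodie.datasource.write.partitionpath.field", "partition_col")
  .option("hoodie.datasource.write.precombine.field", "precombine_col")
  .option("hoodie.datasource.write.operation", "bulk_insert")
  .mode("Append")                                                         // save mode shown for illustration only
  .save("/tmp/hudi/my_table")                                             // hypothetical base path

Passing the same map to both writers makes it easy to confirm that they agree on the concurrency mode, lock provider, and failed-writes cleaner policy.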
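
For step 3, a quick way to spot problems in the incoming data is to count nulls in the key columns and print the schema. A minimal sketch, again with hypothetical column names standing in for the masked fields:

import org.apache.spark.sql.functions.col

// Columns used as record key, partition path, and precombine field (hypothetical names)
val keyCols = Seq("record_key_col", "partition_col", "precombine_col")

keyCols.foreach { c =>
  val nullCount = load_df_1.filter(col(c).isNull).count()
  println(s"$c has $nullCount null value(s)")
}

load_df_1.printSchema()   // confirm the types match what the table expects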
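
For step 5, the "Cannot find instant ... in the timeline, for rollback" warning points at the table's timeline. A minimal sketch (the base path is a hypothetical placeholder) that lists the instant files under the .hoodie folder from a Spark shell:

import org.apache.hadoop.fs.Path

val basePath = "/tmp/hudi/my_table"   // hypothetical base path
val fs = new Path(basePath).getFileSystem(spark.sparkContext.hadoopConfiguration)

// Requested, inflight, and completed instants appear as files named after the commit time
fs.listStatus(new Path(basePath, ".hoodie"))
  .map(_.getPath.getName)
  .filter(n => n.contains("commit") || n.contains("inflight") || n.contains("requested"))
  .sorted
  .foreach(println)

Comparing which instants reached the completed state with the commit times in the driver logs usually narrows down which writer's commit was aborted.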