It looks like the MDT timeline instant action is in an invalid state.
@xicm Are you using a multi-writer setup?
Single writer
The instant in the screenshot comes from another job; its name is not the same as the one in the stack trace.
@xicm Looks like there are other similar issues as well: https://github.com/apache/hudi/issues/10906
I recommend using HudiStreamer instead of Spark Structured Streaming.
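For reference, a minimal HudiStreamer invocation might look like the following sketch; the bundle jar, source class, topic properties, and table paths are placeholders:

```
spark-submit \
  --class org.apache.hudi.utilities.streamer.HoodieStreamer \
  /path/to/hudi-utilities-bundle_2.12-0.14.1.jar \
  --table-type MERGE_ON_READ \
  --op UPSERT \
  --source-class org.apache.hudi.utilities.sources.JsonKafkaSource \
  --target-base-path hdfs:///data/hudi/my_table \
  --target-table my_table \
  --props /path/to/kafka-source.properties \
  --continuous
```

In continuous mode the streamer schedules ingestion and table services inside a single writer process, which sidesteps much of the lock coordination discussed below.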
@xicm I will try to reproduce it. Can you provide more details on the steps I can follow?
I use multiple writers, one for ingestion and one for offline clustering. Even though they don't involve overlapping file groups, the MDT sometimes ends up in an invalid state. For more information: in this case we don't configure a multi-writer setup like ZK. Is this a problem even when the two writers handle non-overlapping file groups?
in this case we don't configure a multi-writer setup like ZK
You should always configure the lock provider if there are multiple writers. But you are right, it is possible we make the metadata table non-blocking since the 1.0 release.
But wouldn't the in-process lock provider kick in? It should prevent multiple writers to the MDT. I am assuming the setup is Spark streaming with async compaction or clustering: a single process, but multiple threads trying to ingest to the MDT. If the in-process lock provider is not kicking in, then it's a bug.
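For context, the deployment model for a single writer with async table services (see the docs link later in this thread) would rely on a configuration along these lines (a sketch; verify the exact keys against your Hudi version):

```
hoodie.write.concurrency.mode=optimistic_concurrency_control
hoodie.write.lock.provider=org.apache.hudi.client.transaction.lock.InProcessLockProvider
hoodie.cleaner.policy.failed.writes=LAZY
```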
The root cause is that a deltacommit in the MDT rolls back the compaction instant in the MDT (a compaction in the MDT is a deltacommit).
When a compaction commits, it creates an inflight deltacommit in the MDT. Because the compaction is asynchronous, if the ingestion writer begins to commit at just this moment, the writer will start a new deltacommit in the MDT. There, the new deltacommit will roll back the uncompleted deltacommit (the one created by the async compaction).
Is it possible to filter out the deltacommit created by compaction in the MDT when we do the rollback?
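A rough sketch of that idea (hypothetical pseudocode, not actual Hudi internals; the timeline calls approximate org.apache.hudi.common.table.timeline.HoodieTimeline and may differ by version):

```scala
import scala.collection.JavaConverters._
import org.apache.hudi.common.table.timeline.{HoodieInstant, HoodieTimeline}

// Hypothetical helper: given the data-table timeline and the MDT timeline,
// pick only the MDT inflight deltacommits that are safe to roll back.
def safeToRollback(dataTimeline: HoodieTimeline,
                   mdtTimeline: HoodieTimeline): Seq[HoodieInstant] = {
  // Instant times of compactions still pending on the data table.
  val pendingCompactionTimes = dataTimeline
    .filterPendingCompactionTimeline()
    .getInstants.asScala
    .map(_.getTimestamp)
    .toSet

  // Skip MDT inflight deltacommits whose instant time matches a pending
  // data-table compaction; those belong to the async compaction writer.
  mdtTimeline
    .filterInflights()
    .getInstants.asScala
    .filterNot(i => pendingCompactionTimes.contains(i.getTimestamp))
    .toSeq
}
```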
But wouldn't the in-process lock provider kick in? It should prevent multiple writers to the MDT. I am assuming the setup is Spark streaming with async compaction or clustering: a single process, but multiple threads trying to ingest to the MDT. If the in-process lock provider is not kicking in, then it's a bug.
If async clustering is in the same process, we don't run into the issue for now. But for multiple writers, like offline clustering in another process, as indicated by @danny0405, we should have a ZK lock provider to serialize the MDT writes.
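For the multi-process case, the lock configuration would look something like this (host names, paths, and the lock key are placeholders):

```
hoodie.write.concurrency.mode=optimistic_concurrency_control
hoodie.write.lock.provider=org.apache.hudi.client.transaction.lock.ZookeeperBasedLockProvider
hoodie.write.lock.zookeeper.url=zk1.example.com,zk2.example.com,zk3.example.com
hoodie.write.lock.zookeeper.port=2181
hoodie.write.lock.zookeeper.base_path=/hudi/locks
hoodie.write.lock.zookeeper.lock_key=my_table
hoodie.cleaner.policy.failed.writes=LAZY
```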
Hi @danny0405 & @nsivabalan, even though I configure the HMS lock provider for multiple writers, we still run into issues: when we query the table, the MDT is corrupted.
d1f683ef-8927-4df8-9e12-b769b5980b46-0_358-1002-89404_20240410185726547.parquet: No such file or directory!
@Qiuzhuang Can you provide the lock configurations you set? Did you set hoodie.cleaner.policy.failed.writes=LAZY?
Sure, here is the configuration for offline clustering:
```
hoodie.clustering.async.enabled=true
hoodie.clustering.async.max.commits=4
hoodie.clustering.plan.strategy.target.file.max.bytes=1073741824
hoodie.clustering.plan.strategy.small.file.limit=419430400
hoodie.clustering.plan.strategy.max.num.groups=400
hoodie.clustering.execution.strategy.class=org.apache.hudi.client.clustering.run.strategy.SparkSortAndSizeExecutionStrategy
hoodie.clustering.plan.strategy.sort.columns=xx1,xx2
hoodie.layout.optimize.strategy=z-order
hoodie.write.concurrency.mode=optimistic_concurrency_control
hoodie.write.lock.provider=org.apache.hudi.hive.transaction.lock.HiveMetastoreBasedLockProvider
hoodie.write.lock.hivemetastore.database=xxx
hoodie.write.lock.hivemetastore.table=locker_xxx
hoodie.cleaner.policy.failed.writes=LAZY
hoodie.write.concurrency.early.conflict.detection.enable=true
```
We are also looking into this issue with our cloud vendor, FYI.
https://hudi.apache.org/docs/metadata#deployment-model-b-single-writer-with-async-table-services
If we enable async table services with the MDT, we should configure a lock.
Maybe we should set the default value of hoodie.datasource.compaction.async.enable to false, or make the metadata table non-blocking. It's confusing to users that a single writer needs a lock by default. @danny0405
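Until then, one possible workaround for a single-writer streaming job is to fall back to inline compaction so only one writer touches the MDT at a time (a sketch, trading ingestion latency for safety):

```
hoodie.datasource.compaction.async.enable=false
hoodie.compact.inline=true
hoodie.compact.inline.max.delta.commits=5
```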
Describe the problem you faced
Spark Structured Streaming ingests data with hoodie.metadata.enable=true. The async compaction writes a DELTACOMMIT instant to the MDT; because the compaction is asynchronous, the data writer rolls back the inflight deltacommit in the MDT. When the compaction finishes, the compaction writer finds that the inflight deltacommit does not exist and throws an exception.
To Reproduce
Steps to reproduce the behavior:
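A minimal sketch of the setup described above (the source, schema, record key, and paths are placeholders, and the Kafka value parsing is elided):

```scala
import org.apache.spark.sql.streaming.Trigger

// assumes a SparkSession named `spark`, e.g. in spark-shell
val source = spark.readStream
  .format("kafka")
  .option("kafka.bootstrap.servers", "broker:9092") // placeholder
  .option("subscribe", "events")                    // placeholder
  .load()
  // projection of the Kafka value into columns such as id, ts elided

source.writeStream
  .format("hudi")
  .option("hoodie.table.name", "events_mor")
  .option("hoodie.datasource.write.table.type", "MERGE_ON_READ")
  .option("hoodie.datasource.write.recordkey.field", "id")
  .option("hoodie.datasource.write.precombine.field", "ts")
  .option("hoodie.metadata.enable", "true")
  // async compaction kicks in by default for streaming MOR writes
  .option("hoodie.datasource.compaction.async.enable", "true")
  .option("checkpointLocation", "/tmp/checkpoints/events_mor")
  .trigger(Trigger.ProcessingTime("60 seconds"))
  .start("hdfs:///data/hudi/events_mor")
```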
Expected behavior
The async compaction should complete successfully; its inflight deltacommit in the MDT should not be rolled back by the ingestion writer.
Environment Description
Hudi version : 0.14.1
Spark version : 3.3.2
Hive version :
Hadoop version : 3.3.0
Storage (HDFS/S3/GCS..) : hdfs
Running on Docker? (yes/no) : no
Additional context
Stacktrace