apache / hudi

Upserts, Deletes And Incremental Processing on Big Data.
https://hudi.apache.org/
Apache License 2.0
5.36k stars 2.42k forks source link

[SUPPORT] Hudi Application getting stuck when Async cleaner is spawned #7364

Open BalaMahesh opened 1 year ago

BalaMahesh commented 1 year ago

Tips before filing an issue

Describe the problem you faced

After some failed commits, Hudi application thread-pool is stuck while acquiring the lock without any progress.

A clear and concise description of the problem.

When Hudi application is stopped with some error or ungracefully for some time, when it is restarted it is getting stuck with the below logs without any progress for ~115 minutes.

To Reproduce

Steps to reproduce the behavior:

  1. Start Hudi application and run it for sometime
  2. Kill application in the middle of the commit and then restart again
  3. Use InProcessLockProvider and enable async cleaning and compaction.

Expected behavior Pipeline should not stuck and run without error. A clear and concise description of what you expected to happen.

Environment Description

Additional context

config:

cleaner config

hoodie.cleaner.policy=KEEP_LATEST_COMMITS
hoodie.clean.automatic=true
hoodie.clean.async=true
hoodie.cleaner.commits.retained=5
hoodie.keep.min.commits=10
#compaction config
hoodie.datasource.compaction.async.enable=true
hoodie.parquet.small.file.limit=1048576
hoodie.compaction.target.io=50
hoodie.metadata.metrics.enable=true
#metadata table
hoodie.metadata.index.bloom.filter.enable=false
hoodie.metadata.index.column.stats.enable=false
#Single writer with async table services
hoodie.write.concurrency.mode=optimistic_concurrency_control
hoodie.cleaner.policy.failed.writes=LAZY
hoodie.write.lock.provider=org.apache.hudi.client.transaction.lock.InProcessLockProvider
hoodie.write.lock.wait_time_ms=300000

Add any other context about the problem here.

Stacktrace

22/12/02 08:39:31 INFO pool-31-thread-1 BaseHoodieWriteClient: Async cleaner has been spawned. Waiting for it to finish
22/12/02 08:39:31 INFO pool-31-thread-1 AsyncCleanerService: Waiting for async clean service to finish
22/12/02 08:39:37 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 08:39:47 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 08:39:55 INFO dispatcher-BlockManagerMaster BlockManagerInfo: Removed broadcast_48_piece0 on hudi-prod-data-mor-b17eab84d1f75c1f-driver-svc.spark-jobs.svc:7079 in memory (size: 36.2 KiB, free: 1663.2 MiB)
22/12/02 08:39:55 INFO dispatcher-BlockManagerMaster BlockManagerInfo: Removed broadcast_48_piece0 on 10.207.18.169:46689 in memory (size: 36.2 KiB, free: 1048.4 MiB)
22/12/02 08:39:57 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 08:40:07 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 08:40:17 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 08:40:27 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 08:40:37 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 08:40:47 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 08:40:57 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 08:41:07 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 08:41:17 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 08:41:27 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 08:41:37 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 08:41:47 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 08:41:57 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 08:42:07 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 08:42:17 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 08:42:27 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 08:42:37 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 08:42:47 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 08:42:57 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 08:43:07 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 08:43:17 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 08:43:27 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 08:43:37 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 08:43:47 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 08:43:57 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 08:44:07 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 08:44:12 INFO pool-36-thread-1 InProcessLockProvider: Thread pool-36-thread-1 FAILED_TO_ACQUIRE in-process lock.
22/12/02 08:44:12 INFO pool-36-thread-1 LockManager: Retrying to acquire lock...
22/12/02 08:44:17 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 08:44:22 INFO pool-36-thread-1 InProcessLockProvider: Thread pool-36-thread-1 ACQUIRING in-process lock.
22/12/02 08:44:27 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 08:44:37 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 08:44:47 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 08:44:57 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 08:45:07 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 08:45:17 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 08:45:27 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 08:45:37 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 08:45:47 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 08:45:57 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 08:46:07 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 08:46:17 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 08:46:27 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 08:46:37 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 08:46:47 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 08:46:57 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 08:47:07 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 08:47:17 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 08:47:27 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 08:47:37 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 08:47:47 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 08:47:57 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 08:48:07 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 08:48:17 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 08:48:27 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 08:48:37 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 08:48:47 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 08:48:57 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 08:49:07 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 08:49:17 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 08:49:22 INFO pool-36-thread-1 InProcessLockProvider: Thread pool-36-thread-1 FAILED_TO_ACQUIRE in-process lock.
22/12/02 08:49:22 INFO pool-36-thread-1 LockManager: Retrying to acquire lock...
22/12/02 08:49:27 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 08:49:32 INFO pool-36-thread-1 InProcessLockProvider: Thread pool-36-thread-1 ACQUIRING in-process lock.
22/12/02 08:49:37 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 08:49:47 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 08:49:57 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 08:50:07 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 08:50:17 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 08:50:27 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 08:50:37 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 08:50:47 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 08:50:57 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 08:51:07 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 08:51:17 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 08:51:27 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 08:51:37 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 08:51:47 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 08:51:57 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 08:52:07 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 08:52:17 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 08:52:27 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 08:52:37 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 08:52:47 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 08:52:57 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 08:53:07 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 08:53:17 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 08:53:27 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 08:53:37 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 08:53:47 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 08:53:57 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 08:54:07 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 08:54:17 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 08:54:27 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 08:54:32 INFO pool-36-thread-1 InProcessLockProvider: Thread pool-36-thread-1 FAILED_TO_ACQUIRE in-process lock.
22/12/02 08:54:32 INFO pool-36-thread-1 LockManager: Retrying to acquire lock...
22/12/02 08:54:37 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 08:54:42 INFO pool-36-thread-1 InProcessLockProvider: Thread pool-36-thread-1 ACQUIRING in-process lock.
22/12/02 08:54:47 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 08:54:57 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 08:55:07 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 08:55:17 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 08:55:27 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 08:55:37 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 08:55:47 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 08:55:57 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 08:56:07 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 08:56:17 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 08:56:27 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 08:56:37 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 08:56:47 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 08:56:57 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 08:57:07 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 08:57:17 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 08:57:27 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 08:57:37 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 08:57:47 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 08:57:57 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 08:58:07 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 08:58:17 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 08:58:27 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 08:58:37 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 08:58:47 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 08:58:57 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 08:59:07 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 08:59:17 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 08:59:27 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 08:59:37 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 08:59:42 INFO pool-36-thread-1 InProcessLockProvider: Thread pool-36-thread-1 FAILED_TO_ACQUIRE in-process lock.
22/12/02 08:59:42 INFO pool-36-thread-1 LockManager: Retrying to acquire lock...
22/12/02 08:59:47 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 08:59:52 INFO pool-36-thread-1 InProcessLockProvider: Thread pool-36-thread-1 ACQUIRING in-process lock.
22/12/02 08:59:57 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 09:00:07 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 09:00:17 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 09:00:27 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 09:00:37 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 09:00:47 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 09:00:57 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 09:01:07 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 09:01:17 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 09:01:27 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 09:01:37 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 09:01:47 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 09:01:57 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 09:02:07 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 09:02:17 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 09:02:27 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 09:02:37 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 09:02:47 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 09:02:57 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 09:03:07 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 09:03:17 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 09:03:27 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 09:03:37 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 09:03:47 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 09:03:57 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 09:04:07 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 09:04:17 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 09:04:27 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 09:04:37 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 09:04:47 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 09:04:52 INFO pool-36-thread-1 InProcessLockProvider: Thread pool-36-thread-1 FAILED_TO_ACQUIRE in-process lock.
22/12/02 09:04:52 INFO pool-36-thread-1 LockManager: Retrying to acquire lock...
22/12/02 09:04:57 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 09:05:02 INFO pool-36-thread-1 InProcessLockProvider: Thread pool-36-thread-1 ACQUIRING in-process lock.
22/12/02 09:05:07 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 09:05:17 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 09:05:27 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 09:05:37 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 09:05:47 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 09:05:57 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 09:06:07 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 09:06:17 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 09:06:27 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 09:06:37 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 09:06:47 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 09:06:57 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 09:07:07 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 09:07:17 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 09:07:27 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 09:07:37 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 09:07:47 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 09:07:57 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 09:08:07 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 09:08:17 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
22/12/02 09:08:27 INFO async_compact_thread-thread-1 HoodieAsyncService: Waiting for next instant up to 10 seconds
xushiyan commented 1 year ago

@nsivabalan have you already looked into this? related to rollback and cleaner entanglement under lazy mode.

JuanAmayaBT commented 4 months ago

any news on this? I am using hudi 0.14.1 on aws glue and getting from time to time the following error that seems to be related to this issue: Error waiting for async clean service to finish

    spark_df.write.format('hudi').options(**hudi_final_settings).mode('Append').save()
  File "/opt/amazon/spark/python/lib/pyspark.zip/pyspark/sql/readwriter.py", line 966, in save
    self._jwrite.save()
  File "/opt/amazon/spark/python/lib/py4j-0.10.9.5-src.zip/py4j/java_gateway.py", line 1321, in __call__
    return_value = get_return_value(
  File "/opt/amazon/spark/python/lib/pyspark.zip/pyspark/sql/utils.py", line 190, in deco
    return f(*a, **kw)
  File "/opt/amazon/spark/python/lib/py4j-0.10.9.5-src.zip/py4j/protocol.py", line 326, in get_return_value
    raise Py4JJavaError(
py4j.protocol.Py4JJavaError: An error occurred while calling o169.save.
: org.apache.hudi.exception.HoodieException: Error waiting for async clean service to finish
    at org.apache.hudi.async.AsyncCleanerService.waitForCompletion(AsyncCleanerService.java:77)
    at org.apache.hudi.client.BaseHoodieTableServiceClient.asyncClean(BaseHoodieTableServiceClient.java:133)
    at org.apache.hudi.client.BaseHoodieWriteClient.autoCleanOnCommit(BaseHoodieWriteClient.java:595)
    at org.apache.hudi.client.BaseHoodieWriteClient.mayBeCleanAndArchive(BaseHoodieWriteClient.java:579)
    at org.apache.hudi.client.BaseHoodieWriteClient.commitStats(BaseHoodieWriteClient.java:248)
    at org.apache.hudi.client.SparkRDDWriteClient.commit(SparkRDDWriteClient.java:104)
    at org.apache.hudi.HoodieSparkSqlWriterInternal.commitAndPerformPostOperations(HoodieSparkSqlWriter.scala:1081)
    at org.apache.hudi.HoodieSparkSqlWriterInternal.writeInternal(HoodieSparkSqlWriter.scala:520)
    at org.apache.hudi.HoodieSparkSqlWriterInternal.write(HoodieSparkSqlWriter.scala:204)
    at org.apache.hudi.HoodieSparkSqlWriter$.write(HoodieSparkSqlWriter.scala:121)
    at org.apache.hudi.DefaultSource.createRelation(DefaultSource.scala:150)
    at org.apache.spark.sql.execution.datasources.SaveIntoDataSourceCommand.run(SaveIntoDataSourceCommand.scala:45)
    at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:75)
    at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:73)
    at org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:84)
    at org.apache.spark.sql.execution.QueryExecution$$anonfun$eagerlyExecuteCommands$1.$anonfun$applyOrElse$1(QueryExecution.scala:103)
    at org.apache.spark.sql.catalyst.QueryPlanningTracker$.withTracker(QueryPlanningTracker.scala:107)
    at org.apache.spark.sql.execution.SQLExecution$.withTracker(SQLExecution.scala:224)
    at org.apache.spark.sql.execution.SQLExecution$.executeQuery$1(SQLExecution.scala:114)
    at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$7(SQLExecution.scala:139)
    at org.apache.spark.sql.catalyst.QueryPlanningTracker$.withTracker(QueryPlanningTracker.scala:107)
    at org.apache.spark.sql.execution.SQLExecution$.withTracker(SQLExecution.scala:224)
    at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$6(SQLExecution.scala:139)
    at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:245)
    at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$1(SQLExecution.scala:138)
    at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:779)
    at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:68)
    at org.apache.spark.sql.execution.QueryExecution$$anonfun$eagerlyExecuteCommands$1.applyOrElse(QueryExecution.scala:100)
    at org.apache.spark.sql.execution.QueryExecution$$anonfun$eagerlyExecuteCommands$1.applyOrElse(QueryExecution.scala:96)
    at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$transformDownWithPruning$1(TreeNode.scala:615)
    at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:177)
    at org.apache.spark.sql.catalyst.trees.TreeNode.transformDownWithPruning(TreeNode.scala:615)
    at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.org$apache$spark$sql$catalyst$plans$logical$AnalysisHelper$$super$transformDownWithPruning(LogicalPlan.scala:30)
    at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.transformDownWithPruning(AnalysisHelper.scala:267)
    at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.transformDownWithPruning$(AnalysisHelper.scala:263)
    at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.transformDownWithPruning(LogicalPlan.scala:30)
    at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.transformDownWithPruning(LogicalPlan.scala:30)
    at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:591)
    at org.apache.spark.sql.execution.QueryExecution.eagerlyExecuteCommands(QueryExecution.scala:96)
    at org.apache.spark.sql.execution.QueryExecution.commandExecuted$lzycompute(QueryExecution.scala:83)
    at org.apache.spark.sql.execution.QueryExecution.commandExecuted(QueryExecution.scala:81)
    at org.apache.spark.sql.execution.QueryExecution.assertCommandExecuted(QueryExecution.scala:124)
    at org.apache.spark.sql.DataFrameWriter.runCommand(DataFrameWriter.scala:860)
    at org.apache.spark.sql.DataFrameWriter.saveToV1Source(DataFrameWriter.scala:390)
    at org.apache.spark.sql.DataFrameWriter.saveInternal(DataFrameWriter.scala:363)
    at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:247)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
    at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
    at py4j.Gateway.invoke(Gateway.java:282)
    at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
    at py4j.commands.CallCommand.execute(CallCommand.java:79)
    at py4j.ClientServerConnection.waitForCommands(ClientServerConnection.java:182)
    at py4j.ClientServerConnection.run(ClientServerConnection.java:106)
    at java.lang.Thread.run(Thread.java:750)
Caused by: java.util.concurrent.ExecutionException: org.apache.hudi.exception.HoodieException: Failed to apply clean commit to metadata
    at java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357)
    at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1908)
    at org.apache.hudi.async.HoodieAsyncService.waitForShutdown(HoodieAsyncService.java:103)
    at org.apache.hudi.async.AsyncCleanerService.waitForCompletion(AsyncCleanerService.java:75)
    ... 57 more
Caused by: org.apache.hudi.exception.HoodieException: Failed to apply clean commit to metadata
    at org.apache.hudi.table.action.BaseActionExecutor.writeTableMetadata(BaseActionExecutor.java:91)
    at org.apache.hudi.table.action.clean.CleanActionExecutor.runClean(CleanActionExecutor.java:227)
    at org.apache.hudi.table.action.clean.CleanActionExecutor.runPendingClean(CleanActionExecutor.java:193)
    at org.apache.hudi.table.action.clean.CleanActionExecutor.execute(CleanActionExecutor.java:263)
    at org.apache.hudi.table.HoodieSparkCopyOnWriteTable.clean(HoodieSparkCopyOnWriteTable.java:291)
    at org.apache.hudi.client.BaseHoodieTableServiceClient.clean(BaseHoodieTableServiceClient.java:763)
    at org.apache.hudi.client.BaseHoodieWriteClient.clean(BaseHoodieWriteClient.java:862)
    at org.apache.hudi.client.BaseHoodieWriteClient.clean(BaseHoodieWriteClient.java:835)
    at org.apache.hudi.async.AsyncCleanerService.lambda$startService$0(AsyncCleanerService.java:55)
    at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1604)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    ... 1 more
Caused by: java.lang.IllegalArgumentException
    at org.apache.hudi.common.util.ValidationUtils.checkArgument(ValidationUtils.java:33)
    at org.apache.hudi.common.table.timeline.HoodieActiveTimeline.transitionState(HoodieActiveTimeline.java:618)
    at org.apache.hudi.common.table.timeline.HoodieActiveTimeline.transitionState(HoodieActiveTimeline.java:599)
    at org.apache.hudi.common.table.timeline.HoodieActiveTimeline.saveAsComplete(HoodieActiveTimeline.java:224)
    at org.apache.hudi.table.action.commit.BaseSparkCommitActionExecutor.commit(BaseSparkCommitActionExecutor.java:311)
    at org.apache.hudi.table.action.commit.BaseCommitActionExecutor.autoCommit(BaseCommitActionExecutor.java:201)
    at org.apache.hudi.table.action.commit.BaseCommitActionExecutor.commitOnAutoCommit(BaseCommitActionExecutor.java:183)
    at org.apache.hudi.table.action.commit.BaseSparkCommitActionExecutor.updateIndexAndCommitIfNeeded(BaseSparkCommitActionExecutor.java:279)
    at org.apache.hudi.table.action.commit.BaseSparkCommitActionExecutor.execute(BaseSparkCommitActionExecutor.java:184)
    at org.apache.hudi.table.action.deltacommit.SparkUpsertPreppedDeltaCommitActionExecutor.execute(SparkUpsertPreppedDeltaCommitActionExecutor.java:44)
    at org.apache.hudi.table.HoodieSparkMergeOnReadTable.upsertPrepped(HoodieSparkMergeOnReadTable.java:126)
    at org.apache.hudi.table.HoodieSparkMergeOnReadTable.upsertPrepped(HoodieSparkMergeOnReadTable.java:88)
    at org.apache.hudi.client.SparkRDDWriteClient.upsertPreppedRecords(SparkRDDWriteClient.java:156)
    at org.apache.hudi.client.SparkRDDWriteClient.upsertPreppedRecords(SparkRDDWriteClient.java:63)
    at org.apache.hudi.metadata.HoodieBackedTableMetadataWriter.commitInternal(HoodieBackedTableMetadataWriter.java:1099)
    at org.apache.hudi.metadata.SparkHoodieBackedTableMetadataWriter.commit(SparkHoodieBackedTableMetadataWriter.java:117)
    at org.apache.hudi.metadata.HoodieBackedTableMetadataWriter.processAndCommit(HoodieBackedTableMetadataWriter.java:810)
    at org.apache.hudi.metadata.HoodieBackedTableMetadataWriter.update(HoodieBackedTableMetadataWriter.java:899)
    at org.apache.hudi.table.action.BaseActionExecutor.writeTableMetadata(BaseActionExecutor.java:86)
    ... 12 more