apache / pinot

Apache Pinot - A realtime distributed OLAP datastore
https://pinot.apache.org/
Apache License 2.0

Improve Retention Manager side segment lineage clean up #13171

Open snleee opened 2 weeks ago

snleee commented 2 weeks ago

While the retention manager cleans up the segment lineage and deletes segments, we observe that it frequently fails when many segment uploads are happening on the controller.

The segment upload path is already serialized using a synchronized block based on the table lock (see PinotHelixResourceManager.assignTableSegment()).

The retention manager, on the other hand, tries to update the ideal state without grabbing the lock, so its updates frequently fail.
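
To illustrate the conflict described above, here is a minimal, self-contained sketch. It assumes the ideal state behaves like a versioned record whose write only succeeds when the version has not changed since it was read. `VersionedRecord` and `TableLocks` are hypothetical names made up for this illustration; they are not Pinot or Helix classes.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.UnaryOperator;

/** Hypothetical stand-in for a versioned record such as the ideal state; not Pinot or Helix code. */
class VersionedRecord {
  /** Immutable snapshot: a value plus the version at which it was read. */
  static final class Snapshot {
    final int version;
    final String value;

    Snapshot(int version, String value) {
      this.version = version;
      this.value = value;
    }
  }

  private final AtomicReference<Snapshot> current = new AtomicReference<>(new Snapshot(0, ""));

  Snapshot read() {
    return current.get();
  }

  /** Writes only if nobody else has written since expectedVersion was read. */
  boolean writeIfVersionMatches(int expectedVersion, String newValue) {
    Snapshot seen = current.get();
    return seen.version == expectedVersion
        && current.compareAndSet(seen, new Snapshot(expectedVersion + 1, newValue));
  }
}

/** Hypothetical per-table lock registry, mirroring the idea of a table lock. */
class TableLocks {
  private static final Map<String, Object> LOCKS = new ConcurrentHashMap<>();

  static Object lockFor(String tableName) {
    return LOCKS.computeIfAbsent(tableName, name -> new Object());
  }

  /** Read-modify-write under the table lock, so writers sharing the lock never race each other. */
  static boolean updateUnderLock(String tableName, VersionedRecord record, UnaryOperator<String> updater) {
    synchronized (lockFor(tableName)) {
      VersionedRecord.Snapshot snapshot = record.read();
      return record.writeIfVersionMatches(snapshot.version, updater.apply(snapshot.value));
    }
  }
}
```

A writer that reads a snapshot, lets a locked writer commit, and then calls `writeIfVersionMatches` with the stale version gets `false` back. That is the situation the unlocked retention manager update would be in during heavy uploads, under the assumptions of this sketch.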

Potential Improvements:

  1. Use a better retry policy for updating the segment lineage (see DEFAULT_TABLE_IDEALSTATES_UPDATE_RETRY_POLICY).
  2. Grab the table lock for the ideal state update in the retention manager (see PinotHelixResourceManager.assignTableSegment()).
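
As a rough illustration of the first option, the sketch below retries an operation with a randomized, exponentially growing delay, so that a writer that keeps losing the race spreads its attempts out instead of retrying in lockstep with the uploads. `BackoffRetry` and its parameters are hypothetical and written only for this example; it does not use Pinot's retry classes, and the attempt count and delays that would suit the retention manager are not known here.

```java
import java.util.concurrent.ThreadLocalRandom;
import java.util.function.BooleanSupplier;

/** Hypothetical retry helper for illustration; not Pinot's BaseRetryPolicy. */
final class BackoffRetry {
  private BackoffRetry() {
  }

  /**
   * Runs the operation until it returns true or maxAttempts is reached.
   * After the n-th failure the helper sleeps for a random time in
   * [0, initialDelayMs * scaleFactor^(n-1)) milliseconds.
   *
   * @return the number of attempts used, or -1 if every attempt failed or the thread was interrupted
   */
  static int attempt(BooleanSupplier operation, int maxAttempts, long initialDelayMs, double scaleFactor) {
    double delayBoundMs = initialDelayMs;
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      if (operation.getAsBoolean()) {
        return attempt;
      }
      if (attempt < maxAttempts) {
        // nextLong(bound) requires a positive bound, so clamp to at least 1 ms.
        long bound = Math.max(1L, (long) delayBoundMs);
        try {
          Thread.sleep(ThreadLocalRandom.current().nextLong(bound));
        } catch (InterruptedException e) {
          // Preserve the interrupt flag and give up.
          Thread.currentThread().interrupt();
          return -1;
        }
        delayBoundMs *= scaleFactor;
      }
    }
    return -1;
  }
}
```

A call such as `BackoffRetry.attempt(() -> tryUpdate(), 10, 100L, 2.0)` would make up to ten attempts, where `tryUpdate` is a placeholder for whatever versioned write is being retried. The second option (taking the table lock) avoids the contention with uploads altogether rather than waiting it out, so the two approaches could also be combined.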

Here is an example where the ideal state update failed during a segment delete from the retention manager:

java.lang.RuntimeException: Caught exception while updating ideal state for resource: tableXXX_OFFLINE
        at org.apache.pinot.common.utils.helix.HelixHelper.updateIdealState(HelixHelper.java:203) ~[startree-pinot-all-1.2.0-ST.10.1-jar-with-dependencies.jar:1.2.0-ST.10.1-8711a0aa760c824774f3ceb1a2913878a31071ec]
        at org.apache.pinot.common.utils.helix.HelixHelper.updateIdealState(HelixHelper.java:232) ~[startree-pinot-all-1.2.0-ST.10.1-jar-with-dependencies.jar:1.2.0-ST.10.1-8711a0aa760c824774f3ceb1a2913878a31071ec]
        at org.apache.pinot.common.utils.helix.HelixHelper.removeSegmentsFromIdealState(HelixHelper.java:503) ~[startree-pinot-all-1.2.0-ST.10.1-jar-with-dependencies.jar:1.2.0-ST.10.1-8711a0aa760c824774f3ceb1a2913878a31071ec]
        at org.apache.pinot.controller.helix.core.PinotHelixResourceManager.deleteSegments(PinotHelixResourceManager.java:1030) ~[startree-pinot-all-1.2.0-ST.10.1-jar-with-dependencies.jar:1.2.0-ST.10.1-8711a0aa760c824774f3ceb1a2913878a31071ec]
        at org.apache.pinot.controller.helix.core.PinotHelixResourceManager.deleteSegments(PinotHelixResourceManager.java:1013) ~[startree-pinot-all-1.2.0-ST.10.1-jar-with-dependencies.jar:1.2.0-ST.10.1-8711a0aa760c824774f3ceb1a2913878a31071ec]
        at org.apache.pinot.controller.helix.core.retention.RetentionManager.lambda$manageSegmentLineageCleanupForTable$0(RetentionManager.java:223) ~[startree-pinot-all-1.2.0-ST.10.1-jar-with-dependencies.jar:1.2.0-ST.10.1-8711a0aa760c824774f3ceb1a2913878a31071ec]
        at org.apache.pinot.spi.utils.retry.BaseRetryPolicy.attempt(BaseRetryPolicy.java:58) ~[startree-pinot-all-1.2.0-ST.10.1-jar-with-dependencies.jar:1.2.0-ST.10.1-8711a0aa760c824774f3ceb1a2913878a31071ec]
        at org.apache.pinot.controller.helix.core.retention.RetentionManager.manageSegmentLineageCleanupForTable(RetentionManager.java:195) ~[startree-pinot-all-1.2.0-ST.10.1-jar-with-dependencies.jar:1.2.0-ST.10.1-8711a0aa760c824774f3ceb1a2913878a31071ec]
        at org.apache.pinot.controller.helix.core.retention.RetentionManager.processTable(RetentionManager.java:86) ~[startree-pinot-all-1.2.0-ST.10.1-jar-with-dependencies.jar:1.2.0-ST.10.1-8711a0aa760c824774f3ceb1a2913878a31071ec]
        at org.apache.pinot.controller.helix.core.periodictask.ControllerPeriodicTask.processTable(ControllerPeriodicTask.java:145) ~[startree-pinot-all-1.2.0-ST.10.1-jar-with-dependencies.jar:1.2.0-ST.10.1-8711a0aa760c824774f3ceb1a2913878a31071ec]
        at org.apache.pinot.controller.helix.core.periodictask.ControllerPeriodicTask.processTables(ControllerPeriodicTask.java:118) ~[startree-pinot-all-1.2.0-ST.10.1-jar-with-dependencies.jar:1.2.0-ST.10.1-8711a0aa760c824774f3ceb1a2913878a31071ec]
        at org.apache.pinot.controller.helix.core.periodictask.ControllerPeriodicTask.runTask(ControllerPeriodicTask.java:81) ~[startree-pinot-all-1.2.0-ST.10.1-jar-with-dependencies.jar:1.2.0-ST.10.1-8711a0aa760c824774f3ceb1a2913878a31071ec]
        at org.apache.pinot.core.periodictask.BasePeriodicTask.run(BasePeriodicTask.java:150) ~[startree-pinot-all-1.2.0-ST.10.1-jar-with-dependencies.jar:1.2.0-ST.10.1-8711a0aa760c824774f3ceb1a2913878a31071ec]
        at org.apache.pinot.core.periodictask.BasePeriodicTask.run(BasePeriodicTask.java:135) ~[startree-pinot-all-1.2.0-ST.10.1-jar-with-dependencies.jar:1.2.0-ST.10.1-8711a0aa760c824774f3ceb1a2913878a31071ec]
        at org.apache.pinot.core.periodictask.PeriodicTaskScheduler.lambda$start$0(PeriodicTaskScheduler.java:87) ~[startree-pinot-all-1.2.0-ST.10.1-jar-with-dependencies.jar:1.2.0-ST.10.1-8711a0aa760c824774f3ceb1a2913878a31071ec]
        at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:572) ~[?:?]
        at java.base/java.util.concurrent.FutureTask.runAndReset(FutureTask.java:358) ~[?:?]
        at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:305) ~[?:?]
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144) ~[?:?]
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642) ~[?:?]
        at java.base/java.lang.Thread.run(Thread.java:1583) [?:?]
Caused by: org.apache.pinot.spi.utils.retry.AttemptsExceededException: Operation failed after 5 attempts
        at org.apache.pinot.spi.utils.retry.BaseRetryPolicy.attempt(BaseRetryPolicy.java:65) ~[startree-pinot-all-1.2.0-ST.10.1-jar-with-dependencies.jar:1.2.0-ST.10.1-8711a0aa760c824774f3ceb1a2913878a31071ec]
        at org.apache.pinot.common.utils.helix.HelixHelper.updateIdealState(HelixHelper.java:104) ~[startree-pinot-all-1.2.0-ST.10.1-jar-with-dependencies.jar:1.2.0-ST.10.1-8711a0aa760c824774f3ceb1a2913878a31071ec]
        ... 20 more

We also observe that updating the segment lineage znode sometimes fails:

2024/05/16 01:59:39.041 ERROR [RetentionManager] [pool-20-thread-4] Failed to clean up the segment lineage. (tableName = tableXXX_OFFLINE)
org.apache.pinot.spi.utils.retry.AttemptsExceededException: Operation failed after 5 attempts
        at org.apache.pinot.spi.utils.retry.BaseRetryPolicy.attempt(BaseRetryPolicy.java:65) ~[startree-pinot-all-1.2.0-ST.10.1-jar-with-dependencies.jar:1.2.0-ST.10.1-8711a0aa760c824774f3ceb1a2913878a31071ec]
        at org.apache.pinot.controller.helix.core.retention.RetentionManager.manageSegmentLineageCleanupForTable(RetentionManager.java:195) ~[startree-pinot-all-1.2.0-ST.10.1-jar-with-dependencies.jar:1.2.0-ST.10.1-8711a0aa760c824774f3ceb1a2913878a31071ec]
        at org.apache.pinot.controller.helix.core.retention.RetentionManager.processTable(RetentionManager.java:86) ~[startree-pinot-all-1.2.0-ST.10.1-jar-with-dependencies.jar:1.2.0-ST.10.1-8711a0aa760c824774f3ceb1a2913878a31071ec]
        at org.apache.pinot.controller.helix.core.periodictask.ControllerPeriodicTask.processTable(ControllerPeriodicTask.java:145) ~[startree-pinot-all-1.2.0-ST.10.1-jar-with-dependencies.jar:1.2.0-ST.10.1-8711a0aa760c824774f3ceb1a2913878a31071ec]
        at org.apache.pinot.controller.helix.core.periodictask.ControllerPeriodicTask.processTables(ControllerPeriodicTask.java:118) ~[startree-pinot-all-1.2.0-ST.10.1-jar-with-dependencies.jar:1.2.0-ST.10.1-8711a0aa760c824774f3ceb1a2913878a31071ec]
        at org.apache.pinot.controller.helix.core.periodictask.ControllerPeriodicTask.runTask(ControllerPeriodicTask.java:81) ~[startree-pinot-all-1.2.0-ST.10.1-jar-with-dependencies.jar:1.2.0-ST.10.1-8711a0aa760c824774f3ceb1a2913878a31071ec]
        at org.apache.pinot.core.periodictask.BasePeriodicTask.run(BasePeriodicTask.java:150) ~[startree-pinot-all-1.2.0-ST.10.1-jar-with-dependencies.jar:1.2.0-ST.10.1-8711a0aa760c824774f3ceb1a2913878a31071ec]
        at org.apache.pinot.core.periodictask.BasePeriodicTask.run(BasePeriodicTask.java:135) ~[startree-pinot-all-1.2.0-ST.10.1-jar-with-dependencies.jar:1.2.0-ST.10.1-8711a0aa760c824774f3ceb1a2913878a31071ec]
        at org.apache.pinot.core.periodictask.PeriodicTaskScheduler.lambda$start$0(PeriodicTaskScheduler.java:87) ~[startree-pinot-all-1.2.0-ST.10.1-jar-with-dependencies.jar:1.2.0-ST.10.1-8711a0aa760c824774f3ceb1a2913878a31071ec]
        at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:572) ~[?:?]
        at java.base/java.util.concurrent.FutureTask.runAndReset(FutureTask.java:358) ~[?:?]
        at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:305) ~[?:?]
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144) ~[?:?]
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642) ~[?:?]
        at java.base/java.lang.Thread.run(Thread.java:1583) [?:?]
2024/05/16 01:59:39.041 ERROR [ControllerPeriodicTask] [pool-20-thread-4] Caught exception while processing table: tableXXX_OFFLINE in task: RetentionManager
java.lang.RuntimeException: Failed to clean up the segment lineage. (tableName = tableXXX_OFFLINE)
        at org.apache.pinot.controller.helix.core.retention.RetentionManager.manageSegmentLineageCleanupForTable(RetentionManager.java:237) ~[startree-pinot-all-1.2.0-ST.10.1-jar-with-dependencies.jar:1.2.0-ST.10.1-8711a0aa760c824774f3ceb1a2913878a31071ec]
        at org.apache.pinot.controller.helix.core.retention.RetentionManager.processTable(RetentionManager.java:86) ~[startree-pinot-all-1.2.0-ST.10.1-jar-with-dependencies.jar:1.2.0-ST.10.1-8711a0aa760c824774f3ceb1a2913878a31071ec]
        at org.apache.pinot.controller.helix.core.periodictask.ControllerPeriodicTask.processTable(ControllerPeriodicTask.java:145) ~[startree-pinot-all-1.2.0-ST.10.1-jar-with-dependencies.jar:1.2.0-ST.10.1-8711a0aa760c824774f3ceb1a2913878a31071ec]
        at org.apache.pinot.controller.helix.core.periodictask.ControllerPeriodicTask.processTables(ControllerPeriodicTask.java:118) ~[startree-pinot-all-1.2.0-ST.10.1-jar-with-dependencies.jar:1.2.0-ST.10.1-8711a0aa760c824774f3ceb1a2913878a31071ec]
        at org.apache.pinot.controller.helix.core.periodictask.ControllerPeriodicTask.runTask(ControllerPeriodicTask.java:81) ~[startree-pinot-all-1.2.0-ST.10.1-jar-with-dependencies.jar:1.2.0-ST.10.1-8711a0aa760c824774f3ceb1a2913878a31071ec]
        at org.apache.pinot.core.periodictask.BasePeriodicTask.run(BasePeriodicTask.java:150) ~[startree-pinot-all-1.2.0-ST.10.1-jar-with-dependencies.jar:1.2.0-ST.10.1-8711a0aa760c824774f3ceb1a2913878a31071ec]
        at org.apache.pinot.core.periodictask.BasePeriodicTask.run(BasePeriodicTask.java:135) ~[startree-pinot-all-1.2.0-ST.10.1-jar-with-dependencies.jar:1.2.0-ST.10.1-8711a0aa760c824774f3ceb1a2913878a31071ec]
        at org.apache.pinot.core.periodictask.PeriodicTaskScheduler.lambda$start$0(PeriodicTaskScheduler.java:87) ~[startree-pinot-all-1.2.0-ST.10.1-jar-with-dependencies.jar:1.2.0-ST.10.1-8711a0aa760c824774f3ceb1a2913878a31071ec]
        at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:572) ~[?:?]
        at java.base/java.util.concurrent.FutureTask.runAndReset(FutureTask.java:358) ~[?:?]
        at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:305) ~[?:?]
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144) ~[?:?]
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642) ~[?:?]
        at java.base/java.lang.Thread.run(Thread.java:1583) [?:?]
Caused by: org.apache.pinot.spi.utils.retry.AttemptsExceededException: Operation failed after 5 attempts
        at org.apache.pinot.spi.utils.retry.BaseRetryPolicy.attempt(BaseRetryPolicy.java:65) ~[startree-pinot-all-1.2.0-ST.10.1-jar-with-dependencies.jar:1.2.0-ST.10.1-8711a0aa760c824774f3ceb1a2913878a31071ec]
        at org.apache.pinot.controller.helix.core.retention.RetentionManager.manageSegmentLineageCleanupForTable(RetentionManager.java:195) ~[startree-pinot-all-1.2.0-ST.10.1-jar-with-dependencies.jar:1.2.0-ST.10.1-8711a0aa760c824774f3ceb1a2913878a31071ec]
        ... 13 more

snleee commented 2 weeks ago

@aishikbh Can you try to tackle this?

aishikbh commented 2 weeks ago

Sure, I will take care of this. Thanks!