ashkapsky / BigDatalog


I am writing a pyspark.pandas dataframe to a Delta table in Azure Databricks with a compute cluster (4-7 workers, 256-448 GB memory, 128-224 cores; 1 driver with 32 GB memory, 16 cores; Runtime 11.3.x-scala2.12). The dataframe has around 20 million rows. #47

Open alanthom opened 8 months ago

alanthom commented 8 months ago
Facing the same error. I am writing a pyspark.pandas dataframe to Delta:

*** An error occurred while calling o11455.save. : org.apache.spark.SparkException: Job aborted. at org.apache.spark.sql.errors.QueryExecutionErrors$.jobAbortedError(QueryExecutionErrors.scala:882) at org.apache.spark.sql.execution.datasources.FileFormatWriter$.$anonfun$write$1(FileFormatWriter.scala:334) at com.databricks.spark.util.FrameProfiler$.record(FrameProfiler.scala:80) at org.apache.spark.sql.execution.datasources.FileFormatWriter$.write(FileFormatWriter.scala:154) at com.databricks.sql.transaction.tahoe.commands.WriteIntoDeltaCommand.run(WriteIntoDeltaCommand.scala:70) at com.databricks.sql.transaction.tahoe.files.TransactionalWriteEdge.$anonfun$writeFiles$11(TransactionalWriteEdge.scala:571) at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withCustomExecutionEnv$8(SQLExecution.scala:250) at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:400) at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withCustomExecutionEnv$1(SQLExecution.scala:195) at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:985) at org.apache.spark.sql.execution.SQLExecution$.withCustomExecutionEnv(SQLExecution.scala:149) at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:350) at com.databricks.sql.transaction.tahoe.files.TransactionalWriteEdge.$anonfun$writeFiles$1(TransactionalWriteEdge.scala:571) at com.databricks.sql.transaction.tahoe.metering.DeltaLogging.withOperationTypeTag(DeltaLogging.scala:193) at com.databricks.sql.transaction.tahoe.metering.DeltaLogging.withOperationTypeTag$(DeltaLogging.scala:180) at com.databricks.sql.transaction.tahoe.OptimisticTransaction.withOperationTypeTag(OptimisticTransaction.scala:112) at com.databricks.sql.transaction.tahoe.metering.DeltaLogging.$anonfun$recordDeltaOperationInternal$2(DeltaLogging.scala:157) at com.databricks.spark.util.FrameProfiler$.record(FrameProfiler.scala:80) at com.databricks.sql.transaction.tahoe.metering.DeltaLogging.recordFrameProfile(DeltaLogging.scala:262) at com.databricks.sql.transaction.tahoe.metering.DeltaLogging.recordFrameProfile$(DeltaLogging.scala:260) at com.databricks.sql.transaction.tahoe.OptimisticTransaction.recordFrameProfile(OptimisticTransaction.scala:112) at com.databricks.sql.transaction.tahoe.metering.DeltaLogging.$anonfun$recordDeltaOperationInternal$1(DeltaLogging.scala:156) at com.databricks.logging.UsageLogging.$anonfun$recordOperation$1(UsageLogging.scala:547) at com.databricks.logging.UsageLogging.executeThunkAndCaptureResultTags$1(UsageLogging.scala:642) at com.databricks.logging.UsageLogging.$anonfun$recordOperationWithResultTags$4(UsageLogging.scala:663) at com.databricks.logging.UsageLogging.$anonfun$withAttributionContext$1(UsageLogging.scala:404) at scala.util.DynamicVariable.withValue(DynamicVariable.scala:62) at com.databricks.logging.AttributionContext$.withValue(AttributionContext.scala:147) at com.databricks.logging.UsageLogging.withAttributionContext(UsageLogging.scala:402) at com.databricks.logging.UsageLogging.withAttributionContext$(UsageLogging.scala:399) at com.databricks.spark.util.PublicDBLogging.withAttributionContext(DatabricksSparkUsageLogger.scala:26) at com.databricks.logging.UsageLogging.withAttributionTags(UsageLogging.scala:447) at com.databricks.logging.UsageLogging.withAttributionTags$(UsageLogging.scala:432) at com.databricks.spark.util.PublicDBLogging.withAttributionTags(DatabricksSparkUsageLogger.scala:26) at 
com.databricks.logging.UsageLogging.recordOperationWithResultTags(UsageLogging.scala:637) at com.databricks.logging.UsageLogging.recordOperationWithResultTags$(UsageLogging.scala:556) at com.databricks.spark.util.PublicDBLogging.recordOperationWithResultTags(DatabricksSparkUsageLogger.scala:26) at com.databricks.logging.UsageLogging.recordOperation(UsageLogging.scala:547) at com.databricks.logging.UsageLogging.recordOperation$(UsageLogging.scala:517) at com.databricks.spark.util.PublicDBLogging.recordOperation(DatabricksSparkUsageLogger.scala:26) at com.databricks.spark.util.PublicDBLogging.recordOperation0(DatabricksSparkUsageLogger.scala:66) at com.databricks.spark.util.DatabricksSparkUsageLogger.recordOperation(DatabricksSparkUsageLogger.scala:148) at com.databricks.spark.util.UsageLogger.recordOperation(UsageLogger.scala:72) at com.databricks.spark.util.UsageLogger.recordOperation$(UsageLogger.scala:59) at com.databricks.spark.util.DatabricksSparkUsageLogger.recordOperation(DatabricksSparkUsageLogger.scala:107) at com.databricks.spark.util.UsageLogging.recordOperation(UsageLogger.scala:433) at com.databricks.spark.util.UsageLogging.recordOperation$(UsageLogger.scala:412) at com.databricks.sql.transaction.tahoe.OptimisticTransaction.recordOperation(OptimisticTransaction.scala:112) at com.databricks.sql.transaction.tahoe.metering.DeltaLogging.recordDeltaOperationInternal(DeltaLogging.scala:155) at com.databricks.sql.transaction.tahoe.metering.DeltaLogging.recordDeltaOperation(DeltaLogging.scala:145) at com.databricks.sql.transaction.tahoe.metering.DeltaLogging.recordDeltaOperation$(DeltaLogging.scala:133) at com.databricks.sql.transaction.tahoe.OptimisticTransaction.recordDeltaOperation(OptimisticTransaction.scala:112) at com.databricks.sql.transaction.tahoe.files.TransactionalWriteEdge.$anonfun$recordWriteFilesOperation$1(TransactionalWriteEdge.scala:307) at com.databricks.sql.acl.CheckPermissions$.trusted(CheckPermissions.scala:1837) at com.databricks.sql.transaction.tahoe.files.TransactionalWriteEdge.recordWriteFilesOperation(TransactionalWriteEdge.scala:306) at com.databricks.sql.transaction.tahoe.files.TransactionalWriteEdge.writeFiles(TransactionalWriteEdge.scala:339) at com.databricks.sql.transaction.tahoe.files.TransactionalWriteEdge.writeFiles$(TransactionalWriteEdge.scala:333) at com.databricks.sql.transaction.tahoe.OptimisticTransaction.writeFiles(OptimisticTransaction.scala:112) at com.databricks.sql.transaction.tahoe.files.TransactionalWriteEdge.writeFiles(TransactionalWriteEdge.scala:621) at com.databricks.sql.transaction.tahoe.files.TransactionalWriteEdge.writeFiles$(TransactionalWriteEdge.scala:611) at com.databricks.sql.transaction.tahoe.OptimisticTransaction.writeFiles(OptimisticTransaction.scala:112) at com.databricks.sql.transaction.tahoe.files.TransactionalWrite.writeFiles(TransactionalWrite.scala:226) at com.databricks.sql.transaction.tahoe.files.TransactionalWrite.writeFiles$(TransactionalWrite.scala:223) at com.databricks.sql.transaction.tahoe.OptimisticTransaction.writeFiles(OptimisticTransaction.scala:112) at com.databricks.sql.transaction.tahoe.commands.WriteIntoDelta.write(WriteIntoDelta.scala:368) at com.databricks.sql.transaction.tahoe.commands.WriteIntoDelta.$anonfun$run$2(WriteIntoDelta.scala:111) at com.databricks.sql.transaction.tahoe.commands.WriteIntoDelta.$anonfun$run$2$adapted(WriteIntoDelta.scala:100) at com.databricks.sql.transaction.tahoe.DeltaLog.withNewTransaction(DeltaLog.scala:312) at 
com.databricks.sql.transaction.tahoe.commands.WriteIntoDelta.$anonfun$run$1(WriteIntoDelta.scala:100) at com.databricks.sql.acl.CheckPermissions$.trusted(CheckPermissions.scala:1837) at com.databricks.sql.transaction.tahoe.commands.WriteIntoDelta.run(WriteIntoDelta.scala:99) at com.databricks.sql.transaction.tahoe.sources.DeltaDataSource.createRelation(DeltaDataSource.scala:168) at org.apache.spark.sql.execution.datasources.SaveIntoDataSourceCommand.run(SaveIntoDataSourceCommand.scala:49) at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:80) at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:78) at org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:89) at org.apache.spark.sql.execution.QueryExecution$$anonfun$$nestedInanonfun$eagerlyExecuteCommands$1$1.$anonfun$applyOrElse$2(QueryExecution.scala:247) at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withCustomExecutionEnv$8(SQLExecution.scala:250) at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:400) at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withCustomExecutionEnv$1(SQLExecution.scala:195) at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:985) at org.apache.spark.sql.execution.SQLExecution$.withCustomExecutionEnv(SQLExecution.scala:149) at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:350) at org.apache.spark.sql.execution.QueryExecution$$anonfun$$nestedInanonfun$eagerlyExecuteCommands$1$1.$anonfun$applyOrElse$1(QueryExecution.scala:247) at org.apache.spark.sql.execution.QueryExecution.org$apache$spark$sql$execution$QueryExecution$$withMVTagsIfNecessary(QueryExecution.scala:232) at org.apache.spark.sql.execution.QueryExecution$$anonfun$$nestedInanonfun$eagerlyExecuteCommands$1$1.applyOrElse(QueryExecution.scala:245) at org.apache.spark.sql.execution.QueryExecution$$anonfun$$nestedInanonfun$eagerlyExecuteCommands$1$1.applyOrElse(QueryExecution.scala:238) at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$transformDownWithPruning$1(TreeNode.scala:512) at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:99) at org.apache.spark.sql.catalyst.trees.TreeNode.transformDownWithPruning(TreeNode.scala:512) at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.org$apache$spark$sql$catalyst$plans$logical$AnalysisHelper$$super$transformDownWithPruning(LogicalPlan.scala:31) at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.transformDownWithPruning(AnalysisHelper.scala:298) at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.transformDownWithPruning$(AnalysisHelper.scala:294) at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.transformDownWithPruning(LogicalPlan.scala:31) at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.transformDownWithPruning(LogicalPlan.scala:31) at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:488) at org.apache.spark.sql.execution.QueryExecution.$anonfun$eagerlyExecuteCommands$1(QueryExecution.scala:238) at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$.allowInvokingTransformsInAnalyzer(AnalysisHelper.scala:354) at org.apache.spark.sql.execution.QueryExecution.eagerlyExecuteCommands(QueryExecution.scala:238) at org.apache.spark.sql.execution.QueryExecution.commandExecuted$lzycompute(QueryExecution.scala:192) at 
org.apache.spark.sql.execution.QueryExecution.commandExecuted(QueryExecution.scala:183) at org.apache.spark.sql.execution.QueryExecution.assertCommandExecuted(QueryExecution.scala:274) at org.apache.spark.sql.DataFrameWriter.runCommand(DataFrameWriter.scala:965) at org.apache.spark.sql.DataFrameWriter.saveToV1Source(DataFrameWriter.scala:430) at org.apache.spark.sql.DataFrameWriter.saveInternal(DataFrameWriter.scala:339) at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:251) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244) at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:380) at py4j.Gateway.invoke(Gateway.java:306) at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132) at py4j.commands.CallCommand.execute(CallCommand.java:79) at py4j.ClientServerConnection.waitForCommands(ClientServerConnection.java:195) at py4j.ClientServerConnection.run(ClientServerConnection.java:115) at java.lang.Thread.run(Thread.java:750) Caused by: org.apache.spark.SparkException: Job aborted due to stage failure: Task 43 in stage 360.0 failed 4 times, most recent failure: Lost task 43.3 in stage 360.0 (TID 9912) (172.16.177.134 executor 10): org.apache.spark.SparkException: Checkpoint block rdd_1208_8 not found! Either the executor that originally checkpointed this partition is no longer alive, or the original RDD is unpersisted. If this problem persists, you may consider using rdd.checkpoint() instead, which is slower than local checkpointing but more fault-tolerant.
at org.apache.spark.errors.SparkCoreErrors$.checkpointRDDBlockIdNotFoundError(SparkCoreErrors.scala:82) at org.apache.spark.rdd.LocalCheckpointRDD.compute(LocalCheckpointRDD.scala:61) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:410) at org.apache.spark.rdd.RDD.iterator(RDD.scala:374) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:408) at org.apache.spark.rdd.RDD.$anonfun$getOrCompute$1(RDD.scala:423) at org.apache.spark.storage.BlockManager.$anonfun$doPutIterator$1(BlockManager.scala:1559) at org.apache.spark.storage.BlockManager.org$apache$spark$storage$BlockManager$$doPut(BlockManager.scala:1486) at org.apache.spark.storage.BlockManager.doPutIterator(BlockManager.scala:1550) at org.apache.spark.storage.BlockManager.getOrElseUpdate(BlockManager.scala:1369) at org.apache.spark.rdd.RDD.getOrCompute(RDD.scala:421) at org.apache.spark.rdd.RDD.iterator(RDD.scala:372) at org.apache.spark.rdd.ZippedWithIndexRDD.compute(ZippedWithIndexRDD.scala:71) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:410) at org.apache.spark.rdd.RDD.iterator(RDD.scala:374) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:60) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:410) at org.apache.spark.rdd.RDD.iterator(RDD.scala:374) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:60) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:410) at org.apache.spark.rdd.RDD.iterator(RDD.scala:374) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:60) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:410) at org.apache.spark.rdd.RDD.iterator(RDD.scala:374) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:60) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:410) at org.apache.spark.rdd.RDD.iterator(RDD.scala:374) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:60) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:410) at org.apache.spark.rdd.RDD.iterator(RDD.scala:374) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:60) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:410) at org.apache.spark.rdd.RDD.iterator(RDD.scala:374) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:60) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:410) at org.apache.spark.rdd.RDD.iterator(RDD.scala:374) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:60) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:410) at org.apache.spark.rdd.RDD.iterator(RDD.scala:374) at org.apache.spark.rdd.UnionRDD.compute(UnionRDD.scala:122) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:410) at org.apache.spark.rdd.RDD.iterator(RDD.scala:374) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:60) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:410) at org.apache.spark.rdd.RDD.iterator(RDD.scala:374) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:60) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:410) at org.apache.spark.rdd.RDD.iterator(RDD.scala:374) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:60) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:410) at org.apache.spark.rdd.RDD.iterator(RDD.scala:374) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:60) at 
org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:410) at org.apache.spark.rdd.RDD.iterator(RDD.scala:374) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:60) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:410) at org.apache.spark.rdd.RDD.iterator(RDD.scala:374) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:60) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:410) at org.apache.spark.rdd.RDD.iterator(RDD.scala:374) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:60) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:410) at org.apache.spark.rdd.RDD.iterator(RDD.scala:374) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:60) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:410) at org.apache.spark.rdd.RDD.iterator(RDD.scala:374) at org.apache.spark.shuffle.ShuffleWriteProcessor.write(ShuffleWriteProcessor.scala:59) at org.apache.spark.scheduler.ShuffleMapTask.$anonfun$runTask$3(ShuffleMapTask.scala:81) at com.databricks.spark.util.ExecutorFrameProfiler$.record(ExecutorFrameProfiler.scala:110) at org.apache.spark.scheduler.ShuffleMapTask.$anonfun$runTask$1(ShuffleMapTask.scala:81) at com.databricks.spark.util.ExecutorFrameProfiler$.record(ExecutorFrameProfiler.scala:110) at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:53) at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41) at org.apache.spark.scheduler.Task.doRunTask(Task.scala:174) at org.apache.spark.scheduler.Task.$anonfun$run$4(Task.scala:137) at com.databricks.unity.EmptyHandle$.runWithAndClose(UCSHandle.scala:126) at org.apache.spark.scheduler.Task.$anonfun$run$1(Task.scala:137) at com.databricks.spark.util.ExecutorFrameProfiler$.record(ExecutorFrameProfiler.scala:110) at org.apache.spark.scheduler.Task.run(Task.scala:96) at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$13(Executor.scala:902) at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1697) at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$4(Executor.scala:905) at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23) at com.databricks.spark.util.ExecutorFrameProfiler$.record(ExecutorFrameProfiler.scala:110) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:760) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:750)

Driver stacktrace: at org.apache.spark.scheduler.DAGScheduler.failJobAndIndependentStages(DAGScheduler.scala:3414) at org.apache.spark.scheduler.DAGScheduler.$anonfun$abortStage$2(DAGScheduler.scala:3336) at org.apache.spark.scheduler.DAGScheduler.$anonfun$abortStage$2$adapted(DAGScheduler.scala:3325) at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62) at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55) at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49) at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:3325) at org.apache.spark.scheduler.DAGScheduler.$anonfun$handleTaskSetFailed$1(DAGScheduler.scala:1433) at org.apache.spark.scheduler.DAGScheduler.$anonfun$handleTaskSetFailed$1$adapted(DAGScheduler.scala:1433) at scala.Option.foreach(Option.scala:407) at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:1433) at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:3625) at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:3563) at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:3551) at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:51) Caused by: org.apache.spark.SparkException: Checkpoint block rdd_1208_8 not found! Either the executor that originally checkpointed this partition is no longer alive, or the original RDD is unpersisted. If this problem persists, you may consider using rdd.checkpoint() instead, which is slower than local checkpointing but more fault-tolerant.
at org.apache.spark.errors.SparkCoreErrors$.checkpointRDDBlockIdNotFoundError(SparkCoreErrors.scala:82) at org.apache.spark.rdd.LocalCheckpointRDD.compute(LocalCheckpointRDD.scala:61) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:410) at org.apache.spark.rdd.RDD.iterator(RDD.scala:374) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:408) at org.apache.spark.rdd.RDD.$anonfun$getOrCompute$1(RDD.scala:423) at org.apache.spark.storage.BlockManager.$anonfun$doPutIterator$1(BlockManager.scala:1559) at org.apache.spark.storage.BlockManager.org$apache$spark$storage$BlockManager$$doPut(BlockManager.scala:1486) at org.apache.spark.storage.BlockManager.doPutIterator(BlockManager.scala:1550) at org.apache.spark.storage.BlockManager.getOrElseUpdate(BlockManager.scala:1369) at org.apache.spark.rdd.RDD.getOrCompute(RDD.scala:421) at org.apache.spark.rdd.RDD.iterator(RDD.scala:372) at org.apache.spark.rdd.ZippedWithIndexRDD.compute(ZippedWithIndexRDD.scala:71) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:410) at org.apache.spark.rdd.RDD.iterator(RDD.scala:374) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:60) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:410) at org.apache.spark.rdd.RDD.iterator(RDD.scala:374) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:60) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:410) at org.apache.spark.rdd.RDD.iterator(RDD.scala:374) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:60) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:410) at org.apache.spark.rdd.RDD.iterator(RDD.scala:374) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:60) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:410) at org.apache.spark.rdd.RDD.iterator(RDD.scala:374) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:60) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:410) at org.apache.spark.rdd.RDD.iterator(RDD.scala:374) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:60) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:410) at org.apache.spark.rdd.RDD.iterator(RDD.scala:374) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:60) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:410) at org.apache.spark.rdd.RDD.iterator(RDD.scala:374) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:60) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:410) at org.apache.spark.rdd.RDD.iterator(RDD.scala:374) at org.apache.spark.rdd.UnionRDD.compute(UnionRDD.scala:122) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:410) at org.apache.spark.rdd.RDD.iterator(RDD.scala:374) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:60) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:410) at org.apache.spark.rdd.RDD.iterator(RDD.scala:374) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:60) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:410) at org.apache.spark.rdd.RDD.iterator(RDD.scala:374) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:60) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:410) at org.apache.spark.rdd.RDD.iterator(RDD.scala:374) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:60) at 
org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:410) at org.apache.spark.rdd.RDD.iterator(RDD.scala:374) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:60) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:410) at org.apache.spark.rdd.RDD.iterator(RDD.scala:374) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:60) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:410) at org.apache.spark.rdd.RDD.iterator(RDD.scala:374) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:60) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:410) at org.apache.spark.rdd.RDD.iterator(RDD.scala:374) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:60) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:410) at org.apache.spark.rdd.RDD.iterator(RDD.scala:374) at org.apache.spark.shuffle.ShuffleWriteProcessor.write(ShuffleWriteProcessor.scala:59) at org.apache.spark.scheduler.ShuffleMapTask.$anonfun$runTask$3(ShuffleMapTask.scala:81) at com.data

WARNING: max output size exceeded, skipping output.


DataFrame shape: (191365764, 5)

Originally posted by @alanthom in https://github.com/ashkapsky/BigDatalog/issues/9#issuecomment-2013456790
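
The write call itself is not included in the report above. Below is a minimal sketch of the kind of write being described, assuming a pandas-on-Spark DataFrame and a placeholder Delta path (the names, path, and stand-in data are illustrative, not taken from the original post):

```python
import pyspark.pandas as ps

# Stand-in data; the frame in the report has roughly 191 million rows and 5 columns.
psdf = ps.DataFrame({"a": range(1000), "b": range(1000)})

# Write the pandas-on-Spark DataFrame to a Delta table (placeholder path).
psdf.to_delta("/tmp/example_delta_table", mode="overwrite")

# Equivalent route through the Spark DataFrame writer API:
(psdf.to_spark()
     .write.format("delta")
     .mode("overwrite")
     .save("/tmp/example_delta_table"))
```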

alanthom commented 8 months ago

An occasional run succeeds, but randomly it fails with the error above.
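
The failure surfaced in the trace is about Spark local checkpointing: the partition behind `LocalCheckpointRDD` (block `rdd_1208_8`) is missing, which, per the error message, happens when the executor that checkpointed it is no longer alive or the RDD was unpersisted; on an autoscaling cluster (4-7 workers here) losing an executor is routine. Following the suggestion in the message itself, one possible workaround is to use reliable, storage-backed checkpointing before the write. A minimal sketch with placeholder paths and stand-in data (not from the original report):

```python
import pyspark.pandas as ps
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()  # already provided in Databricks notebooks

# Placeholder locations; adjust to the workspace.
CHECKPOINT_DIR = "dbfs:/tmp/checkpoints"
TARGET_PATH = "dbfs:/tmp/example_delta_table"

# Reliable, storage-backed checkpoints survive the loss of the executor that
# wrote them, unlike the local checkpoint in the trace above.
spark.sparkContext.setCheckpointDir(CHECKPOINT_DIR)

psdf = ps.DataFrame({"a": range(1000), "b": range(1000)})  # stand-in data

# Materialize as a plain Spark DataFrame, checkpoint it to the reliable
# location, then write to Delta.
sdf = psdf.to_spark().checkpoint(eager=True)
sdf.write.format("delta").mode("overwrite").save(TARGET_PATH)
```

Switching the pandas-on-Spark default index with `ps.set_option("compute.default_index_type", "distributed")` may also avoid the failing path, since the `ZippedWithIndexRDD` in the trace suggests the default distributed-sequence index is involved; that is an assumption about the cause, not something confirmed in the report.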