Here is the part of the output with the first error message (see the attachment for the complete log):
17/10/10 16:35:57 INFO Recursion: Fixed Point Iteration # 2, time: 9170ms
17/10/10 16:35:57 INFO DAGScheduler: Submitting FixedPointResultStage 3 (SetRDD.diffRDD SetRDD[32] at RDD at SetRDD.scala:29), which has no missing parents
17/10/10 16:35:57 INFO MemoryStore: Block broadcast_5 stored as values in memory (estimated size 16.9 KB, free 510.0 MB)
17/10/10 16:35:57 INFO MemoryStore: Block broadcast_5_piece0 stored as bytes in memory (estimated size 8.7 KB, free 510.1 MB)
17/10/10 16:35:57 INFO BlockManagerInfo: Added broadcast_5_piece0 in memory on localhost:43953 (size: 8.7 KB, free: 1135.5 KB)
17/10/10 16:35:57 INFO SparkContext: Created broadcast 5 from broadcast at DAGScheduler.scala:1096
17/10/10 16:35:57 INFO DAGScheduler: Submitting 200 missing tasks from FixedPointResultStage 3 (SetRDD.diffRDD SetRDD[32] at RDD at SetRDD.scala:29)
17/10/10 16:35:57 INFO TaskSchedulerImpl: Adding task set 3.0 with 200 tasks
17/10/10 16:35:57 INFO TaskSetManager: Starting task 121.0 in stage 3.0 (TID 256, localhost, partition 121,PROCESS_LOCAL, 2343 bytes)
17/10/10 16:35:57 INFO TaskSetManager: Starting task 123.0 in stage 3.0 (TID 257, localhost, partition 123,PROCESS_LOCAL, 2343 bytes)
17/10/10 16:35:57 INFO TaskSetManager: Starting task 124.0 in stage 3.0 (TID 258, localhost, partition 124,PROCESS_LOCAL, 2343 bytes)
17/10/10 16:35:57 INFO TaskSetManager: Starting task 125.0 in stage 3.0 (TID 259, localhost, partition 125,PROCESS_LOCAL, 2343 bytes)
17/10/10 16:35:57 INFO Executor: Running task 121.0 in stage 3.0 (TID 256)
17/10/10 16:35:57 INFO Executor: Running task 123.0 in stage 3.0 (TID 257)
17/10/10 16:35:57 INFO Executor: Running task 124.0 in stage 3.0 (TID 258)
17/10/10 16:35:57 INFO Executor: Running task 125.0 in stage 3.0 (TID 259)
17/10/10 16:35:57 INFO CacheManager: Partition rdd_31_123 not found, computing it
17/10/10 16:35:57 INFO CacheManager: Partition rdd_31_125 not found, computing it
17/10/10 16:35:57 INFO CacheManager: Partition rdd_31_121 not found, computing it
17/10/10 16:35:57 INFO CacheManager: Partition rdd_27_123 not found, computing it
17/10/10 16:35:57 INFO CacheManager: Partition rdd_27_121 not found, computing it
17/10/10 16:35:57 INFO BlockManager: Found block rdd_17_123 locally
17/10/10 16:35:57 INFO BlockManager: Found block rdd_21_123 locally
17/10/10 16:35:57 INFO CacheManager: Partition rdd_17_121 not found, computing it
17/10/10 16:35:57 INFO SetRDDHashSetPartition: Union set size 0 for rdd 18 took 0 ms
17/10/10 16:35:57 ERROR Executor: Exception in task 121.0 in stage 3.0 (TID 256)
org.apache.spark.SparkException: Checkpoint block rdd_17_121 not found! Either the executor that originally checkpointed this partition is no longer alive, or the original RDD is unpersisted. If this problem persists, you may consider using `rdd.checkpoint()` or `rdd.localcheckpoint()` instead, which are slower than memory checkpointing but more fault-tolerant.
at org.apache.spark.rdd.MemoryCheckpointRDD.compute(MemoryCheckpointRDD.scala:43)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:304)
at org.apache.spark.CacheManager.getOrCompute(CacheManager.scala:69)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:268)
at org.apache.spark.rdd.ZippedPartitionsRDD2.compute(ZippedPartitionsRDD.scala:88)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
at org.apache.spark.CacheManager.getOrCompute(CacheManager.scala:69)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:268)
at org.apache.spark.rdd.ZippedPartitionsRDD2.compute(ZippedPartitionsRDD.scala:88)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
at org.apache.spark.CacheManager.getOrCompute(CacheManager.scala:69)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:268)
at edu.ucla.cs.wis.bigdatalog.spark.execution.setrdd.SetRDD.compute(SetRDD.scala:108)
at edu.ucla.cs.wis.bigdatalog.spark.execution.setrdd.SetRDD.computeOrReadCheckpoint(SetRDD.scala:104)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
at org.apache.spark.scheduler.fixedpoint.FixedPointResultTask.runTask(FixedPointResultTask.scala:54)
at org.apache.spark.scheduler.Task.run(Task.scala:89)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
17/10/10 16:35:57 INFO CacheManager: Partition rdd_27_125 not found, computing it
17/10/10 16:35:57 INFO BlockManager: Found block rdd_17_125 locally
17/10/10 16:35:57 INFO BlockManager: Found block rdd_21_125 locally
17/10/10 16:35:57 INFO SetRDDHashSetPartition: Union set size 0 for rdd 18 took 0 ms
17/10/10 16:35:57 INFO MemoryStore: 1 blocks selected for dropping
17/10/10 16:35:57 INFO BlockManager: Dropping block rdd_17_124 from memory
17/10/10 16:35:57 INFO BlockManagerInfo: Removed rdd_17_124 on localhost:43953 in memory (size: 1701.1 KB, free: 2.8 MB)
17/10/10 16:35:57 INFO MemoryStore: 1 blocks selected for dropping
17/10/10 16:35:57 INFO BlockManager: Dropping block rdd_11_125 from memory
17/10/10 16:35:57 INFO BlockManagerInfo: Removed rdd_11_125 on localhost:43953 in memory (size: 1701.1 KB, free: 4.4 MB)
17/10/10 16:35:57 INFO MemoryStore: 1 blocks selected for dropping
17/10/10 16:35:57 INFO BlockManager: Dropping block rdd_15_124 from memory
17/10/10 16:35:57 INFO BlockManagerInfo: Removed rdd_15_124 on localhost:43953 in memory (size: 1701.1 KB, free: 6.1 MB)
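The error message itself suggests replacing memory checkpointing with reliable (or local) checkpointing. A minimal sketch of what that would look like in a plain Spark driver — this assumes a stock RDD and a hypothetical checkpoint directory `/tmp/spark-checkpoints`; whether BigDatalog's `SetRDD` fixpoint machinery honors these calls is an open question:

```scala
import org.apache.spark.{SparkConf, SparkContext}

object CheckpointSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("checkpoint-sketch").setMaster("local[2]")
    val sc = new SparkContext(conf)

    // Reliable checkpointing writes partitions to a fault-tolerant store,
    // so a lost block can be recomputed from the checkpoint file rather
    // than failing with "Checkpoint block ... not found".
    sc.setCheckpointDir("/tmp/spark-checkpoints")

    val rdd = sc.parallelize(1 to 100).map(_ * 2)
    rdd.checkpoint()         // slower than memory checkpointing, but fault-tolerant
    // rdd.localCheckpoint() // alternative: truncates lineage using executor-local storage

    rdd.count()              // an action is needed to actually materialize the checkpoint
    sc.stop()
  }
}
```

Note that `checkpoint()`/`localCheckpoint()` must be called before the first action on the RDD, since the checkpoint is materialized during job execution.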
I tried to use recursion, but it fails with a lot of error messages. (See #3 for more details on how I run the program: the content of arcs, the content of bigdatalog.deal, the command, and the error message.)
Attachment: bigdatalog.log.zip