Closed februaryfang closed 1 year ago
Dear Fang,
Thank you for the questions. Our project didn't use a customized GATK-PathSeq database, so I'm sorry to say that I can't help with this one. For technical questions about GATK-PathSeq, I suggest contacting the GATK-PathSeq team. Good luck with your analysis!
Best regards, Hanrui
Dear Hanrui,
PathSeqPipelineSpark bundles several steps, so I ran them one at a time, hoping to find out why there are no results. After running PathSeqFilterSpark, I obtained a metrics file with the following contents.
PRIMARY_READS: 2196465
READS_AFTER_PREALIGNED_HOST_FILTER: 2196465
READS_AFTER_QUALITY_AND_COMPLEXITY_FILTER: 0
READS_AFTER_HOST_FILTER: 0
READS_AFTER_DEDUPLICATION: 0
FINAL_PAIRED_READS: 0
FINAL_UNPAIRED_READS: 0
FINAL_TOTAL_READS: 0
LOW_QUALITY_OR_LOW_COMPLEXITY_READS_FILTERED: 2196465
HOST_READS_FILTERED: 0
DUPLICATE_READS_FILTERED: 0
Why are all reads counted as LOW_QUALITY_OR_LOW_COMPLEXITY_READS_FILTERED? The above results are from the example script 'patient_samples_16s_pipeline.sh'. Is the threshold in that script the actual threshold used for the published data?
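To make it easier to see where the reads disappear, the metrics above can be checked with a few lines of Python. This is only a minimal sketch: the helper names `parse_filter_metrics` and `first_empty_stage` are made up for illustration, and it assumes the header row and the value row have already been pulled out of the metrics file as two whitespace-separated strings (any comment lines in the real file would need to be skipped first).

```python
# Header and value rows copied from the PathSeqFilterSpark metrics quoted above.
HEADER = (
    "PRIMARY_READS READS_AFTER_PREALIGNED_HOST_FILTER "
    "READS_AFTER_QUALITY_AND_COMPLEXITY_FILTER READS_AFTER_HOST_FILTER "
    "READS_AFTER_DEDUPLICATION FINAL_PAIRED_READS FINAL_UNPAIRED_READS "
    "FINAL_TOTAL_READS LOW_QUALITY_OR_LOW_COMPLEXITY_READS_FILTERED "
    "HOST_READS_FILTERED DUPLICATE_READS_FILTERED"
)
VALUES = "2196465 2196465 0 0 0 0 0 0 2196465 0 0"

# Order in which the READS_AFTER_* counters appear in the header row.
STAGES = [
    "READS_AFTER_PREALIGNED_HOST_FILTER",
    "READS_AFTER_QUALITY_AND_COMPLEXITY_FILTER",
    "READS_AFTER_HOST_FILTER",
    "READS_AFTER_DEDUPLICATION",
]


def parse_filter_metrics(header_line, value_line):
    """Pair each metric name with its integer value (hypothetical helper)."""
    names = header_line.split()
    values = [int(v) for v in value_line.split()]
    if len(names) != len(values):
        raise ValueError("header and value rows have different lengths")
    return dict(zip(names, values))


def first_empty_stage(metrics):
    """Return the first READS_AFTER_* counter that is zero, or None."""
    for stage in STAGES:
        if metrics[stage] == 0:
            return stage
    return None


if __name__ == "__main__":
    metrics = parse_filter_metrics(HEADER, VALUES)
    lost = metrics["LOW_QUALITY_OR_LOW_COMPLEXITY_READS_FILTERED"]
    print("first empty stage:", first_empty_stage(metrics))
    print("fraction removed as low quality/complexity:",
          lost / metrics["PRIMARY_READS"])
```

For the numbers in this thread, the first counter that drops to zero is the quality and complexity filter, which matches the question above: every primary read is removed at that step, before host filtering or deduplication has anything to work on.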
Dear Fang,
Have you tried the database provided by GATK-PathSeq? I'm not familiar with custom databases, and I haven't run PathSeqFilterSpark separately myself. But based on my understanding, a prebuilt database might help with debugging.
Best regards, Hanrui
Hello, I followed Visium_pipeline.sh in Part 1 to analyze the CRC_16 10x Visium spatial transcriptomic data. I did not download the database from the official GATK website; instead, I prepared the database myself following the tutorial [https://gatk.broadinstitute.org/hc/en-us/articles/360035889911--How-to-Run-the-Pathseq-pipeline]. The analysis produces no results, and I don't know why.
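A run log like the one that follows is mostly Spark INFO output, so it can help to pull out just the WARN and ERROR entries first. Below is a minimal Python sketch for that, assuming the log has been saved as text with one entry per line; the function name `triage_log` and the regular expression are illustrative and only cover the two timestamp styles that appear in this log.

```python
import re
from collections import Counter

# Two entry styles appear in a GATK-on-Spark log:
#   "13:19:32.136 WARN PathSeqPipelineSpark - message"   (GATK)
#   "23/05/23 13:19:29 WARN NativeCodeLoader: message"   (Spark)
ENTRY = re.compile(
    r"^(?:\d{2}/\d{2}/\d{2} )?\d{2}:\d{2}:\d{2}(?:\.\d+)? "
    r"(?P<level>INFO|WARN|ERROR) (?P<source>[\w.$]+)(?: -|:) ?(?P<message>.*)$"
)


def triage_log(lines):
    """Count entries per level and collect every non-INFO entry."""
    counts = Counter()
    flagged = []
    for line in lines:
        match = ENTRY.match(line.strip())
        if match is None:
            continue  # not a timestamped log entry (e.g. the command line)
        counts[match["level"]] += 1
        if match["level"] != "INFO":
            flagged.append((match["level"], match["source"], match["message"]))
    return counts, flagged


if __name__ == "__main__":
    # Three entries copied from the log in this post.
    sample = [
        "23/05/23 13:19:29 INFO SparkContext: Running Spark version 2.4.5",
        "23/05/23 13:19:29 WARN NativeCodeLoader: Unable to load native-hadoop "
        "library for your platform... using builtin-java classes where applicable",
        "13:19:32.136 WARN PathSeqPipelineSpark - --is-host-aligned is false "
        "but there are one or more sequences in the BAM header",
    ]
    counts, flagged = triage_log(sample)
    print(dict(counts))
    for level, source, message in flagged:
        print(level, source, message)
```

On this log, the entries worth reading first are the two warnings: the native-hadoop notice from Spark and the PathSeqPipelineSpark warning about `--is-host-aligned` being false while the BAM header contains sequences.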
Using GATK jar /mnt/icfs/work/singlecelldevelopment/software/gatk-4.3.0.0/gatk-package-4.3.0.0-local.jar
Running:
java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -Xmx750g -jar /mnt/icfs/work/singlecelldevelopment/software/gatk-4.3.0.0/gatk-package-4.3.0.0-local.jar PathSeqPipelineSpark --input CRC_16/outs/possorted_genome_bam.bam --filter-bwa-image hsa_GRCh38/genome.fa.img --kmer-file hsa_GRCh38/genome.hss --min-clipped-read-length 60 --microbe-dict 16SrRNA/bacteria.16SrRNA.dict --microbe-bwa-image 16SrRNA/bacteria.16SrRNA.fa.img --taxonomy-file 16SrRNA/16SrRNA.db --output pathseq/CRC_16.pathseq.complete.bam --scores-output pathseq/CRC_16.pathseq.complete.csv --is-host-aligned false --filter-duplicates false --min-score-identity .7 --tmp-dir pathseq/tmp
13:19:23.776 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/mnt/icfs/work/singlecelldevelopment/software/gatk-4.3.0.0/gatk-package-4.3.0.0-local.jar!/com/intel/gkl/native/libgkl_compression.so
13:19:28.982 INFO PathSeqPipelineSpark - ------------------------------------------------------------
13:19:28.982 INFO PathSeqPipelineSpark - The Genome Analysis Toolkit (GATK) v4.3.0.0
13:19:28.982 INFO PathSeqPipelineSpark - For support and documentation go to https://software.broadinstitute.org/gatk/
13:19:28.983 INFO PathSeqPipelineSpark - Executing as singlecellproject@d01.capitalbiotech.local on Linux v3.10.0-514.16.1.el7.x86_64 amd64
13:19:28.983 INFO PathSeqPipelineSpark - Java runtime: OpenJDK 64-Bit Server VM v1.8.0_151-b12
13:19:28.983 INFO PathSeqPipelineSpark - Start Date/Time: May 23, 2023 1:19:23 PM CST
13:19:28.983 INFO PathSeqPipelineSpark - ------------------------------------------------------------
13:19:28.983 INFO PathSeqPipelineSpark - ------------------------------------------------------------
13:19:28.984 INFO PathSeqPipelineSpark - HTSJDK Version: 3.0.1
13:19:28.984 INFO PathSeqPipelineSpark - Picard Version: 2.27.5
13:19:28.984 INFO PathSeqPipelineSpark - Built for Spark Version: 2.4.5
13:19:28.984 INFO PathSeqPipelineSpark - HTSJDK Defaults.COMPRESSION_LEVEL : 2
13:19:28.984 INFO PathSeqPipelineSpark - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
13:19:28.984 INFO PathSeqPipelineSpark - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
13:19:28.984 INFO PathSeqPipelineSpark - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
13:19:28.985 INFO PathSeqPipelineSpark - Deflater: IntelDeflater
13:19:28.985 INFO PathSeqPipelineSpark - Inflater: IntelInflater
13:19:28.985 INFO PathSeqPipelineSpark - GCS max retries/reopens: 20
13:19:28.985 INFO PathSeqPipelineSpark - Requester pays: disabled
13:19:28.985 INFO PathSeqPipelineSpark - Initializing engine
13:19:28.985 INFO PathSeqPipelineSpark - Done initializing engine
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
23/05/23 13:19:29 INFO SparkContext: Running Spark version 2.4.5
23/05/23 13:19:29 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
23/05/23 13:19:29 INFO SparkContext: Submitted application: PathSeqPipelineSpark
23/05/23 13:19:29 INFO SecurityManager: Changing view acls to: singlecellproject
23/05/23 13:19:29 INFO SecurityManager: Changing modify acls to: singlecellproject
23/05/23 13:19:29 INFO SecurityManager: Changing view acls groups to:
23/05/23 13:19:29 INFO SecurityManager: Changing modify acls groups to:
23/05/23 13:19:29 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(singlecellproject); groups with view permissions: Set(); users with modify permissions: Set(singlecellproject); groups with modify permissions: Set()
23/05/23 13:19:29 INFO Utils: Successfully started service 'sparkDriver' on port 40471.
23/05/23 13:19:29 INFO SparkEnv: Registering MapOutputTracker
23/05/23 13:19:29 INFO SparkEnv: Registering BlockManagerMaster
23/05/23 13:19:29 INFO BlockManagerMasterEndpoint: Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
23/05/23 13:19:29 INFO BlockManagerMasterEndpoint: BlockManagerMasterEndpoint up
23/05/23 13:19:29 INFO DiskBlockManager: Created local directory at pathseq/tmp/blockmgr-11fec4b1-0808-4f7e-9ab9-a87799853aee
23/05/23 13:19:29 INFO MemoryStore: MemoryStore started with capacity 399.8 GB
23/05/23 13:19:29 INFO SparkEnv: Registering OutputCommitCoordinator
23/05/23 13:19:30 INFO Utils: Successfully started service 'SparkUI' on port 4040.
23/05/23 13:19:30 INFO SparkUI: Bound SparkUI to 0.0.0.0, and started at http://d01.capitalbiotech.local:4040
23/05/23 13:19:30 INFO Executor: Starting executor ID driver on host localhost
23/05/23 13:19:30 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 41352.
23/05/23 13:19:30 INFO NettyBlockTransferService: Server created on d01.capitalbiotech.local:41352
23/05/23 13:19:30 INFO BlockManager: Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
23/05/23 13:19:30 INFO BlockManagerMaster: Registering BlockManager BlockManagerId(driver, d01.capitalbiotech.local, 41352, None)
23/05/23 13:19:30 INFO BlockManagerMasterEndpoint: Registering block manager d01.capitalbiotech.local:41352 with 399.8 GB RAM, BlockManagerId(driver, d01.capitalbiotech.local, 41352, None)
23/05/23 13:19:30 INFO BlockManagerMaster: Registered BlockManager BlockManagerId(driver, d01.capitalbiotech.local, 41352, None)
23/05/23 13:19:30 INFO BlockManager: Initialized BlockManager: BlockManagerId(driver, d01.capitalbiotech.local, 41352, None)
13:19:30.590 INFO PathSeqPipelineSpark - Spark verbosity set to INFO (see --spark-verbosity argument)
23/05/23 13:19:30 INFO GoogleHadoopFileSystemBase: GHFS version: 1.9.4-hadoop3
23/05/23 13:19:31 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 392.4 KB, free 399.8 GB)
23/05/23 13:19:31 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 35.5 KB, free 399.8 GB)
23/05/23 13:19:31 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on d01.capitalbiotech.local:41352 (size: 35.5 KB, free: 399.8 GB)
23/05/23 13:19:31 INFO SparkContext: Created broadcast 0 from newAPIHadoopFile at PathSplitSource.java:96
13:19:32.136 WARN PathSeqPipelineSpark - --is-host-aligned is false but there are one or more sequences in the BAM header
23/05/23 13:19:32 INFO MemoryStore: Block broadcast_1 stored as values in memory (estimated size 392.4 KB, free 399.8 GB)
23/05/23 13:19:32 INFO MemoryStore: Block broadcast_1_piece0 stored as bytes in memory (estimated size 35.5 KB, free 399.8 GB)
23/05/23 13:19:32 INFO BlockManagerInfo: Added broadcast_1_piece0 in memory on d01.capitalbiotech.local:41352 (size: 35.5 KB, free: 399.8 GB)
23/05/23
13:19:32 INFO SparkContext: Created broadcast 1 from newAPIHadoopFile at PathSplitSource.java:96 23/05/23 13:19:32 INFO FileInputFormat: Total input files to process : 1 23/05/23 13:19:32 INFO SparkContext: Starting job: count at PathSeqPipelineSpark.java:244 23/05/23 13:19:32 INFO DAGScheduler: Registering RDD 25 (mapToPair at PSFilter.java:128) as input to shuffle 2 23/05/23 13:19:32 INFO DAGScheduler: Registering RDD 29 (mapToPair at PSFilter.java:128) as input to shuffle 1 23/05/23 13:19:32 INFO DAGScheduler: Registering RDD 34 (mapToPair at PSFilter.java:128) as input to shuffle 0 23/05/23 13:19:32 INFO DAGScheduler: Got job 0 (count at PathSeqPipelineSpark.java:244) with 244 output partitions 23/05/23 13:19:32 INFO DAGScheduler: Final stage: ResultStage 3 (count at PathSeqPipelineSpark.java:244) 23/05/23 13:19:32 INFO DAGScheduler: Parents of final stage: List(ShuffleMapStage 2) 23/05/23 13:19:32 INFO DAGScheduler: Missing parents: List(ShuffleMapStage 2) 23/05/23 13:19:32 INFO DAGScheduler: Submitting ShuffleMapStage 0 (MapPartitionsRDD[25] at mapToPair at PSFilter.java:128), which has no missing parents 23/05/23 13:19:32 INFO MemoryStore: Block broadcast_2 stored as values in memory (estimated size 276.6 KB, free 399.8 GB) 23/05/23 13:19:32 INFO MemoryStore: Block broadcast_2_piece0 stored as bytes in memory (estimated size 120.5 KB, free 399.8 GB) 23/05/23 13:19:32 INFO BlockManagerInfo: Added broadcast_2_piece0 in memory on d01.capitalbiotech.local:41352 (size: 120.5 KB, free: 399.8 GB) 23/05/23 13:19:32 INFO SparkContext: Created broadcast 2 from broadcast at DAGScheduler.scala:1163 23/05/23 13:19:32 INFO DAGScheduler: Submitting 244 missing tasks from ShuffleMapStage 0 (MapPartitionsRDD[25] at mapToPair at PSFilter.java:128) (first 15 tasks are for partitions Vector(0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14)) 23/05/23 13:19:32 INFO TaskSchedulerImpl: Adding task set 0.0 with 244 tasks 23/05/23 13:19:32 INFO TaskSetManager: Starting task 0.0 in 
stage 0.0 (TID 0, localhost, executor driver, partition 0, PROCESS_LOCAL, 8030 bytes) 23/05/23 13:19:32 INFO TaskSetManager: Starting task 1.0 in stage 0.0 (TID 1, localhost, executor driver, partition 1, PROCESS_LOCAL, 8030 bytes) ... 23/05/23 13:19:32 INFO TaskSetManager: Starting task 126.0 in stage 0.0 (TID 126, localhost, executor driver, partition 126, PROCESS_LOCAL, 8030 bytes) 23/05/23 13:19:32 INFO TaskSetManager: Starting task 127.0 in stage 0.0 (TID 127, localhost, executor driver, partition 127, PROCESS_LOCAL, 8030 bytes) 23/05/23 13:19:32 INFO Executor: Running task 5.0 in stage 0.0 (TID 5) ... 23/05/23 13:19:33 INFO Executor: Running task 127.0 in stage 0.0 (TID 127) 23/05/23 13:19:33 INFO BlockManagerInfo: Removed broadcast_0_piece0 on d01.capitalbiotech.local:41352 in memory (size: 35.5 KB, free: 399.8 GB) 23/05/23 13:19:34 INFO NewHadoopRDD: Input split: file:spaceranger_count/CRC_16/outs/possorted_genome_bam.bam:2080374784+33554432 ... 23/05/23 13:19:51 INFO Executor: Finished task 112.0 in stage 0.0 (TID 112). 1128 bytes result sent to driver 23/05/23 13:19:51 INFO TaskSetManager: Starting task 132.0 in stage 0.0 (TID 132, localhost, executor driver, partition 132, PROCESS_LOCAL, 8030 bytes) 23/05/23 13:19:51 INFO Executor: Running task 132.0 in stage 0.0 (TID 132) 23/05/23 13:19:51 INFO TaskSetManager: Finished task 123.0 in stage 0.0 (TID 123) in 18852 ms on localhost (executor driver) (1/244) ... 23/05/23 13:20:06 INFO Executor: Finished task 239.0 in stage 0.0 (TID 239). 
1128 bytes result sent to driver 23/05/23 13:20:06 INFO TaskSetManager: Finished task 239.0 in stage 0.0 (TID 239) in 8347 ms on localhost (executor driver) (244/244) 23/05/23 13:20:06 INFO TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool 23/05/23 13:20:06 INFO DAGScheduler: ShuffleMapStage 0 (mapToPair at PSFilter.java:128) finished in 33.663 s 23/05/23 13:20:06 INFO DAGScheduler: looking for newly runnable stages 23/05/23 13:20:06 INFO DAGScheduler: running: Set() 23/05/23 13:20:06 INFO DAGScheduler: waiting: Set(ShuffleMapStage 1, ShuffleMapStage 2, ResultStage 3) 23/05/23 13:20:06 INFO DAGScheduler: failed: Set() 23/05/23 13:20:06 INFO DAGScheduler: Submitting ShuffleMapStage 1 (MapPartitionsRDD[29] at mapToPair at PSFilter.java:128), which has no missing parents 23/05/23 13:20:06 INFO MemoryStore: Block broadcast_3 stored as values in memory (estimated size 5.6 KB, free 399.8 GB) 23/05/23 13:20:06 INFO MemoryStore: Block broadcast_3_piece0 stored as bytes in memory (estimated size 3.3 KB, free 399.8 GB) 23/05/23 13:20:06 INFO BlockManagerInfo: Added broadcast_3_piece0 in memory on d01.capitalbiotech.local:41352 (size: 3.3 KB, free: 399.8 GB) 23/05/23 13:20:06 INFO SparkContext: Created broadcast 3 from broadcast at DAGScheduler.scala:1163 23/05/23 13:20:06 INFO DAGScheduler: Submitting 244 missing tasks from ShuffleMapStage 1 (MapPartitionsRDD[29] at mapToPair at PSFilter.java:128) (first 15 tasks are for partitions Vector(0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14)) 23/05/23 13:20:06 INFO TaskSchedulerImpl: Adding task set 1.0 with 244 tasks 23/05/23 13:20:06 INFO TaskSetManager: Starting task 0.0 in stage 1.0 (TID 244, localhost, executor driver, partition 0, PROCESS_LOCAL, 7651 bytes) ... 23/05/23 13:20:06 INFO TaskSetManager: Starting task 127.0 in stage 1.0 (TID 371, localhost, executor driver, partition 127, PROCESS_LOCAL, 7651 bytes) 23/05/23 13:20:06 INFO Executor: Running task 0.0 in stage 1.0 (TID 244) ... 
23/05/23 13:20:06 INFO Executor: Running task 108.0 in stage 1.0 (TID 352) 23/05/23 13:20:06 INFO ShuffleBlockFetcherIterator: Getting 0 non-empty blocks including 0 local blocks and 0 remote blocks ... 23/05/23 13:20:06 INFO ShuffleBlockFetcherIterator: Getting 0 non-empty blocks including 0 local blocks and 0 remote blocks 23/05/23 13:20:06 INFO ShuffleBlockFetcherIterator: Started 0 remote fetches in 1 ms 23/05/23 13:20:06 INFO Executor: Finished task 111.0 in stage 1.0 (TID 355). 1300 bytes result sent to driver 23/05/23 13:20:06 INFO TaskSetManager: Starting task 128.0 in stage 1.0 (TID 372, localhost, executor driver, partition 128, PROCESS_LOCAL, 7651 bytes) 23/05/23 13:20:06 INFO TaskSetManager: Finished task 111.0 in stage 1.0 (TID 355) in 315 ms on localhost (executor driver) (1/244) 23/05/23 13:20:06 INFO Executor: Running task 128.0 in stage 1.0 (TID 372) ... 23/05/23 13:20:09 INFO TaskSetManager: Finished task 128.0 in stage 1.0 (TID 372) in 2495 ms on localhost (executor driver) (244/244) 23/05/23 13:20:09 INFO TaskSchedulerImpl: Removed TaskSet 1.0, whose tasks have all completed, from pool 23/05/23 13:20:09 INFO DAGScheduler: ShuffleMapStage 1 (mapToPair at PSFilter.java:128) finished in 2.853 s 23/05/23 13:20:09 INFO DAGScheduler: looking for newly runnable stages 23/05/23 13:20:09 INFO DAGScheduler: running: Set() 23/05/23 13:20:09 INFO DAGScheduler: waiting: Set(ShuffleMapStage 2, ResultStage 3) 23/05/23 13:20:09 INFO DAGScheduler: failed: Set() 23/05/23 13:20:09 INFO DAGScheduler: Submitting ShuffleMapStage 2 (MapPartitionsRDD[34] at mapToPair at PSFilter.java:128), which has no missing parents 23/05/23 13:20:09 INFO MemoryStore: Block broadcast_4 stored as values in memory (estimated size 8.1 KB, free 399.8 GB) 23/05/23 13:20:09 INFO MemoryStore: Block broadcast_4_piece0 stored as bytes in memory (estimated size 4.5 KB, free 399.8 GB) 23/05/23 13:20:09 INFO BlockManagerInfo: Added broadcast_4_piece0 in memory on d01.capitalbiotech.local:41352 
(size: 4.5 KB, free: 399.8 GB) 23/05/23 13:20:09 INFO SparkContext: Created broadcast 4 from broadcast at DAGScheduler.scala:1163 23/05/23 13:20:09 INFO DAGScheduler: Submitting 244 missing tasks from ShuffleMapStage 2 (MapPartitionsRDD[34] at mapToPair at PSFilter.java:128) (first 15 tasks are for partitions Vector(0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14)) 23/05/23 13:20:09 INFO TaskSchedulerImpl: Adding task set 2.0 with 244 tasks 23/05/23 13:20:09 INFO TaskSetManager: Starting task 0.0 in stage 2.0 (TID 488, localhost, executor driver, partition 0, PROCESS_LOCAL, 7651 bytes) ... 23/05/23 13:20:09 INFO TaskSetManager: Starting task 127.0 in stage 2.0 (TID 615, localhost, executor driver, partition 127, PROCESS_LOCAL, 7651 bytes) 23/05/23 13:20:09 INFO Executor: Running task 0.0 in stage 2.0 (TID 488) ... 23/05/23 13:20:09 INFO Executor: Running task 21.0 in stage 2.0 (TID 509) 23/05/23 13:20:09 INFO ShuffleBlockFetcherIterator: Getting 0 non-empty blocks including 0 local blocks and 0 remote blocks 23/05/23 13:20:09 INFO ShuffleBlockFetcherIterator: Started 0 remote fetches in 0 ms ... 23/05/23 13:20:09 INFO ShuffleBlockFetcherIterator: Started 0 remote fetches in 0 ms 23/05/23 13:20:09 INFO Executor: Finished task 100.0 in stage 2.0 (TID 588). 1343 bytes result sent to driver 23/05/23 13:20:09 INFO TaskSetManager: Starting task 128.0 in stage 2.0 (TID 616, localhost, executor driver, partition 128, PROCESS_LOCAL, 7651 bytes) 23/05/23 13:20:09 INFO Executor: Running task 128.0 in stage 2.0 (TID 616) 23/05/23 13:20:09 INFO TaskSetManager: Finished task 100.0 in stage 2.0 (TID 588) in 84 ms on localhost (executor driver) (1/244) 23/05/23 13:20:09 INFO Executor: Finished task 105.0 in stage 2.0 (TID 593). 1300 bytes result sent to driver ... 
23/05/23 13:20:11 INFO TaskSetManager: Finished task 187.0 in stage 2.0 (TID 675) in 2115 ms on localhost (executor driver) (244/244) 23/05/23 13:20:11 INFO TaskSchedulerImpl: Removed TaskSet 2.0, whose tasks have all completed, from pool 23/05/23 13:20:11 INFO DAGScheduler: ShuffleMapStage 2 (mapToPair at PSFilter.java:128) finished in 2.752 s 23/05/23 13:20:11 INFO DAGScheduler: looking for newly runnable stages 23/05/23 13:20:11 INFO DAGScheduler: running: Set() 23/05/23 13:20:11 INFO DAGScheduler: waiting: Set(ResultStage 3) 23/05/23 13:20:11 INFO DAGScheduler: failed: Set() 23/05/23 13:20:11 INFO DAGScheduler: Submitting ResultStage 3 (MapPartitionsRDD[38] at flatMap at PSPairedUnpairedSplitterSpark.java:50), which has no missing parents 23/05/23 13:20:11 INFO MemoryStore: Block broadcast_5 stored as values in memory (estimated size 5.6 KB, free 399.8 GB) 23/05/23 13:20:11 INFO MemoryStore: Block broadcast_5_piece0 stored as bytes in memory (estimated size 3.2 KB, free 399.8 GB) 23/05/23 13:20:11 INFO BlockManagerInfo: Added broadcast_5_piece0 in memory on d01.capitalbiotech.local:41352 (size: 3.2 KB, free: 399.8 GB) 23/05/23 13:20:11 INFO SparkContext: Created broadcast 5 from broadcast at DAGScheduler.scala:1163 23/05/23 13:20:11 INFO DAGScheduler: Submitting 244 missing tasks from ResultStage 3 (MapPartitionsRDD[38] at flatMap at PSPairedUnpairedSplitterSpark.java:50) (first 15 tasks are for partitions Vector(0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14)) 23/05/23 13:20:11 INFO TaskSchedulerImpl: Adding task set 3.0 with 244 tasks 23/05/23 13:20:11 INFO TaskSetManager: Starting task 0.0 in stage 3.0 (TID 732, localhost, executor driver, partition 0, PROCESS_LOCAL, 7662 bytes) ... 23/05/23 13:20:11 INFO TaskSetManager: Starting task 127.0 in stage 3.0 (TID 859, localhost, executor driver, partition 127, PROCESS_LOCAL, 7662 bytes) 23/05/23 13:20:11 INFO Executor: Running task 0.0 in stage 3.0 (TID 732) ... 
23/05/23 13:20:11 INFO ShuffleBlockFetcherIterator: Getting 0 non-empty blocks including 0 local blocks and 0 remote blocks 23/05/23 13:20:11 INFO Executor: Finished task 37.0 in stage 3.0 (TID 769). 967 bytes result sent to driver 23/05/23 13:20:11 INFO ShuffleBlockFetcherIterator: Started 0 remote fetches in 12 ms 23/05/23 13:20:11 INFO Executor: Finished task 48.0 in stage 3.0 (TID 780). 1010 bytes result sent to driver 23/05/23 13:20:11 INFO ShuffleBlockFetcherIterator: Started 0 remote fetches in 1 ms 23/05/23 13:20:11 INFO TaskSetManager: Finished task 60.0 in stage 3.0 (TID 792) in 42 ms on localhost (executor driver) (2/244) 23/05/23 13:20:11 INFO Executor: Finished task 45.0 in stage 3.0 (TID 777). 967 bytes result sent to driver 23/05/23 13:20:11 INFO ShuffleBlockFetcherIterator: Started 0 remote fetches in 12 ms ... 23/05/23 13:20:12 INFO TaskSetManager: Finished task 235.0 in stage 3.0 (TID 967) in 22 ms on localhost (executor driver) (244/244) 23/05/23 13:20:12 INFO TaskSchedulerImpl: Removed TaskSet 3.0, whose tasks have all completed, from pool 23/05/23 13:20:12 INFO DAGScheduler: ResultStage 3 (count at PathSeqPipelineSpark.java:244) finished in 0.156 s 23/05/23 13:20:12 INFO DAGScheduler: Job 0 finished: count at PathSeqPipelineSpark.java:244, took 39.619893 s 23/05/23 13:20:12 INFO SparkContext: Starting job: count at PathSeqPipelineSpark.java:245 23/05/23 13:20:12 INFO DAGScheduler: Got job 1 (count at PathSeqPipelineSpark.java:245) with 244 output partitions 23/05/23 13:20:12 INFO DAGScheduler: Final stage: ResultStage 7 (count at PathSeqPipelineSpark.java:245) 23/05/23 13:20:12 INFO DAGScheduler: Parents of final stage: List(ShuffleMapStage 6) 23/05/23 13:20:12 INFO DAGScheduler: Missing parents: List() 23/05/23 13:20:12 INFO DAGScheduler: Submitting ResultStage 7 (MapPartitionsRDD[39] at flatMap at PSPairedUnpairedSplitterSpark.java:57), which has no missing parents 23/05/23 13:20:12 INFO MemoryStore: Block broadcast_6 stored as values in 
memory (estimated size 5.6 KB, free 399.8 GB) 23/05/23 13:20:12 INFO MemoryStore: Block broadcast_6_piece0 stored as bytes in memory (estimated size 3.2 KB, free 399.8 GB) 23/05/23 13:20:12 INFO BlockManagerInfo: Added broadcast_6_piece0 in memory on d01.capitalbiotech.local:41352 (size: 3.2 KB, free: 399.8 GB) 23/05/23 13:20:12 INFO SparkContext: Created broadcast 6 from broadcast at DAGScheduler.scala:1163 23/05/23 13:20:12 INFO DAGScheduler: Submitting 244 missing tasks from ResultStage 7 (MapPartitionsRDD[39] at flatMap at PSPairedUnpairedSplitterSpark.java:57) (first 15 tasks are for partitions Vector(0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14)) 23/05/23 13:20:12 INFO TaskSchedulerImpl: Adding task set 7.0 with 244 tasks 23/05/23 13:20:12 INFO TaskSetManager: Starting task 0.0 in stage 7.0 (TID 976, localhost, executor driver, partition 0, PROCESS_LOCAL, 7662 bytes) ... 23/05/23 13:20:12 INFO TaskSetManager: Starting task 127.0 in stage 7.0 (TID 1103, localhost, executor driver, partition 127, PROCESS_LOCAL, 7662 bytes) 23/05/23 13:20:12 INFO Executor: Running task 5.0 in stage 7.0 (TID 981) ... 23/05/23 13:20:12 INFO Executor: Running task 11.0 in stage 7.0 (TID 987) 23/05/23 13:20:12 INFO ShuffleBlockFetcherIterator: Getting 0 non-empty blocks including 0 local blocks and 0 remote blocks ... 23/05/23 13:20:12 INFO ShuffleBlockFetcherIterator: Started 0 remote fetches in 5 ms 23/05/23 13:20:12 INFO Executor: Running task 129.0 in stage 7.0 (TID 1105) 23/05/23 13:20:12 INFO TaskSetManager: Finished task 126.0 in stage 7.0 (TID 1102) in 72 ms on localhost (executor driver) (2/244) 23/05/23 13:20:12 INFO ShuffleBlockFetcherIterator: Started 0 remote fetches in 5 ms ... 
23/05/23 13:20:12 INFO TaskSetManager: Finished task 243.0 in stage 7.0 (TID 1219) in 14 ms on localhost (executor driver) (244/244) 23/05/23 13:20:12 INFO TaskSchedulerImpl: Removed TaskSet 7.0, whose tasks have all completed, from pool 23/05/23 13:20:12 INFO DAGScheduler: ResultStage 7 (count at PathSeqPipelineSpark.java:245) finished in 0.175 s 23/05/23 13:20:12 INFO DAGScheduler: Job 1 finished: count at PathSeqPipelineSpark.java:245, took 0.184459 s 23/05/23 13:20:12 INFO SparkContext: Starting job: foreach at BwaMemIndexCache.java:84 23/05/23 13:20:12 INFO DAGScheduler: Got job 2 (foreach at BwaMemIndexCache.java:84) with 128 output partitions 23/05/23 13:20:12 INFO DAGScheduler: Final stage: ResultStage 8 (foreach at BwaMemIndexCache.java:84) 23/05/23 13:20:12 INFO DAGScheduler: Parents of final stage: List() 23/05/23 13:20:12 INFO DAGScheduler: Missing parents: List() 23/05/23 13:20:12 INFO DAGScheduler: Submitting ResultStage 8 (ParallelCollectionRDD[40] at parallelize at BwaMemIndexCache.java:84), which has no missing parents 23/05/23 13:20:12 INFO MemoryStore: Block broadcast_7 stored as values in memory (estimated size 2.4 KB, free 399.8 GB) 23/05/23 13:20:12 INFO MemoryStore: Block broadcast_7_piece0 stored as bytes in memory (estimated size 1555.0 B, free 399.8 GB) 23/05/23 13:20:12 INFO BlockManagerInfo: Added broadcast_7_piece0 in memory on d01.capitalbiotech.local:41352 (size: 1555.0 B, free: 399.8 GB) 23/05/23 13:20:12 INFO SparkContext: Created broadcast 7 from broadcast at DAGScheduler.scala:1163 23/05/23 13:20:12 INFO DAGScheduler: Submitting 128 missing tasks from ResultStage 8 (ParallelCollectionRDD[40] at parallelize at BwaMemIndexCache.java:84) (first 15 tasks are for partitions Vector(0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14)) 23/05/23 13:20:12 INFO TaskSchedulerImpl: Adding task set 8.0 with 128 tasks 23/05/23 13:20:12 INFO TaskSetManager: Starting task 0.0 in stage 8.0 (TID 1220, localhost, executor driver, partition 0, 
PROCESS_LOCAL, 7723 bytes) ... 23/05/23 13:20:12 INFO TaskSetManager: Starting task 127.0 in stage 8.0 (TID 1347, localhost, executor driver, partition 127, PROCESS_LOCAL, 7724 bytes) 23/05/23 13:20:12 INFO Executor: Running task 0.0 in stage 8.0 (TID 1220) ... 23/05/23 13:20:12 INFO Executor: Finished task 95.0 in stage 8.0 (TID 1315). 624 bytes result sent to driver 23/05/23 13:20:12 INFO TaskSetManager: Finished task 95.0 in stage 8.0 (TID 1315) in 109 ms on localhost (executor driver) (1/128) ... 23/05/23 13:20:12 INFO TaskSetManager: Finished task 4.0 in stage 8.0 (TID 1224) in 369 ms on localhost (executor driver) (128/128) 23/05/23 13:20:12 INFO TaskSchedulerImpl: Removed TaskSet 8.0, whose tasks have all completed, from pool 23/05/23 13:20:12 INFO DAGScheduler: ResultStage 8 (foreach at BwaMemIndexCache.java:84) finished in 0.401 s 23/05/23 13:20:12 INFO DAGScheduler: Job 2 finished: foreach at BwaMemIndexCache.java:84, took 0.404961 s 23/05/23 13:20:12 INFO SparkContext: Starting job: foreach at ContainsKmerReadFilterSpark.java:46 23/05/23 13:20:12 INFO DAGScheduler: Got job 3 (foreach at ContainsKmerReadFilterSpark.java:46) with 128 output partitions 23/05/23 13:20:12 INFO DAGScheduler: Final stage: ResultStage 9 (foreach at ContainsKmerReadFilterSpark.java:46) 23/05/23 13:20:12 INFO DAGScheduler: Parents of final stage: List() 23/05/23 13:20:12 INFO DAGScheduler: Missing parents: List() 23/05/23 13:20:12 INFO DAGScheduler: Submitting ResultStage 9 (ParallelCollectionRDD[41] at parallelize at ContainsKmerReadFilterSpark.java:46), which has no missing parents 23/05/23 13:20:12 INFO MemoryStore: Block broadcast_8 stored as values in memory (estimated size 2.5 KB, free 399.8 GB) 23/05/23 13:20:12 INFO MemoryStore: Block broadcast_8_piece0 stored as bytes in memory (estimated size 1606.0 B, free 399.8 GB) 23/05/23 13:20:12 INFO BlockManagerInfo: Added broadcast_8_piece0 in memory on d01.capitalbiotech.local:41352 (size: 1606.0 B, free: 399.8 GB) 23/05/23 
13:20:12 INFO SparkContext: Created broadcast 8 from broadcast at DAGScheduler.scala:1163 23/05/23 13:20:12 INFO DAGScheduler: Submitting 128 missing tasks from ResultStage 9 (ParallelCollectionRDD[41] at parallelize at ContainsKmerReadFilterSpark.java:46) (first 15 tasks are for partitions Vector(0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14)) 23/05/23 13:20:12 INFO TaskSchedulerImpl: Adding task set 9.0 with 128 tasks 23/05/23 13:20:12 INFO TaskSetManager: Starting task 0.0 in stage 9.0 (TID 1348, localhost, executor driver, partition 0, PROCESS_LOCAL, 7723 bytes) ... 23/05/23 13:20:12 INFO TaskSetManager: Starting task 127.0 in stage 9.0 (TID 1475, localhost, executor driver, partition 127, PROCESS_LOCAL, 7724 bytes) 23/05/23 13:20:12 INFO Executor: Running task 0.0 in stage 9.0 (TID 1348) ... 23/05/23 13:20:12 INFO Executor: Running task 108.0 in stage 9.0 (TID 1456) 23/05/23 13:20:12 INFO Executor: Finished task 24.0 in stage 9.0 (TID 1372). 624 bytes result sent to driver 23/05/23 13:20:12 INFO TaskSetManager: Finished task 24.0 in stage 9.0 (TID 1372) in 262 ms on localhost (executor driver) (1/128) ... 23/05/23 13:20:13 INFO Executor: Finished task 79.0 in stage 9.0 (TID 1427). 
667 bytes result sent to driver 23/05/23 13:20:13 INFO TaskSetManager: Finished task 79.0 in stage 9.0 (TID 1427) in 173 ms on localhost (executor driver) (128/128) 23/05/23 13:20:13 INFO TaskSchedulerImpl: Removed TaskSet 9.0, whose tasks have all completed, from pool 23/05/23 13:20:13 INFO DAGScheduler: ResultStage 9 (foreach at ContainsKmerReadFilterSpark.java:46) finished in 0.394 s 23/05/23 13:20:13 INFO DAGScheduler: Job 3 finished: foreach at ContainsKmerReadFilterSpark.java:46, took 0.396650 s 23/05/23 13:20:13 INFO MemoryStore: Block broadcast_9 stored as values in memory (estimated size 53.8 MB, free 399.8 GB) 23/05/23 13:20:13 INFO MemoryStore: Block broadcast_9_piece0 stored as bytes in memory (estimated size 1567.0 KB, free 399.8 GB) 23/05/23 13:20:13 INFO BlockManagerInfo: Added broadcast_9_piece0 in memory on d01.capitalbiotech.local:41352 (size: 1567.0 KB, free: 399.8 GB) 23/05/23 13:20:13 INFO SparkContext: Created broadcast 9 from broadcast at PathSeqPipelineSpark.java:261 23/05/23 13:20:13 INFO MemoryStore: Block broadcast_10 stored as values in memory (estimated size 15.3 MB, free 399.8 GB) 23/05/23 13:20:13 INFO MemoryStore: Block broadcast_10_piece0 stored as bytes in memory (estimated size 1285.2 KB, free 399.8 GB) 23/05/23 13:20:13 INFO BlockManagerInfo: Added broadcast_10_piece0 in memory on d01.capitalbiotech.local:41352 (size: 1285.2 KB, free: 399.8 GB) 23/05/23 13:20:13 INFO SparkContext: Created broadcast 10 from broadcast at PSScorer.java:49 23/05/23 13:20:13 INFO SparkContext: Starting job: collectAsMap at PSScorer.java:71 23/05/23 13:20:13 INFO DAGScheduler: Registering RDD 43 (repartition at PathSeqPipelineSpark.java:197) as input to shuffle 3 23/05/23 13:20:13 INFO DAGScheduler: Registering RDD 48 (repartition at PathSeqPipelineSpark.java:256) as input to shuffle 4 23/05/23 13:20:13 INFO DAGScheduler: Registering RDD 60 (mapPartitionsToPair at PSScorer.java:68) as input to shuffle 5 23/05/23 13:20:13 INFO DAGScheduler: Got job 4 
(collectAsMap at PSScorer.java:71) with 2 output partitions 23/05/23 13:20:13 INFO DAGScheduler: Final stage: ResultStage 16 (collectAsMap at PSScorer.java:71) 23/05/23 13:20:13 INFO DAGScheduler: Parents of final stage: List(ShuffleMapStage 15) 23/05/23 13:20:13 INFO DAGScheduler: Missing parents: List(ShuffleMapStage 15) 23/05/23 13:20:13 INFO DAGScheduler: Submitting ShuffleMapStage 13 (MapPartitionsRDD[43] at repartition at PathSeqPipelineSpark.java:197), which has no missing parents 23/05/23 13:20:13 INFO MemoryStore: Block broadcast_11 stored as values in memory (estimated size 8.3 KB, free 399.8 GB) 23/05/23 13:20:13 INFO MemoryStore: Block broadcast_11_piece0 stored as bytes in memory (estimated size 4.4 KB, free 399.8 GB) 23/05/23 13:20:13 INFO BlockManagerInfo: Added broadcast_11_piece0 in memory on d01.capitalbiotech.local:41352 (size: 4.4 KB, free: 399.8 GB) 23/05/23 13:20:13 INFO SparkContext: Created broadcast 11 from broadcast at DAGScheduler.scala:1163 23/05/23 13:20:13 INFO DAGScheduler: Submitting 244 missing tasks from ShuffleMapStage 13 (MapPartitionsRDD[43] at repartition at PathSeqPipelineSpark.java:197) (first 15 tasks are for partitions Vector(0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14)) 23/05/23 13:20:13 INFO TaskSchedulerImpl: Adding task set 13.0 with 244 tasks 23/05/23 13:20:13 INFO DAGScheduler: Submitting ShuffleMapStage 14 (MapPartitionsRDD[48] at repartition at PathSeqPipelineSpark.java:256), which has no missing parents 23/05/23 13:20:13 INFO TaskSetManager: Starting task 0.0 in stage 13.0 (TID 1476, localhost, executor driver, partition 0, PROCESS_LOCAL, 7651 bytes) ... 23/05/23 13:20:13 INFO TaskSetManager: Starting task 127.0 in stage 13.0 (TID 1603, localhost, executor driver, partition 127, PROCESS_LOCAL, 7651 bytes) 23/05/23 13:20:13 INFO Executor: Running task 1.0 in stage 13.0 (TID 1477) ... 
23/05/23 13:20:13 INFO ShuffleBlockFetcherIterator: Started 0 remote fetches in 5 ms 23/05/23 13:20:13 INFO BlockManagerInfo: Added broadcast_12_piece0 in memory on d01.capitalbiotech.local:41352 (size: 3.5 KB, free: 399.8 GB) 23/05/23 13:20:13 INFO SparkContext: Created broadcast 12 from broadcast at DAGScheduler.scala:1163 23/05/23 13:20:13 INFO DAGScheduler: Submitting 244 missing tasks from ShuffleMapStage 14 (MapPartitionsRDD[48] at repartition at PathSeqPipelineSpark.java:256) (first 15 tasks are for partitions Vector(0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14)) 23/05/23 13:20:13 INFO TaskSchedulerImpl: Adding task set 14.0 with 244 tasks 23/05/23 13:20:14 INFO Executor: Finished task 59.0 in stage 13.0 (TID 1535). 1010 bytes result sent to driver 23/05/23 13:20:14 INFO TaskSetManager: Starting task 128.0 in stage 13.0 (TID 1604, localhost, executor driver, partition 128, PROCESS_LOCAL, 7651 bytes) 23/05/23 13:20:14 INFO Executor: Running task 128.0 in stage 13.0 (TID 1604) 23/05/23 13:20:14 INFO TaskSetManager: Finished task 59.0 in stage 13.0 (TID 1535) in 308 ms on localhost (executor driver) (1/244) 23/05/23 13:20:14 INFO ShuffleBlockFetcherIterator: Getting 0 non-empty blocks including 0 local blocks and 0 remote blocks 23/05/23 13:20:14 INFO ShuffleBlockFetcherIterator: Started 0 remote fetches in 0 ms 23/05/23 13:20:14 INFO Executor: Finished task 41.0 in stage 13.0 (TID 1517). 1010 bytes result sent to driver ... 
23/05/23 13:20:15 INFO TaskSetManager: Starting task 123.0 in stage 14.0 (TID 1843, localhost, executor driver, partition 123, PROCESS_LOCAL, 7651 bytes) 23/05/23 13:20:15 INFO TaskSetManager: Finished task 118.0 in stage 14.0 (TID 1838) in 30 ms on localhost (executor driver) (4/244) 23/05/23 13:20:15 INFO Executor: Running task 123.0 in stage 14.0 (TID 1843) 23/05/23 13:20:15 INFO ShuffleBlockFetcherIterator: Getting 0 non-empty blocks including 0 local blocks and 0 remote blocks 23/05/23 13:20:15 INFO ShuffleBlockFetcherIterator: Started 0 remote fetches in 0 ms ... 23/05/23 13:20:16 INFO Executor: Finished task 236.0 in stage 13.0 (TID 1712). 967 bytes result sent to driver 23/05/23 13:20:16 INFO TaskSetManager: Finished task 236.0 in stage 13.0 (TID 1712) in 2140 ms on localhost (executor driver) (244/244) 23/05/23 13:20:16 INFO TaskSchedulerImpl: Removed TaskSet 13.0, whose tasks have all completed, from pool 23/05/23 13:20:16 INFO DAGScheduler: ShuffleMapStage 13 (repartition at PathSeqPipelineSpark.java:197) finished in 3.179 s 23/05/23 13:20:16 INFO DAGScheduler: looking for newly runnable stages 23/05/23 13:20:16 INFO DAGScheduler: running: Set(ShuffleMapStage 14) 23/05/23 13:20:16 INFO DAGScheduler: waiting: Set(ShuffleMapStage 15, ResultStage 16) 23/05/23 13:20:16 INFO DAGScheduler: failed: Set() 23/05/23 13:20:16 INFO Executor: Finished task 243.0 in stage 14.0 (TID 1963). 1010 bytes result sent to driver 23/05/23 13:20:16 INFO TaskSetManager: Finished task 243.0 in stage 14.0 (TID 1963) in 49 ms on localhost (executor driver) (124/244) 23/05/23 13:20:16 INFO Executor: Finished task 242.0 in stage 14.0 (TID 1962). 1010 bytes result sent to driver ... 23/05/23 13:20:18 INFO Executor: Finished task 120.0 in stage 14.0 (TID 1840). 
967 bytes result sent to driver 23/05/23 13:20:18 INFO TaskSetManager: Finished task 120.0 in stage 14.0 (TID 1840) in 2438 ms on localhost (executor driver) (244/244) 23/05/23 13:20:18 INFO TaskSchedulerImpl: Removed TaskSet 14.0, whose tasks have all completed, from pool 23/05/23 13:20:18 INFO DAGScheduler: ShuffleMapStage 14 (repartition at PathSeqPipelineSpark.java:256) finished in 4.303 s 23/05/23 13:20:18 INFO DAGScheduler: looking for newly runnable stages 23/05/23 13:20:18 INFO DAGScheduler: running: Set() 23/05/23 13:20:18 INFO DAGScheduler: waiting: Set(ShuffleMapStage 15, ResultStage 16) 23/05/23 13:20:18 INFO DAGScheduler: failed: Set() 23/05/23 13:20:18 INFO DAGScheduler: Submitting ShuffleMapStage 15 (MapPartitionsRDD[60] at mapPartitionsToPair at PSScorer.java:68), which has no missing parents 23/05/23 13:20:18 INFO MemoryStore: Block broadcast_13 stored as values in memory (estimated size 12.4 KB, free 399.8 GB) 23/05/23 13:20:18 INFO MemoryStore: Block broadcast_13_piece0 stored as bytes in memory (estimated size 6.4 KB, free 399.8 GB) 23/05/23 13:20:18 INFO BlockManagerInfo: Added broadcast_13_piece0 in memory on d01.capitalbiotech.local:41352 (size: 6.4 KB, free: 399.8 GB) 23/05/23 13:20:18 INFO SparkContext: Created broadcast 13 from broadcast at DAGScheduler.scala:1163 23/05/23 13:20:18 INFO DAGScheduler: Submitting 2 missing tasks from ShuffleMapStage 15 (MapPartitionsRDD[60] at mapPartitionsToPair at PSScorer.java:68) (first 15 tasks are for partitions Vector(0, 1)) 23/05/23 13:20:18 INFO TaskSchedulerImpl: Adding task set 15.0 with 2 tasks 23/05/23 13:20:18 INFO TaskSetManager: Starting task 0.0 in stage 15.0 (TID 1964, localhost, executor driver, partition 0, PROCESS_LOCAL, 8036 bytes) 23/05/23 13:20:18 INFO TaskSetManager: Starting task 1.0 in stage 15.0 (TID 1965, localhost, executor driver, partition 1, PROCESS_LOCAL, 8036 bytes) 23/05/23 13:20:18 INFO Executor: Running task 0.0 in stage 15.0 (TID 1964) 23/05/23 13:20:18 INFO Executor: 
Running task 1.0 in stage 15.0 (TID 1965) 23/05/23 13:20:18 INFO ShuffleBlockFetcherIterator: Getting 0 non-empty blocks including 0 local blocks and 0 remote blocks 23/05/23 13:20:18 INFO ShuffleBlockFetcherIterator: Getting 0 non-empty blocks including 0 local blocks and 0 remote blocks 23/05/23 13:20:18 INFO ShuffleBlockFetcherIterator: Started 0 remote fetches in 0 ms 23/05/23 13:20:18 INFO ShuffleBlockFetcherIterator: Started 0 remote fetches in 1 ms 23/05/23 13:20:18 INFO MemoryStore: Block rdd_53_0 stored as values in memory (estimated size 0.0 B, free 399.8 GB) 23/05/23 13:20:18 INFO MemoryStore: Block rdd_52_0 stored as values in memory (estimated size 0.0 B, free 399.8 GB) 23/05/23 13:20:18 INFO BlockManagerInfo: Added rdd_52_0 in memory on d01.capitalbiotech.local:41352 (size: 0.0 B, free: 399.8 GB) 23/05/23 13:20:18 INFO BlockManagerInfo: Added rdd_53_0 in memory on d01.capitalbiotech.local:41352 (size: 0.0 B, free: 399.8 GB) 23/05/23 13:20:18 INFO Executor: Finished task 0.0 in stage 15.0 (TID 1964). 1226 bytes result sent to driver 23/05/23 13:20:18 INFO TaskSetManager: Finished task 0.0 in stage 15.0 (TID 1964) in 147 ms on localhost (executor driver) (1/2) 23/05/23 13:20:18 INFO Executor: Finished task 1.0 in stage 15.0 (TID 1965). 
1183 bytes result sent to driver 23/05/23 13:20:18 INFO TaskSetManager: Finished task 1.0 in stage 15.0 (TID 1965) in 146 ms on localhost (executor driver) (2/2) 23/05/23 13:20:18 INFO TaskSchedulerImpl: Removed TaskSet 15.0, whose tasks have all completed, from pool 23/05/23 13:20:18 INFO DAGScheduler: ShuffleMapStage 15 (mapPartitionsToPair at PSScorer.java:68) finished in 0.185 s 23/05/23 13:20:18 INFO DAGScheduler: looking for newly runnable stages 23/05/23 13:20:18 INFO DAGScheduler: running: Set() 23/05/23 13:20:18 INFO DAGScheduler: waiting: Set(ResultStage 16) 23/05/23 13:20:18 INFO DAGScheduler: failed: Set() 23/05/23 13:20:18 INFO DAGScheduler: Submitting ResultStage 16 (ShuffledRDD[61] at reduceByKey at PSScorer.java:71), which has no missing parents 23/05/23 13:20:18 INFO MemoryStore: Block broadcast_14 stored as values in memory (estimated size 4.7 KB, free 399.8 GB) 23/05/23 13:20:18 INFO MemoryStore: Block broadcast_14_piece0 stored as bytes in memory (estimated size 2.6 KB, free 399.8 GB) 23/05/23 13:20:18 INFO BlockManagerInfo: Added broadcast_14_piece0 in memory on d01.capitalbiotech.local:41352 (size: 2.6 KB, free: 399.8 GB) 23/05/23 13:20:18 INFO SparkContext: Created broadcast 14 from broadcast at DAGScheduler.scala:1163 23/05/23 13:20:18 INFO DAGScheduler: Submitting 2 missing tasks from ResultStage 16 (ShuffledRDD[61] at reduceByKey at PSScorer.java:71) (first 15 tasks are for partitions Vector(0, 1)) 23/05/23 13:20:18 INFO TaskSchedulerImpl: Adding task set 16.0 with 2 tasks 23/05/23 13:20:18 INFO TaskSetManager: Starting task 0.0 in stage 16.0 (TID 1966, localhost, executor driver, partition 0, PROCESS_LOCAL, 7662 bytes) 23/05/23 13:20:18 INFO TaskSetManager: Starting task 1.0 in stage 16.0 (TID 1967, localhost, executor driver, partition 1, PROCESS_LOCAL, 7662 bytes) 23/05/23 13:20:18 INFO Executor: Running task 0.0 in stage 16.0 (TID 1966) 23/05/23 13:20:18 INFO Executor: Running task 1.0 in stage 16.0 (TID 1967) 23/05/23 13:20:18 INFO 
ShuffleBlockFetcherIterator: Getting 0 non-empty blocks including 0 local blocks and 0 remote blocks 23/05/23 13:20:18 INFO ShuffleBlockFetcherIterator: Getting 0 non-empty blocks including 0 local blocks and 0 remote blocks 23/05/23 13:20:18 INFO ShuffleBlockFetcherIterator: Started 0 remote fetches in 0 ms 23/05/23 13:20:18 INFO ShuffleBlockFetcherIterator: Started 0 remote fetches in 0 ms 23/05/23 13:20:18 INFO Executor: Finished task 1.0 in stage 16.0 (TID 1967). 1098 bytes result sent to driver 23/05/23 13:20:18 INFO Executor: Finished task 0.0 in stage 16.0 (TID 1966). 1098 bytes result sent to driver 23/05/23 13:20:18 INFO TaskSetManager: Finished task 1.0 in stage 16.0 (TID 1967) in 23 ms on localhost (executor driver) (1/2) 23/05/23 13:20:18 INFO TaskSetManager: Finished task 0.0 in stage 16.0 (TID 1966) in 24 ms on localhost (executor driver) (2/2) 23/05/23 13:20:18 INFO TaskSchedulerImpl: Removed TaskSet 16.0, whose tasks have all completed, from pool 23/05/23 13:20:18 INFO DAGScheduler: ResultStage 16 (collectAsMap at PSScorer.java:71) finished in 0.034 s 23/05/23 13:20:18 INFO DAGScheduler: Job 4 finished: collectAsMap at PSScorer.java:71, took 4.580518 s 23/05/23 13:20:18 INFO SparkContext: Starting job: collect at PSBwaUtils.java:59 23/05/23 13:20:18 INFO DAGScheduler: Registering RDD 63 (distinct at PSBwaUtils.java:59) as input to shuffle 6 23/05/23 13:20:18 INFO DAGScheduler: Got job 5 (collect at PSBwaUtils.java:59) with 2 output partitions 23/05/23 13:20:18 INFO DAGScheduler: Final stage: ResultStage 23 (collect at PSBwaUtils.java:59) 23/05/23 13:20:18 INFO DAGScheduler: Parents of final stage: List(ShuffleMapStage 22) 23/05/23 13:20:18 INFO DAGScheduler: Missing parents: List(ShuffleMapStage 22) 23/05/23 13:20:18 INFO DAGScheduler: Submitting ShuffleMapStage 22 (MapPartitionsRDD[63] at distinct at PSBwaUtils.java:59), which has no missing parents 23/05/23 13:20:18 INFO MemoryStore: Block broadcast_15 stored as values in memory (estimated size 
11.8 KB, free 399.8 GB) 23/05/23 13:20:18 INFO MemoryStore: Block broadcast_15_piece0 stored as bytes in memory (estimated size 6.2 KB, free 399.8 GB) 23/05/23 13:20:18 INFO BlockManagerInfo: Added broadcast_15_piece0 in memory on d01.capitalbiotech.local:41352 (size: 6.2 KB, free: 399.8 GB) 23/05/23 13:20:18 INFO SparkContext: Created broadcast 15 from broadcast at DAGScheduler.scala:1163 23/05/23 13:20:18 INFO DAGScheduler: Submitting 2 missing tasks from ShuffleMapStage 22 (MapPartitionsRDD[63] at distinct at PSBwaUtils.java:59) (first 15 tasks are for partitions Vector(0, 1)) 23/05/23 13:20:18 INFO TaskSchedulerImpl: Adding task set 22.0 with 2 tasks 23/05/23 13:20:18 INFO TaskSetManager: Starting task 0.0 in stage 22.0 (TID 1968, localhost, executor driver, partition 0, PROCESS_LOCAL, 8036 bytes) 23/05/23 13:20:18 INFO TaskSetManager: Starting task 1.0 in stage 22.0 (TID 1969, localhost, executor driver, partition 1, PROCESS_LOCAL, 8036 bytes) 23/05/23 13:20:18 INFO Executor: Running task 1.0 in stage 22.0 (TID 1969) 23/05/23 13:20:18 INFO Executor: Running task 0.0 in stage 22.0 (TID 1968) 23/05/23 13:20:18 INFO BlockManager: Found block rdd_53_0 locally 23/05/23 13:20:18 INFO BlockManager: Found block rdd_52_0 locally 23/05/23 13:20:18 INFO Executor: Finished task 1.0 in stage 22.0 (TID 1969). 925 bytes result sent to driver 23/05/23 13:20:18 INFO TaskSetManager: Finished task 1.0 in stage 22.0 (TID 1969) in 31 ms on localhost (executor driver) (1/2) 23/05/23 13:20:18 INFO Executor: Finished task 0.0 in stage 22.0 (TID 1968). 
925 bytes result sent to driver 23/05/23 13:20:18 INFO TaskSetManager: Finished task 0.0 in stage 22.0 (TID 1968) in 38 ms on localhost (executor driver) (2/2) 23/05/23 13:20:18 INFO TaskSchedulerImpl: Removed TaskSet 22.0, whose tasks have all completed, from pool 23/05/23 13:20:18 INFO DAGScheduler: ShuffleMapStage 22 (distinct at PSBwaUtils.java:59) finished in 0.053 s 23/05/23 13:20:18 INFO DAGScheduler: looking for newly runnable stages 23/05/23 13:20:18 INFO DAGScheduler: running: Set() 23/05/23 13:20:18 INFO DAGScheduler: waiting: Set(ResultStage 23) 23/05/23 13:20:18 INFO DAGScheduler: failed: Set() 23/05/23 13:20:18 INFO DAGScheduler: Submitting ResultStage 23 (MapPartitionsRDD[65] at distinct at PSBwaUtils.java:59), which has no missing parents 23/05/23 13:20:18 INFO MemoryStore: Block broadcast_16 stored as values in memory (estimated size 4.3 KB, free 399.8 GB) 23/05/23 13:20:18 INFO MemoryStore: Block broadcast_16_piece0 stored as bytes in memory (estimated size 2.5 KB, free 399.8 GB) 23/05/23 13:20:18 INFO BlockManagerInfo: Added broadcast_16_piece0 in memory on d01.capitalbiotech.local:41352 (size: 2.5 KB, free: 399.8 GB) 23/05/23 13:20:18 INFO SparkContext: Created broadcast 16 from broadcast at DAGScheduler.scala:1163 23/05/23 13:20:18 INFO DAGScheduler: Submitting 2 missing tasks from ResultStage 23 (MapPartitionsRDD[65] at distinct at PSBwaUtils.java:59) (first 15 tasks are for partitions Vector(0, 1)) 23/05/23 13:20:18 INFO TaskSchedulerImpl: Adding task set 23.0 with 2 tasks 23/05/23 13:20:18 INFO TaskSetManager: Starting task 0.0 in stage 23.0 (TID 1970, localhost, executor driver, partition 0, PROCESS_LOCAL, 7662 bytes) 23/05/23 13:20:18 INFO TaskSetManager: Starting task 1.0 in stage 23.0 (TID 1971, localhost, executor driver, partition 1, PROCESS_LOCAL, 7662 bytes) 23/05/23 13:20:18 INFO Executor: Running task 0.0 in stage 23.0 (TID 1970) 23/05/23 13:20:18 INFO Executor: Running task 1.0 in stage 23.0 (TID 1971) 23/05/23 13:20:18 INFO 
ShuffleBlockFetcherIterator: Getting 0 non-empty blocks including 0 local blocks and 0 remote blocks 23/05/23 13:20:18 INFO ShuffleBlockFetcherIterator: Getting 0 non-empty blocks including 0 local blocks and 0 remote blocks 23/05/23 13:20:18 INFO ShuffleBlockFetcherIterator: Started 0 remote fetches in 0 ms 23/05/23 13:20:18 INFO ShuffleBlockFetcherIterator: Started 0 remote fetches in 0 ms 23/05/23 13:20:18 INFO Executor: Finished task 1.0 in stage 23.0 (TID 1971). 1098 bytes result sent to driver 23/05/23 13:20:18 INFO Executor: Finished task 0.0 in stage 23.0 (TID 1970). 1098 bytes result sent to driver 23/05/23 13:20:18 INFO TaskSetManager: Finished task 1.0 in stage 23.0 (TID 1971) in 14 ms on localhost (executor driver) (1/2) 23/05/23 13:20:18 INFO TaskSetManager: Finished task 0.0 in stage 23.0 (TID 1970) in 15 ms on localhost (executor driver) (2/2) 23/05/23 13:20:18 INFO TaskSchedulerImpl: Removed TaskSet 23.0, whose tasks have all completed, from pool 23/05/23 13:20:18 INFO DAGScheduler: ResultStage 23 (collect at PSBwaUtils.java:59) finished in 0.026 s 23/05/23 13:20:18 INFO DAGScheduler: Job 5 finished: collect at PSBwaUtils.java:59, took 0.091434 s 23/05/23 13:20:18 INFO MemoryStore: Block broadcast_17 stored as values in memory (estimated size 7.3 KB, free 399.8 GB) 23/05/23 13:20:18 INFO MemoryStore: Block broadcast_17_piece0 stored as bytes in memory (estimated size 679.0 B, free 399.8 GB) 23/05/23 13:20:18 INFO BlockManagerInfo: Added broadcast_17_piece0 in memory on d01.capitalbiotech.local:41352 (size: 679.0 B, free: 399.8 GB) 23/05/23 13:20:18 INFO SparkContext: Created broadcast 17 from broadcast at ReadsSparkSink.java:146 23/05/23 13:20:18 INFO MemoryStore: Block broadcast_18 stored as values in memory (estimated size 7.3 KB, free 399.8 GB) 23/05/23 13:20:18 INFO MemoryStore: Block broadcast_18_piece0 stored as bytes in memory (estimated size 679.0 B, free 399.8 GB) 23/05/23 13:20:18 INFO BlockManagerInfo: Added broadcast_18_piece0 in memory 
on d01.capitalbiotech.local:41352 (size: 679.0 B, free: 399.8 GB) 23/05/23 13:20:18 INFO SparkContext: Created broadcast 18 from broadcast at BamSink.java:76 23/05/23 13:20:18 INFO FileOutputCommitter: File Output Committer Algorithm version is 2 23/05/23 13:20:18 INFO FileOutputCommitter: FileOutputCommitter skip cleanup _temporary folders under output directory:false, ignore cleanup failures: false 23/05/23 13:20:18 INFO SparkContext: Starting job: runJob at SparkHadoopWriter.scala:78 23/05/23 13:20:18 INFO DAGScheduler: Registering RDD 68 (mapToPair at SparkUtils.java:161) as input to shuffle 7 23/05/23 13:20:18 INFO DAGScheduler: Got job 6 (runJob at SparkHadoopWriter.scala:78) with 1 output partitions 23/05/23 13:20:18 INFO DAGScheduler: Final stage: ResultStage 30 (runJob at SparkHadoopWriter.scala:78) 23/05/23 13:20:18 INFO DAGScheduler: Parents of final stage: List(ShuffleMapStage 29) 23/05/23 13:20:18 INFO DAGScheduler: Missing parents: List(ShuffleMapStage 29) 23/05/23 13:20:18 INFO DAGScheduler: Submitting ShuffleMapStage 29 (MapPartitionsRDD[68] at mapToPair at SparkUtils.java:161), which has no missing parents 23/05/23 13:20:18 INFO MemoryStore: Block broadcast_19 stored as values in memory (estimated size 14.8 KB, free 399.8 GB) 23/05/23 13:20:18 INFO MemoryStore: Block broadcast_19_piece0 stored as bytes in memory (estimated size 7.9 KB, free 399.8 GB) 23/05/23 13:20:18 INFO BlockManagerInfo: Added broadcast_19_piece0 in memory on d01.capitalbiotech.local:41352 (size: 7.9 KB, free: 399.8 GB) 23/05/23 13:20:18 INFO SparkContext: Created broadcast 19 from broadcast at DAGScheduler.scala:1163 23/05/23 13:20:18 INFO DAGScheduler: Submitting 1 missing tasks from ShuffleMapStage 29 (MapPartitionsRDD[68] at mapToPair at SparkUtils.java:161) (first 15 tasks are for partitions Vector(0)) 23/05/23 13:20:18 INFO TaskSchedulerImpl: Adding task set 29.0 with 1 tasks 23/05/23 13:20:18 INFO TaskSetManager: Starting task 0.0 in stage 29.0 (TID 1972, localhost, 
executor driver, partition 0, ANY, 8159 bytes) 23/05/23 13:20:18 INFO Executor: Running task 0.0 in stage 29.0 (TID 1972) 23/05/23 13:20:18 INFO BlockManager: Found block rdd_52_0 locally 23/05/23 13:20:18 INFO BlockManager: Found block rdd_53_0 locally 23/05/23 13:20:18 INFO Executor: Finished task 0.0 in stage 29.0 (TID 1972). 752 bytes result sent to driver 23/05/23 13:20:18 INFO TaskSetManager: Finished task 0.0 in stage 29.0 (TID 1972) in 43 ms on localhost (executor driver) (1/1) 23/05/23 13:20:18 INFO TaskSchedulerImpl: Removed TaskSet 29.0, whose tasks have all completed, from pool 23/05/23 13:20:18 INFO DAGScheduler: ShuffleMapStage 29 (mapToPair at SparkUtils.java:161) finished in 0.065 s 23/05/23 13:20:18 INFO DAGScheduler: looking for newly runnable stages 23/05/23 13:20:18 INFO DAGScheduler: running: Set() 23/05/23 13:20:18 INFO DAGScheduler: waiting: Set(ResultStage 30) 23/05/23 13:20:18 INFO DAGScheduler: failed: Set() 23/05/23 13:20:18 INFO DAGScheduler: Submitting ResultStage 30 (MapPartitionsRDD[73] at mapToPair at BamSink.java:91), which has no missing parents 23/05/23 13:20:18 INFO MemoryStore: Block broadcast_20 stored as values in memory (estimated size 91.7 KB, free 399.8 GB) 23/05/23 13:20:18 INFO MemoryStore: Block broadcast_20_piece0 stored as bytes in memory (estimated size 42.1 KB, free 399.8 GB) 23/05/23 13:20:18 INFO BlockManagerInfo: Added broadcast_20_piece0 in memory on d01.capitalbiotech.local:41352 (size: 42.1 KB, free: 399.8 GB) 23/05/23 13:20:18 INFO SparkContext: Created broadcast 20 from broadcast at DAGScheduler.scala:1163 23/05/23 13:20:18 INFO DAGScheduler: Submitting 1 missing tasks from ResultStage 30 (MapPartitionsRDD[73] at mapToPair at BamSink.java:91) (first 15 tasks are for partitions Vector(0)) 23/05/23 13:20:18 INFO TaskSchedulerImpl: Adding task set 30.0 with 1 tasks 23/05/23 13:20:18 INFO TaskSetManager: Starting task 0.0 in stage 30.0 (TID 1973, localhost, executor driver, partition 0, PROCESS_LOCAL, 7662 bytes) 
23/05/23 13:20:18 INFO Executor: Running task 0.0 in stage 30.0 (TID 1973) 23/05/23 13:20:18 INFO ShuffleBlockFetcherIterator: Getting 0 non-empty blocks including 0 local blocks and 0 remote blocks 23/05/23 13:20:18 INFO ShuffleBlockFetcherIterator: Started 0 remote fetches in 0 ms 23/05/23 13:20:18 INFO FileOutputCommitter: File Output Committer Algorithm version is 2 23/05/23 13:20:18 INFO FileOutputCommitter: FileOutputCommitter skip cleanup _temporary folders under output directory:false, ignore cleanup failures: false 23/05/23 13:20:18 INFO FileOutputCommitter: File Output Committer Algorithm version is 2 23/05/23 13:20:18 INFO FileOutputCommitter: FileOutputCommitter skip cleanup _temporary folders under output directory:false, ignore cleanup failures: false 23/05/23 13:20:18 INFO FileOutputCommitter: Saved output of task 'attempt_20230523132018_0073_r_000000_0' to file:pathseq/CRC_16.pathseq.complete.bam.parts 23/05/23 13:20:18 INFO SparkHadoopMapRedUtil: attempt_20230523132018_0073_r_000000_0: Committed 23/05/23 13:20:18 INFO Executor: Finished task 0.0 in stage 30.0 (TID 1973). 1149 bytes result sent to driver 23/05/23 13:20:18 INFO TaskSetManager: Finished task 0.0 in stage 30.0 (TID 1973) in 184 ms on localhost (executor driver) (1/1) 23/05/23 13:20:18 INFO TaskSchedulerImpl: Removed TaskSet 30.0, whose tasks have all completed, from pool 23/05/23 13:20:18 INFO DAGScheduler: ResultStage 30 (runJob at SparkHadoopWriter.scala:78) finished in 0.215 s 23/05/23 13:20:18 INFO DAGScheduler: Job 6 finished: runJob at SparkHadoopWriter.scala:78, took 0.298046 s 23/05/23 13:20:19 INFO SparkHadoopWriter: Job job_20230523132018_0073 committed. 
23/05/23 13:20:19 INFO HadoopFileSystemWrapper: Concatenating 2 parts to pathseq/CRC_16.pathseq.complete.bam 23/05/23 13:20:19 INFO HadoopFileSystemWrapper: Concatenating to pathseq/CRC_16.pathseq.complete.bam done 23/05/23 13:20:19 INFO IndexFileMerger: Merging .sbi files in temp directory pathseq/CRC_16.pathseq.complete.bam.parts/ to pathseq/CRC_16.pathseq.complete.bam.sbi 23/05/23 13:20:19 INFO IndexFileMerger: Done merging .sbi files 23/05/23 13:20:19 INFO IndexFileMerger: Merging .bai files in temp directory pathseq/CRC_16.pathseq.complete.bam.parts/ to pathseq/CRC_16.pathseq.complete.bam.bai 23/05/23 13:20:19 INFO IndexFileMerger: Done merging .bai files 23/05/23 13:20:19 INFO SparkContext: Starting job: foreach at BwaMemIndexCache.java:84 23/05/23 13:20:19 INFO DAGScheduler: Got job 7 (foreach at BwaMemIndexCache.java:84) with 128 output partitions 23/05/23 13:20:19 INFO DAGScheduler: Final stage: ResultStage 31 (foreach at BwaMemIndexCache.java:84) 23/05/23 13:20:19 INFO DAGScheduler: Parents of final stage: List() 23/05/23 13:20:19 INFO DAGScheduler: Missing parents: List() 23/05/23 13:20:19 INFO DAGScheduler: Submitting ResultStage 31 (ParallelCollectionRDD[74] at parallelize at BwaMemIndexCache.java:84), which has no missing parents 23/05/23 13:20:19 INFO MemoryStore: Block broadcast_21 stored as values in memory (estimated size 2.4 KB, free 399.8 GB) 23/05/23 13:20:19 INFO MemoryStore: Block broadcast_21_piece0 stored as bytes in memory (estimated size 1555.0 B, free 399.8 GB) 23/05/23 13:20:19 INFO BlockManagerInfo: Added broadcast_21_piece0 in memory on d01.capitalbiotech.local:41352 (size: 1555.0 B, free: 399.8 GB) 23/05/23 13:20:19 INFO SparkContext: Created broadcast 21 from broadcast at DAGScheduler.scala:1163 23/05/23 13:20:19 INFO DAGScheduler: Submitting 128 missing tasks from ResultStage 31 (ParallelCollectionRDD[74] at parallelize at BwaMemIndexCache.java:84) (first 15 tasks are for partitions Vector(0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 
13, 14)) 23/05/23 13:20:19 INFO TaskSchedulerImpl: Adding task set 31.0 with 128 tasks 23/05/23 13:20:19 INFO TaskSetManager: Starting task 0.0 in stage 31.0 (TID 1974, localhost, executor driver, partition 0, PROCESS_LOCAL, 7723 bytes) ... 23/05/23 13:20:19 INFO TaskSetManager: Starting task 127.0 in stage 31.0 (TID 2101, localhost, executor driver, partition 127, PROCESS_LOCAL, 7724 bytes) 23/05/23 13:20:19 INFO Executor: Running task 0.0 in stage 31.0 (TID 1974) ... 23/05/23 13:20:19 INFO Executor: Running task 109.0 in stage 31.0 (TID 2083) 23/05/23 13:20:19 INFO Executor: Finished task 66.0 in stage 31.0 (TID 2040). 667 bytes result sent to driver 23/05/23 13:20:19 INFO Executor: Finished task 2.0 in stage 31.0 (TID 1976). 667 bytes result sent to driver 23/05/23 13:20:19 INFO TaskSetManager: Finished task 66.0 in stage 31.0 (TID 2040) in 160 ms on localhost (executor driver) (1/128) 23/05/23 13:20:19 INFO TaskSetManager: Finished task 2.0 in stage 31.0 (TID 1976) in 330 ms on localhost (executor driver) (2/128) 23/05/23 13:20:19 INFO Executor: Finished task 3.0 in stage 31.0 (TID 1977). 667 bytes result sent to driver ... 23/05/23 13:20:19 INFO TaskSetManager: Finished task 97.0 in stage 31.0 (TID 2071) in 123 ms on localhost (executor driver) (127/128) 23/05/23 13:20:19 INFO TaskSetManager: Finished task 112.0 in stage 31.0 (TID 2086) in 88 ms on localhost (executor driver) (128/128) 23/05/23 13:20:19 INFO TaskSchedulerImpl: Removed TaskSet 31.0, whose tasks have all completed, from pool 23/05/23 13:20:19 INFO DAGScheduler: ResultStage 31 (foreach at BwaMemIndexCache.java:84) finished in 0.389 s 23/05/23 13:20:19 INFO DAGScheduler: Job 7 finished: foreach at BwaMemIndexCache.java:84, took 0.392269 s 23/05/23 13:20:19 INFO SparkUI: Stopped Spark web UI at http://d01.capitalbiotech.local:4040 23/05/23 13:20:19 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped! 
23/05/23 13:20:26 INFO MemoryStore: MemoryStore cleared 23/05/23 13:20:26 INFO BlockManager: BlockManager stopped 23/05/23 13:20:26 INFO BlockManagerMaster: BlockManagerMaster stopped 23/05/23 13:20:26 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped! 23/05/23 13:20:26 INFO SparkContext: Successfully stopped SparkContext 13:20:26.099 INFO PathSeqPipelineSpark - Shutting down engine [May 23, 2023 1:20:26 PM CST] org.broadinstitute.hellbender.tools.spark.pathseq.PathSeqPipelineSpark done. Elapsed time: 1.04 minutes. Runtime.totalMemory()=156475326464 23/05/23 13:20:26 INFO ShutdownHookManager: Shutdown hook called 23/05/23 13:20:26 INFO ShutdownHookManager: Deleting directory pathseq/tmp/spark-2042a18b-a4af-4a86-a236-c4914f0407a1
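The run above finishes without any ERROR lines, but the cached blocks `rdd_52_0` and `rdd_53_0` are reported with an estimated size of 0.0 B, and the shuffle fetches shown all report 0 non-empty blocks. That is consistent with no reads reaching the microbe alignment and scoring stages, though it does not by itself identify the cause. Below is a minimal sketch for pulling those two indicators out of a saved copy of the log. The helper name `summarize_spark_log` and the `sample` string are hypothetical, and the regular expressions only assume the Spark INFO message wording seen in this log.

```python
import re

# Patterns copied from the wording of the Spark INFO lines in the log above.
EMPTY_RDD = re.compile(
    r"Block (rdd_\d+_\d+) stored as values in memory \(estimated size 0\.0 B"
)
SHUFFLE_FETCH = re.compile(r"Getting (\d+) non-empty blocks")


def summarize_spark_log(text):
    """Hypothetical helper: count signs that a stage processed no data."""
    # Unique RDD blocks that were cached with zero bytes.
    empty_rdds = sorted(set(EMPTY_RDD.findall(text)))
    # Number of non-empty blocks reported by each shuffle fetch.
    fetches = [int(n) for n in SHUFFLE_FETCH.findall(text)]
    return {
        "empty_rdd_blocks": empty_rdds,
        "shuffle_fetches": len(fetches),
        "empty_shuffle_fetches": sum(1 for n in fetches if n == 0),
    }


# Three entries taken from the log above, joined the same way they were pasted.
sample = (
    "23/05/23 13:20:18 INFO ShuffleBlockFetcherIterator: Getting 0 non-empty "
    "blocks including 0 local blocks and 0 remote blocks "
    "23/05/23 13:20:18 INFO MemoryStore: Block rdd_53_0 stored as values in "
    "memory (estimated size 0.0 B, free 399.8 GB) "
    "23/05/23 13:20:18 INFO MemoryStore: Block rdd_52_0 stored as values in "
    "memory (estimated size 0.0 B, free 399.8 GB)"
)

print(summarize_spark_log(sample))
```

To use it on a full run, read the saved log into a string and pass it to the function. A result where every shuffle fetch is empty points back at the filtering step rather than at the scoring step.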