broadinstitute / gatk

Official code repository for GATK versions 4 and up
https://software.broadinstitute.org/gatk

PathSeq org.broadinstitute.hellbender.exceptions.UserException$CouldNotReadInputFile: Couldn't read file. #4699

Closed: SZLux closed this issue 6 years ago

SZLux commented 6 years ago

Hi, I am running PathSeqPipelineSpark on a Spark HPC cluster with a master and several workers.

I downloaded Spark 2.2.0 with Hadoop 2.7.3, Java is 1.8.0_131, and I set the Java classpath (I think correctly).

The command runs fine without the --spark-master option, so the files are in the right place. But when I run the following command line:

gatk PathSeqPipelineSpark --spark-master spark://XX.XX.XX.XX:7077 --input test_sample.bam --filter-bwa-image hg19mini.fasta.img --kmer-file hg19mini.hss --min-clipped-read-length 70 --microbe-fasta e_coli_k12.fasta --microbe-bwa-image e_coli_k12.fasta.img --taxonomy-file e_coli_k12.db --output output.pathseq.bam --verbosity DEBUG --scores-output output.pathseq.txt -- --spark-runner SPARK

I get the following error:

18/04/24 17:55:54 WARN TaskSetManager: Lost task 1.0 in stage 2.0 (TID 4, xx.xx.xx.16, executor 3): org.broadinstitute.hellbender.exceptions.UserException$CouldNotReadInputFile: Couldn't read file. Error was: hg19mini.hss with exception: hg19mini.hss (No such file or directory)
        at org.broadinstitute.hellbender.utils.gcs.BucketUtils.openFile(BucketUtils.java:112)
        at org.broadinstitute.hellbender.tools.spark.pathseq.PSKmerUtils.readKmerFilter(PSKmerUtils.java:131)
        at org.broadinstitute.hellbender.tools.spark.pathseq.ContainsKmerReadFilter.<init>(ContainsKmerReadFilter.java:27)
        at org.broadinstitute.hellbender.tools.spark.pathseq.ContainsKmerReadFilterSpark.call(ContainsKmerReadFilterSpark.java:35)
        at org.broadinstitute.hellbender.tools.spark.pathseq.ContainsKmerReadFilterSpark.call(ContainsKmerReadFilterSpark.java:15)
        at org.apache.spark.api.java.JavaRDD$$anonfun$filter$1.apply(JavaRDD.scala:78)
        at org.apache.spark.api.java.JavaRDD$$anonfun$filter$1.apply(JavaRDD.scala:78)
        at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:463)
        at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
        at org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.write(BypassMergeSortShuffleWriter.java:125)
        at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:96)
        at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:53)
        at org.apache.spark.scheduler.Task.run(Task.scala:108)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:335)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.FileNotFoundException: hg19mini.hss (No such file or directory)
        at java.io.FileInputStream.open0(Native Method)
        at java.io.FileInputStream.open(FileInputStream.java:195)
        at java.io.FileInputStream.<init>(FileInputStream.java:138)
        at java.io.FileInputStream.<init>(FileInputStream.java:93)
        at org.broadinstitute.hellbender.utils.gcs.BucketUtils.openFile(BucketUtils.java:103)
        ... 16 more
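
If it helps with diagnosis: my guess is that, when the job is distributed via --spark-master, the relative path hg19mini.hss is resolved on the worker nodes rather than in the directory I launched the command from. A quick check from one of the workers, with a hypothetical /shared/pathseq_test/ path standing in for wherever the file actually lives:

```bash
# Hypothetical check: is the k-mer file visible from a worker node at an
# absolute path? (xx.xx.xx.16 is one of the workers from the log; the
# /shared/pathseq_test/ prefix is a placeholder, not the real location.)
ssh xx.xx.xx.16 'ls -l /shared/pathseq_test/hg19mini.hss'
```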

Thank you.

Full log:

17:54:54.447 WARN  SparkContextFactory - Environment variables HELLBENDER_TEST_PROJECT and HELLBENDER_JSON_SERVICE_ACCOUNT_KEY must be set or the GCS hadoop connector will not be configured properly
17:54:54.891 INFO  NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/scratch/home/int/eva/userx/bin/gatk-4.0.3.0/gatk-package-4.0.3.0-spark.jar!/com/intel/gkl/native/libgkl_compression.so
17:54:54.924 DEBUG NativeLibraryLoader - Extracting libgkl_compression.so to /tmp/userx/libgkl_compression2910983555987484852.so
17:54:55.293 INFO  PathSeqPipelineSpark - ------------------------------------------------------------
17:54:55.294 INFO  PathSeqPipelineSpark - The Genome Analysis Toolkit (GATK) v4.0.3.0
17:54:55.294 INFO  PathSeqPipelineSpark - For support and documentation go to https://software.broadinstitute.org/gatk/
17:54:55.295 INFO  PathSeqPipelineSpark - Executing as userx@node016 on Linux v2.6.32-220.4.1.el6.x86_64 amd64
17:54:55.295 INFO  PathSeqPipelineSpark - Java runtime: Java HotSpot(TM) 64-Bit Server VM v1.8.0_131-b11
17:54:55.295 INFO  PathSeqPipelineSpark - Start Date/Time: April 24, 2018 5:54:54 PM CEST
17:54:55.295 INFO  PathSeqPipelineSpark - ------------------------------------------------------------
17:54:55.296 INFO  PathSeqPipelineSpark - ------------------------------------------------------------
17:54:55.297 INFO  PathSeqPipelineSpark - HTSJDK Version: 2.14.3
17:54:55.297 INFO  PathSeqPipelineSpark - Picard Version: 2.17.2
17:54:55.301 INFO  PathSeqPipelineSpark - HTSJDK Defaults.BUFFER_SIZE : 131072
17:54:55.301 INFO  PathSeqPipelineSpark - HTSJDK Defaults.COMPRESSION_LEVEL : 2
17:54:55.301 INFO  PathSeqPipelineSpark - HTSJDK Defaults.CREATE_INDEX : false
17:54:55.301 INFO  PathSeqPipelineSpark - HTSJDK Defaults.CREATE_MD5 : false
17:54:55.301 INFO  PathSeqPipelineSpark - HTSJDK Defaults.CUSTOM_READER_FACTORY :
17:54:55.301 INFO  PathSeqPipelineSpark - HTSJDK Defaults.DISABLE_SNAPPY_COMPRESSOR : false
17:54:55.301 INFO  PathSeqPipelineSpark - HTSJDK Defaults.EBI_REFERENCE_SERVICE_URL_MASK : https://www.ebi.ac.uk/ena/cram/md5/%s
17:54:55.301 INFO  PathSeqPipelineSpark - HTSJDK Defaults.NON_ZERO_BUFFER_SIZE : 131072
17:54:55.301 INFO  PathSeqPipelineSpark - HTSJDK Defaults.REFERENCE_FASTA : null
17:54:55.302 INFO  PathSeqPipelineSpark - HTSJDK Defaults.SAM_FLAG_FIELD_FORMAT : DECIMAL
17:54:55.302 INFO  PathSeqPipelineSpark - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
17:54:55.302 INFO  PathSeqPipelineSpark - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : false
17:54:55.302 INFO  PathSeqPipelineSpark - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
17:54:55.302 INFO  PathSeqPipelineSpark - HTSJDK Defaults.USE_CRAM_REF_DOWNLOAD : false
17:54:55.302 DEBUG ConfigFactory - Configuration file values:
17:54:55.320 DEBUG ConfigFactory -      gcsMaxRetries = 20
17:54:55.320 DEBUG ConfigFactory -      samjdk.compression_level = 2
17:54:55.320 DEBUG ConfigFactory -      spark.kryoserializer.buffer.max = 512m
17:54:55.320 DEBUG ConfigFactory -      spark.driver.maxResultSize = 0
17:54:55.320 DEBUG ConfigFactory -      spark.driver.userClassPathFirst = true
17:54:55.320 DEBUG ConfigFactory -      spark.io.compression.codec = lzf
17:54:55.320 DEBUG ConfigFactory -      spark.yarn.executor.memoryOverhead = 600
17:54:55.320 DEBUG ConfigFactory -      spark.driver.extraJavaOptions =
17:54:55.320 DEBUG ConfigFactory -      spark.executor.extraJavaOptions =
17:54:55.320 DEBUG ConfigFactory -      codec_packages = [htsjdk.variant, htsjdk.tribble, org.broadinstitute.hellbender.utils.codecs]
17:54:55.321 DEBUG ConfigFactory -      cloudPrefetchBuffer = 40
17:54:55.321 DEBUG ConfigFactory -      cloudIndexPrefetchBuffer = -1
17:54:55.321 DEBUG ConfigFactory -      createOutputBamIndex = true
17:54:55.321 DEBUG ConfigFactory -      gatk_stacktrace_on_user_exception = false
17:54:55.321 DEBUG ConfigFactory -      samjdk.use_async_io_read_samtools = false
17:54:55.321 DEBUG ConfigFactory -      samjdk.use_async_io_write_samtools = true
17:54:55.321 DEBUG ConfigFactory -      samjdk.use_async_io_write_tribble = false
17:54:55.321 INFO  PathSeqPipelineSpark - Deflater: IntelDeflater
17:54:55.321 INFO  PathSeqPipelineSpark - Inflater: IntelInflater
17:54:55.321 INFO  PathSeqPipelineSpark - GCS max retries/reopens: 20
17:54:55.321 INFO  PathSeqPipelineSpark - Using google-cloud-java patch 6d11bef1c81f885c26b2b56c8616b7a705171e4f from https://github.com/droazen/google-cloud-java/tree/dr_all_nio_fixes
17:54:55.321 INFO  PathSeqPipelineSpark - Initializing engine
17:54:55.321 INFO  PathSeqPipelineSpark - Done initializing engine
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
18/04/24 17:54:55 INFO SparkContext: Running Spark version 2.2.0
18/04/24 17:54:55 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
18/04/24 17:54:56 INFO SparkContext: Submitted application: PathSeqPipelineSpark
18/04/24 17:54:56 INFO SecurityManager: Changing view acls to: userx
18/04/24 17:54:56 INFO SecurityManager: Changing modify acls to: userx
18/04/24 17:54:56 INFO SecurityManager: Changing view acls groups to:
18/04/24 17:54:56 INFO SecurityManager: Changing modify acls groups to:
18/04/24 17:54:56 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users  with view permissions: Set(userx); groups with view permissions: Set(); users  with modify permissions: Set(userx); groups with modify permissions: Set()
18/04/24 17:54:57 INFO Utils: Successfully started service 'sparkDriver' on port 59501.
18/04/24 17:54:57 INFO SparkEnv: Registering MapOutputTracker
18/04/24 17:54:57 INFO SparkEnv: Registering BlockManagerMaster
18/04/24 17:54:57 INFO BlockManagerMasterEndpoint: Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
18/04/24 17:54:57 INFO BlockManagerMasterEndpoint: BlockManagerMasterEndpoint up
18/04/24 17:54:57 INFO DiskBlockManager: Created local directory at /tmp/userx/blockmgr-213553f6-dd2d-455d-85ef-3ed03ae12f7f
18/04/24 17:54:57 INFO MemoryStore: MemoryStore started with capacity 366.3 MB
18/04/24 17:54:57 INFO SparkEnv: Registering OutputCommitCoordinator
18/04/24 17:54:59 INFO Utils: Successfully started service 'SparkUI' on port 4040.
18/04/24 17:54:59 INFO SparkUI: Bound SparkUI to 0.0.0.0, and started at http://xx.xx.xx.16:4040
18/04/24 17:54:59 INFO SparkContext: Added JAR file:/scratch/home/int/eva/userx/bin/gatk-4.0.3.0/gatk-package-4.0.3.0-spark.jar at spark://xx.xx.xx.16:59501/jars/gatk-package-4.0.3.0-spark.jar with timestamp 1524585299210
18/04/24 17:55:01 INFO StandaloneAppClient$ClientEndpoint: Connecting to master spark://xx.xx.xx.16:7077...
18/04/24 17:55:01 INFO TransportClientFactory: Successfully created connection to /xx.xx.xx.16:7077 after 52 ms (0 ms spent in bootstraps)
18/04/24 17:55:01 INFO StandaloneSchedulerBackend: Connected to Spark cluster with app ID app-20180424175501-0004
18/04/24 17:55:01 INFO StandaloneAppClient$ClientEndpoint: Executor added: app-20180424175501-0004/0 on worker-20180424173827-xx.xx.xx.27-59970 (xx.xx.xx.27:59970) with 16 cores
18/04/24 17:55:01 INFO StandaloneSchedulerBackend: Granted executor ID app-20180424175501-0004/0 on hostPort xx.xx.xx.27:59970 with 16 cores, 1024.0 MB RAM
18/04/24 17:55:01 INFO StandaloneAppClient$ClientEndpoint: Executor added: app-20180424175501-0004/1 on worker-20180424173225-xx.xx.xx.24-35556 (xx.xx.xx.24:35556) with 16 cores
18/04/24 17:55:01 INFO StandaloneSchedulerBackend: Granted executor ID app-20180424175501-0004/1 on hostPort xx.xx.xx.24:35556 with 16 cores, 1024.0 MB RAM
18/04/24 17:55:01 INFO StandaloneAppClient$ClientEndpoint: Executor added: app-20180424175501-0004/2 on worker-20180424173815-xx.xx.xx.25-56415 (xx.xx.xx.25:56415) with 16 cores
18/04/24 17:55:01 INFO StandaloneSchedulerBackend: Granted executor ID app-20180424175501-0004/2 on hostPort xx.xx.xx.25:56415 with 16 cores, 1024.0 MB RAM
18/04/24 17:55:01 INFO StandaloneAppClient$ClientEndpoint: Executor added: app-20180424175501-0004/3 on worker-20180424173723-xx.xx.xx.16-54455 (xx.xx.xx.16:54455) with 16 cores
18/04/24 17:55:01 INFO StandaloneSchedulerBackend: Granted executor ID app-20180424175501-0004/3 on hostPort xx.xx.xx.16:54455 with 16 cores, 1024.0 MB RAM
18/04/24 17:55:01 INFO StandaloneAppClient$ClientEndpoint: Executor added: app-20180424175501-0004/4 on worker-20180424173835-xx.xx.xx.24-37847 (xx.xx.xx.24:37847) with 16 cores
18/04/24 17:55:01 INFO StandaloneSchedulerBackend: Granted executor ID app-20180424175501-0004/4 on hostPort xx.xx.xx.24:37847 with 16 cores, 1024.0 MB RAM
18/04/24 17:55:01 INFO StandaloneAppClient$ClientEndpoint: Executor added: app-20180424175501-0004/5 on worker-20180424173832-xx.xx.xx.23-49023 (xx.xx.xx.23:49023) with 16 cores
18/04/24 17:55:01 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 49734.
18/04/24 17:55:01 INFO StandaloneSchedulerBackend: Granted executor ID app-20180424175501-0004/5 on hostPort xx.xx.xx.23:49023 with 16 cores, 1024.0 MB RAM
18/04/24 17:55:01 INFO StandaloneAppClient$ClientEndpoint: Executor added: app-20180424175501-0004/6 on worker-20180424173107-xx.xx.xx.25-33478 (xx.xx.xx.25:33478) with 16 cores
18/04/24 17:55:01 INFO NettyBlockTransferService: Server created on xx.xx.xx.16:49734
18/04/24 17:55:01 INFO StandaloneSchedulerBackend: Granted executor ID app-20180424175501-0004/6 on hostPort xx.xx.xx.25:33478 with 16 cores, 1024.0 MB RAM
18/04/24 17:55:01 INFO BlockManager: Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
18/04/24 17:55:01 INFO BlockManagerMaster: Registering BlockManager BlockManagerId(driver, xx.xx.xx.16, 49734, None)
18/04/24 17:55:01 INFO StandaloneAppClient$ClientEndpoint: Executor updated: app-20180424175501-0004/1 is now RUNNING
18/04/24 17:55:01 INFO StandaloneAppClient$ClientEndpoint: Executor updated: app-20180424175501-0004/2 is now RUNNING
18/04/24 17:55:01 INFO StandaloneAppClient$ClientEndpoint: Executor updated: app-20180424175501-0004/3 is now RUNNING
18/04/24 17:55:01 INFO BlockManagerMasterEndpoint: Registering block manager xx.xx.xx.16:49734 with 366.3 MB RAM, BlockManagerId(driver, xx.xx.xx.16, 49734, None)
18/04/24 17:55:01 INFO BlockManagerMaster: Registered BlockManager BlockManagerId(driver, xx.xx.xx.16, 49734, None)
18/04/24 17:55:01 INFO BlockManager: Initialized BlockManager: BlockManagerId(driver, xx.xx.xx.16, 49734, None)
18/04/24 17:55:01 INFO StandaloneAppClient$ClientEndpoint: Executor updated: app-20180424175501-0004/4 is now RUNNING
18/04/24 17:55:03 INFO StandaloneSchedulerBackend: SchedulerBackend is ready for scheduling beginning after reached minRegisteredResourcesRatio: 0.0
18/04/24 17:55:03 INFO GoogleHadoopFileSystemBase: GHFS version: 1.6.3-hadoop2
18/04/24 17:55:04 INFO StandaloneAppClient$ClientEndpoint: Executor updated: app-20180424175501-0004/5 is now RUNNING
18/04/24 17:55:05 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 276.0 KB, free 366.0 MB)
00:10 DEBUG: [kryo] Write: SerializableConfiguration
18/04/24 17:55:05 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 23.1 KB, free 366.0 MB)
18/04/24 17:55:05 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on xx.xx.xx.16:49734 (size: 23.1 KB, free: 366.3 MB)
18/04/24 17:55:05 INFO SparkContext: Created broadcast 0 from newAPIHadoopFile at ReadsSparkSource.java:113
18/04/24 17:55:06 INFO StandaloneAppClient$ClientEndpoint: Executor updated: app-20180424175501-0004/0 is now RUNNING
18/04/24 17:55:06 INFO StandaloneAppClient$ClientEndpoint: Executor updated: app-20180424175501-0004/6 is now RUNNING
18/04/24 17:55:07 INFO CoarseGrainedSchedulerBackend$DriverEndpoint: Registered executor NettyRpcEndpointRef(spark-client://Executor) (xx.xx.xx.25:54754) with ID 2
18/04/24 17:55:07 INFO BlockManagerMasterEndpoint: Registering block manager xx.xx.xx.25:41354 with 366.3 MB RAM, BlockManagerId(2, xx.xx.xx.25, 41354, None)
18/04/24 17:55:07 INFO FileInputFormat: Total input paths to process : 1
18/04/24 17:55:07 INFO SparkContext: Starting job: first at ReadsSparkSource.java:221
18/04/24 17:55:07 INFO DAGScheduler: Got job 0 (first at ReadsSparkSource.java:221) with 1 output partitions
18/04/24 17:55:07 INFO DAGScheduler: Final stage: ResultStage 0 (first at ReadsSparkSource.java:221)
18/04/24 17:55:07 INFO DAGScheduler: Parents of final stage: List()
18/04/24 17:55:07 INFO DAGScheduler: Missing parents: List()
18/04/24 17:55:07 INFO DAGScheduler: Submitting ResultStage 0 (MapPartitionsRDD[2] at filter at ReadsSparkSource.java:123), which has no missing parents
18/04/24 17:55:07 INFO MemoryStore: Block broadcast_1 stored as values in memory (estimated size 4.4 KB, free 366.0 MB)
00:12 DEBUG: [kryo] Write: byte[]
18/04/24 17:55:07 INFO MemoryStore: Block broadcast_1_piece0 stored as bytes in memory (estimated size 2.4 KB, free 366.0 MB)
18/04/24 17:55:07 INFO BlockManagerInfo: Added broadcast_1_piece0 in memory on xx.xx.xx.16:49734 (size: 2.4 KB, free: 366.3 MB)
18/04/24 17:55:07 INFO SparkContext: Created broadcast 1 from broadcast at DAGScheduler.scala:1006
18/04/24 17:55:07 INFO DAGScheduler: Submitting 1 missing tasks from ResultStage 0 (MapPartitionsRDD[2] at filter at ReadsSparkSource.java:123) (first 15 tasks are for partitions Vector(0))
18/04/24 17:55:07 INFO TaskSchedulerImpl: Adding task set 0.0 with 1 tasks
18/04/24 17:55:07 INFO TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, xx.xx.xx.25, executor 2, partition 0, PROCESS_LOCAL, 4956 bytes)
18/04/24 17:55:10 INFO CoarseGrainedSchedulerBackend$DriverEndpoint: Registered executor NettyRpcEndpointRef(spark-client://Executor) (xx.xx.xx.23:45879) with ID 5
18/04/24 17:55:10 INFO BlockManagerMasterEndpoint: Registering block manager xx.xx.xx.23:42535 with 366.3 MB RAM, BlockManagerId(5, xx.xx.xx.23, 42535, None)
18/04/24 17:55:12 INFO CoarseGrainedSchedulerBackend$DriverEndpoint: Registered executor NettyRpcEndpointRef(spark-client://Executor) (xx.xx.xx.25:54758) with ID 6
18/04/24 17:55:12 INFO BlockManagerMasterEndpoint: Registering block manager xx.xx.xx.25:37532 with 366.3 MB RAM, BlockManagerId(6, xx.xx.xx.25, 37532, None)
18/04/24 17:55:16 INFO CoarseGrainedSchedulerBackend$DriverEndpoint: Registered executor NettyRpcEndpointRef(spark-client://Executor) (xx.xx.xx.24:55974) with ID 1
18/04/24 17:55:21 INFO CoarseGrainedSchedulerBackend$DriverEndpoint: Registered executor NettyRpcEndpointRef(spark-client://Executor) (xx.xx.xx.27:38987) with ID 0
18/04/24 17:55:23 INFO BlockManagerMasterEndpoint: Registering block manager xx.xx.xx.27:46181 with 366.3 MB RAM, BlockManagerId(0, xx.xx.xx.27, 46181, None)
18/04/24 17:55:23 INFO BlockManagerMasterEndpoint: Registering block manager xx.xx.xx.24:49966 with 366.3 MB RAM, BlockManagerId(1, xx.xx.xx.24, 49966, None)
18/04/24 17:55:25 INFO CoarseGrainedSchedulerBackend$DriverEndpoint: Registered executor NettyRpcEndpointRef(spark-client://Executor) (xx.xx.xx.16:55574) with ID 3
18/04/24 17:55:25 INFO BlockManagerMasterEndpoint: Registering block manager xx.xx.xx.16:39037 with 366.3 MB RAM, BlockManagerId(3, xx.xx.xx.16, 39037, None)
18/04/24 17:55:27 INFO BlockManagerInfo: Added broadcast_1_piece0 in memory on xx.xx.xx.25:41354 (size: 2.4 KB, free: 366.3 MB)
18/04/24 17:55:27 INFO CoarseGrainedSchedulerBackend$DriverEndpoint: Registered executor NettyRpcEndpointRef(spark-client://Executor) (xx.xx.xx.24:55977) with ID 4
18/04/24 17:55:27 INFO BlockManagerMasterEndpoint: Registering block manager xx.xx.xx.24:35903 with 366.3 MB RAM, BlockManagerId(4, xx.xx.xx.24, 35903, None)
18/04/24 17:55:29 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on xx.xx.xx.25:41354 (size: 23.1 KB, free: 366.3 MB)
00:35 DEBUG: [kryo] Read: Object[]
18/04/24 17:55:30 INFO TaskSetManager: Finished task 0.0 in stage 0.0 (TID 0) in 22599 ms on xx.xx.xx.25 (executor 2) (1/1)
18/04/24 17:55:30 INFO TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool
18/04/24 17:55:30 INFO DAGScheduler: ResultStage 0 (first at ReadsSparkSource.java:221) finished in 22.620 s
18/04/24 17:55:30 INFO DAGScheduler: Job 0 finished: first at ReadsSparkSource.java:221, took 22.826113 s
18/04/24 17:55:30 INFO SparkContext: Starting job: collect at ReadsSparkSource.java:233
18/04/24 17:55:30 INFO DAGScheduler: Got job 1 (collect at ReadsSparkSource.java:233) with 2 output partitions
18/04/24 17:55:30 INFO DAGScheduler: Final stage: ResultStage 1 (collect at ReadsSparkSource.java:233)
18/04/24 17:55:30 INFO DAGScheduler: Parents of final stage: List()
18/04/24 17:55:30 INFO DAGScheduler: Missing parents: List()
18/04/24 17:55:30 INFO DAGScheduler: Submitting ResultStage 1 (MapPartitionsRDD[3] at mapPartitions at ReadsSparkSource.java:224), which has no missing parents
18/04/24 17:55:30 INFO MemoryStore: Block broadcast_2 stored as values in memory (estimated size 4.9 KB, free 366.0 MB)
00:35 DEBUG: [kryo] Write: byte[]
18/04/24 17:55:30 INFO MemoryStore: Block broadcast_2_piece0 stored as bytes in memory (estimated size 2.6 KB, free 366.0 MB)
18/04/24 17:55:30 INFO BlockManagerInfo: Added broadcast_2_piece0 in memory on xx.xx.xx.16:49734 (size: 2.6 KB, free: 366.3 MB)
18/04/24 17:55:30 INFO SparkContext: Created broadcast 2 from broadcast at DAGScheduler.scala:1006
18/04/24 17:55:30 INFO DAGScheduler: Submitting 2 missing tasks from ResultStage 1 (MapPartitionsRDD[3] at mapPartitions at ReadsSparkSource.java:224) (first 15 tasks are for partitions Vector(0, 1))
18/04/24 17:55:30 INFO TaskSchedulerImpl: Adding task set 1.0 with 2 tasks
18/04/24 17:55:30 INFO TaskSetManager: Starting task 0.0 in stage 1.0 (TID 1, xx.xx.xx.23, executor 5, partition 0, PROCESS_LOCAL, 4956 bytes)
18/04/24 17:55:30 INFO TaskSetManager: Starting task 1.0 in stage 1.0 (TID 2, xx.xx.xx.16, executor 3, partition 1, PROCESS_LOCAL, 4956 bytes)
18/04/24 17:55:40 INFO BlockManagerInfo: Added broadcast_2_piece0 in memory on xx.xx.xx.23:42535 (size: 2.6 KB, free: 366.3 MB)
18/04/24 17:55:42 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on xx.xx.xx.23:42535 (size: 23.1 KB, free: 366.3 MB)
00:48 DEBUG: [kryo] Read: Object[]
18/04/24 17:55:43 INFO TaskSetManager: Finished task 0.0 in stage 1.0 (TID 1) in 12841 ms on xx.xx.xx.23 (executor 5) (1/2)
18/04/24 17:55:51 INFO BlockManagerInfo: Added broadcast_2_piece0 in memory on xx.xx.xx.16:39037 (size: 2.6 KB, free: 366.3 MB)
18/04/24 17:55:53 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on xx.xx.xx.16:39037 (size: 23.1 KB, free: 366.3 MB)
00:59 DEBUG: [kryo] Read: Object[]
18/04/24 17:55:53 INFO TaskSetManager: Finished task 1.0 in stage 1.0 (TID 2) in 23674 ms on xx.xx.xx.16 (executor 3) (2/2)
18/04/24 17:55:53 INFO TaskSchedulerImpl: Removed TaskSet 1.0, whose tasks have all completed, from pool
18/04/24 17:55:53 INFO DAGScheduler: ResultStage 1 (collect at ReadsSparkSource.java:233) finished in 23.677 s
18/04/24 17:55:53 INFO DAGScheduler: Job 1 finished: collect at ReadsSparkSource.java:233, took 23.708100 s
18/04/24 17:55:54 INFO SparkContext: Starting job: count at PathSeqPipelineSpark.java:245
18/04/24 17:55:54 INFO DAGScheduler: Registering RDD 19 (mapToPair at PSFilter.java:125)
18/04/24 17:55:54 INFO DAGScheduler: Registering RDD 23 (mapToPair at PSFilter.java:125)
18/04/24 17:55:54 INFO DAGScheduler: Registering RDD 27 (mapToPair at PSFilter.java:162)
18/04/24 17:55:54 INFO DAGScheduler: Registering RDD 32 (mapToPair at PSFilter.java:125)
18/04/24 17:55:54 INFO DAGScheduler: Registering RDD 37 (mapToPair at PSFilter.java:125)
18/04/24 17:55:54 INFO DAGScheduler: Got job 2 (count at PathSeqPipelineSpark.java:245) with 2 output partitions
18/04/24 17:55:54 INFO DAGScheduler: Final stage: ResultStage 7 (count at PathSeqPipelineSpark.java:245)
18/04/24 17:55:54 INFO DAGScheduler: Parents of final stage: List(ShuffleMapStage 6)
18/04/24 17:55:54 INFO DAGScheduler: Missing parents: List(ShuffleMapStage 6)
18/04/24 17:55:54 INFO DAGScheduler: Submitting ShuffleMapStage 2 (MapPartitionsRDD[19] at mapToPair at PSFilter.java:125), which has no missing parents
18/04/24 17:55:54 INFO MemoryStore: Block broadcast_3 stored as values in memory (estimated size 14.2 KB, free 366.0 MB)
00:59 DEBUG: [kryo] Write: byte[]
18/04/24 17:55:54 INFO MemoryStore: Block broadcast_3_piece0 stored as bytes in memory (estimated size 6.4 KB, free 366.0 MB)
18/04/24 17:55:54 INFO BlockManagerInfo: Added broadcast_3_piece0 in memory on xx.xx.xx.16:49734 (size: 6.4 KB, free: 366.3 MB)
18/04/24 17:55:54 INFO SparkContext: Created broadcast 3 from broadcast at DAGScheduler.scala:1006
18/04/24 17:55:54 INFO DAGScheduler: Submitting 2 missing tasks from ShuffleMapStage 2 (MapPartitionsRDD[19] at mapToPair at PSFilter.java:125) (first 15 tasks are for partitions Vector(0, 1))
18/04/24 17:55:54 INFO TaskSchedulerImpl: Adding task set 2.0 with 2 tasks
00:59 DEBUG: [kryo] Write: WrappedArray([NC_000913.3_127443_127875_0:0:0_0:0:0_a507 UNMAPPED, NC_000913.3_127443_127875_0:0:0_0:0:0_a507 UNMAPPED])
18/04/24 17:55:54 INFO TaskSetManager: Starting task 0.0 in stage 2.0 (TID 3, xx.xx.xx.25, executor 2, partition 0, PROCESS_LOCAL, 6010 bytes)
00:59 DEBUG: [kryo] Write: WrappedArray(null)
18/04/24 17:55:54 INFO TaskSetManager: Starting task 1.0 in stage 2.0 (TID 4, xx.xx.xx.16, executor 3, partition 1, PROCESS_LOCAL, 5371 bytes)
18/04/24 17:55:54 INFO BlockManagerInfo: Added broadcast_3_piece0 in memory on xx.xx.xx.16:39037 (size: 6.4 KB, free: 366.3 MB)
18/04/24 17:55:54 INFO BlockManagerInfo: Added broadcast_3_piece0 in memory on xx.xx.xx.25:41354 (size: 6.4 KB, free: 366.3 MB)
18/04/24 17:55:54 WARN TaskSetManager: Lost task 1.0 in stage 2.0 (TID 4, xx.xx.xx.16, executor 3): org.broadinstitute.hellbender.exceptions.UserException$CouldNotReadInputFile: Couldn't read file. Error was: hg19mini.hss with exception: hg19mini.hss (No such file or directory)
        at org.broadinstitute.hellbender.utils.gcs.BucketUtils.openFile(BucketUtils.java:112)
        at org.broadinstitute.hellbender.tools.spark.pathseq.PSKmerUtils.readKmerFilter(PSKmerUtils.java:131)
        at org.broadinstitute.hellbender.tools.spark.pathseq.ContainsKmerReadFilter.<init>(ContainsKmerReadFilter.java:27)
        at org.broadinstitute.hellbender.tools.spark.pathseq.ContainsKmerReadFilterSpark.call(ContainsKmerReadFilterSpark.java:35)
        at org.broadinstitute.hellbender.tools.spark.pathseq.ContainsKmerReadFilterSpark.call(ContainsKmerReadFilterSpark.java:15)
        at org.apache.spark.api.java.JavaRDD$$anonfun$filter$1.apply(JavaRDD.scala:78)
        at org.apache.spark.api.java.JavaRDD$$anonfun$filter$1.apply(JavaRDD.scala:78)
        at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:463)
        at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
        at org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.write(BypassMergeSortShuffleWriter.java:125)
        at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:96)
        at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:53)
        at org.apache.spark.scheduler.Task.run(Task.scala:108)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:335)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.FileNotFoundException: hg19mini.hss (No such file or directory)
        at java.io.FileInputStream.open0(Native Method)
        at java.io.FileInputStream.open(FileInputStream.java:195)
        at java.io.FileInputStream.<init>(FileInputStream.java:138)
        at java.io.FileInputStream.<init>(FileInputStream.java:93)
        at org.broadinstitute.hellbender.utils.gcs.BucketUtils.openFile(BucketUtils.java:103)
        ... 16 more

00:59 DEBUG: [kryo] Write: WrappedArray(null)
18/04/24 17:55:54 INFO TaskSetManager: Starting task 1.1 in stage 2.0 (TID 5, xx.xx.xx.24, executor 1, partition 1, PROCESS_LOCAL, 5371 bytes)
18/04/24 17:55:54 INFO TaskSetManager: Lost task 0.0 in stage 2.0 (TID 3) on xx.xx.xx.25, executor 2: org.broadinstitute.hellbender.exceptions.UserException$CouldNotReadInputFile (Couldn't read file. Error was: hg19mini.hss with exception: hg19mini.hss (No such file or directory)) [duplicate 1]
01:00 DEBUG: [kryo] Write: WrappedArray([NC_000913.3_127443_127875_0:0:0_0:0:0_a507 UNMAPPED, NC_000913.3_127443_127875_0:0:0_0:0:0_a507 UNMAPPED])
18/04/24 17:55:54 INFO TaskSetManager: Starting task 0.1 in stage 2.0 (TID 6, xx.xx.xx.16, executor 3, partition 0, PROCESS_LOCAL, 6010 bytes)
18/04/24 17:55:55 INFO TaskSetManager: Lost task 0.1 in stage 2.0 (TID 6) on xx.xx.xx.16, executor 3: org.broadinstitute.hellbender.exceptions.UserException$CouldNotReadInputFile (Couldn't read file. Error was: hg19mini.hss with exception: hg19mini.hss (No such file or directory)) [duplicate 2]
01:00 DEBUG: [kryo] Write: WrappedArray([NC_000913.3_127443_127875_0:0:0_0:0:0_a507 UNMAPPED, NC_000913.3_127443_127875_0:0:0_0:0:0_a507 UNMAPPED])
18/04/24 17:55:55 INFO TaskSetManager: Starting task 0.2 in stage 2.0 (TID 7, xx.xx.xx.23, executor 5, partition 0, PROCESS_LOCAL, 6010 bytes)
18/04/24 17:55:55 INFO BlockManagerInfo: Added broadcast_3_piece0 in memory on xx.xx.xx.23:42535 (size: 6.4 KB, free: 366.3 MB)
18/04/24 17:55:55 INFO TaskSetManager: Lost task 0.2 in stage 2.0 (TID 7) on xx.xx.xx.23, executor 5: org.broadinstitute.hellbender.exceptions.UserException$CouldNotReadInputFile (Couldn't read file. Error was: hg19mini.hss with exception: hg19mini.hss (No such file or directory)) [duplicate 3]
01:00 DEBUG: [kryo] Write: WrappedArray([NC_000913.3_127443_127875_0:0:0_0:0:0_a507 UNMAPPED, NC_000913.3_127443_127875_0:0:0_0:0:0_a507 UNMAPPED])
18/04/24 17:55:55 INFO TaskSetManager: Starting task 0.3 in stage 2.0 (TID 8, xx.xx.xx.24, executor 4, partition 0, PROCESS_LOCAL, 6010 bytes)
18/04/24 17:56:00 INFO BlockManagerInfo: Added broadcast_3_piece0 in memory on xx.xx.xx.24:49966 (size: 6.4 KB, free: 366.3 MB)
18/04/24 17:56:04 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on xx.xx.xx.24:49966 (size: 23.1 KB, free: 366.3 MB)
18/04/24 17:56:07 WARN TaskSetManager: Lost task 1.1 in stage 2.0 (TID 5, xx.xx.xx.24, executor 1): org.broadinstitute.hellbender.exceptions.UserException$CouldNotReadInputFile: Couldn't read file. Error was: hg19mini.hss with exception: hg19mini.hss (No such file or directory)
        at org.broadinstitute.hellbender.utils.gcs.BucketUtils.openFile(BucketUtils.java:112)
        at org.broadinstitute.hellbender.tools.spark.pathseq.PSKmerUtils.readKmerFilter(PSKmerUtils.java:131)
        at org.broadinstitute.hellbender.tools.spark.pathseq.ContainsKmerReadFilter.<init>(ContainsKmerReadFilter.java:27)
        at org.broadinstitute.hellbender.tools.spark.pathseq.ContainsKmerReadFilterSpark.call(ContainsKmerReadFilterSpark.java:35)
        at org.broadinstitute.hellbender.tools.spark.pathseq.ContainsKmerReadFilterSpark.call(ContainsKmerReadFilterSpark.java:15)
        at org.apache.spark.api.java.JavaRDD$$anonfun$filter$1.apply(JavaRDD.scala:78)
        at org.apache.spark.api.java.JavaRDD$$anonfun$filter$1.apply(JavaRDD.scala:78)
        at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:463)
        at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
        at org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.write(BypassMergeSortShuffleWriter.java:125)
        at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:96)
        at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:53)
        at org.apache.spark.scheduler.Task.run(Task.scala:108)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:335)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.FileNotFoundException: hg19mini.hss (No such file or directory)
        at java.io.FileInputStream.open0(Native Method)
        at java.io.FileInputStream.open(FileInputStream.java:195)
        at java.io.FileInputStream.<init>(FileInputStream.java:138)
        at java.io.FileInputStream.<init>(FileInputStream.java:93)
        at org.broadinstitute.hellbender.utils.gcs.BucketUtils.openFile(BucketUtils.java:103)
        ... 16 more

01:12 DEBUG: [kryo] Write: WrappedArray(null)
18/04/24 17:56:07 INFO TaskSetManager: Starting task 1.2 in stage 2.0 (TID 9, xx.xx.xx.27, executor 0, partition 1, PROCESS_LOCAL, 5371 bytes)
18/04/24 17:56:37 INFO BlockManagerInfo: Added broadcast_3_piece0 in memory on xx.xx.xx.27:46181 (size: 6.4 KB, free: 366.3 MB)
18/04/24 17:56:38 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on xx.xx.xx.27:46181 (size: 23.1 KB, free: 366.3 MB)
18/04/24 17:56:39 WARN TaskSetManager: Lost task 1.2 in stage 2.0 (TID 9, xx.xx.xx.27, executor 0): org.broadinstitute.hellbender.exceptions.UserException$CouldNotReadInputFile: Couldn't read file. Error was: hg19mini.hss with exception: hg19mini.hss (No such file or directory)
        at org.broadinstitute.hellbender.utils.gcs.BucketUtils.openFile(BucketUtils.java:112)
        at org.broadinstitute.hellbender.tools.spark.pathseq.PSKmerUtils.readKmerFilter(PSKmerUtils.java:131)
        at org.broadinstitute.hellbender.tools.spark.pathseq.ContainsKmerReadFilter.<init>(ContainsKmerReadFilter.java:27)
        at org.broadinstitute.hellbender.tools.spark.pathseq.ContainsKmerReadFilterSpark.call(ContainsKmerReadFilterSpark.java:35)
        at org.broadinstitute.hellbender.tools.spark.pathseq.ContainsKmerReadFilterSpark.call(ContainsKmerReadFilterSpark.java:15)
        at org.apache.spark.api.java.JavaRDD$$anonfun$filter$1.apply(JavaRDD.scala:78)
        at org.apache.spark.api.java.JavaRDD$$anonfun$filter$1.apply(JavaRDD.scala:78)
        at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:463)
        at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
        at org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.write(BypassMergeSortShuffleWriter.java:125)
        at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:96)
        at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:53)
        at org.apache.spark.scheduler.Task.run(Task.scala:108)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:335)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.FileNotFoundException: hg19mini.hss (No such file or directory)
        at java.io.FileInputStream.open0(Native Method)
        at java.io.FileInputStream.open(FileInputStream.java:195)
        at java.io.FileInputStream.<init>(FileInputStream.java:138)
        at java.io.FileInputStream.<init>(FileInputStream.java:93)
        at org.broadinstitute.hellbender.utils.gcs.BucketUtils.openFile(BucketUtils.java:103)
        ... 16 more

01:44 DEBUG: [kryo] Write: WrappedArray(null)
18/04/24 17:56:39 INFO TaskSetManager: Starting task 1.3 in stage 2.0 (TID 10, xx.xx.xx.16, executor 3, partition 1, PROCESS_LOCAL, 5371 bytes)
18/04/24 17:56:39 INFO BlockManagerInfo: Added broadcast_3_piece0 in memory on xx.xx.xx.24:35903 (size: 6.4 KB, free: 366.3 MB)
18/04/24 17:56:39 INFO TaskSetManager: Lost task 1.3 in stage 2.0 (TID 10) on xx.xx.xx.16, executor 3: org.broadinstitute.hellbender.exceptions.UserException$CouldNotReadInputFile (Couldn't read file. Error was: hg19mini.hss with exception: hg19mini.hss (No such file or directory)) [duplicate 1]
18/04/24 17:56:39 ERROR TaskSetManager: Task 1 in stage 2.0 failed 4 times; aborting job
18/04/24 17:56:39 INFO TaskSchedulerImpl: Cancelling stage 2
18/04/24 17:56:39 INFO TaskSchedulerImpl: Stage 2 was cancelled
18/04/24 17:56:39 INFO DAGScheduler: ShuffleMapStage 2 (mapToPair at PSFilter.java:125) failed in 45.219 s due to Job aborted due to stage failure: Task 1 in stage 2.0 failed 4 times, most recent failure: Lost task 1.3 in stage 2.0 (TID 10, xx.xx.xx.16, executor 3): org.broadinstitute.hellbender.exceptions.UserException$CouldNotReadInputFile: Couldn't read file. Error was: hg19mini.hss with exception: hg19mini.hss (No such file or directory)
        at org.broadinstitute.hellbender.utils.gcs.BucketUtils.openFile(BucketUtils.java:112)
        at org.broadinstitute.hellbender.tools.spark.pathseq.PSKmerUtils.readKmerFilter(PSKmerUtils.java:131)
        at org.broadinstitute.hellbender.tools.spark.pathseq.ContainsKmerReadFilter.<init>(ContainsKmerReadFilter.java:27)
        at org.broadinstitute.hellbender.tools.spark.pathseq.ContainsKmerReadFilterSpark.call(ContainsKmerReadFilterSpark.java:35)
        at org.broadinstitute.hellbender.tools.spark.pathseq.ContainsKmerReadFilterSpark.call(ContainsKmerReadFilterSpark.java:15)
        at org.apache.spark.api.java.JavaRDD$$anonfun$filter$1.apply(JavaRDD.scala:78)
        at org.apache.spark.api.java.JavaRDD$$anonfun$filter$1.apply(JavaRDD.scala:78)
        at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:463)
        at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
        at org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.write(BypassMergeSortShuffleWriter.java:125)
        at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:96)
        at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:53)
        at org.apache.spark.scheduler.Task.run(Task.scala:108)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:335)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.FileNotFoundException: hg19mini.hss (No such file or directory)
        at java.io.FileInputStream.open0(Native Method)
        at java.io.FileInputStream.open(FileInputStream.java:195)
        at java.io.FileInputStream.<init>(FileInputStream.java:138)
        at java.io.FileInputStream.<init>(FileInputStream.java:93)
        at org.broadinstitute.hellbender.utils.gcs.BucketUtils.openFile(BucketUtils.java:103)
        ... 16 more

Driver stacktrace:
18/04/24 17:56:39 INFO DAGScheduler: Job 2 failed: count at PathSeqPipelineSpark.java:245, took 45.308012 s
18/04/24 17:56:39 INFO SparkUI: Stopped Spark web UI at http://xx.xx.xx.16:4040
18/04/24 17:56:39 INFO StandaloneSchedulerBackend: Shutting down all executors
18/04/24 17:56:39 INFO CoarseGrainedSchedulerBackend$DriverEndpoint: Asking each executor to shut down
18/04/24 17:56:39 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
18/04/24 17:56:39 INFO MemoryStore: MemoryStore cleared
18/04/24 17:56:39 INFO BlockManager: BlockManager stopped
18/04/24 17:56:39 INFO BlockManagerMaster: BlockManagerMaster stopped
18/04/24 17:56:39 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
18/04/24 17:56:39 INFO SparkContext: Successfully stopped SparkContext
17:56:39.758 INFO  PathSeqPipelineSpark - Shutting down engine
[April 24, 2018 5:56:39 PM CEST] org.broadinstitute.hellbender.tools.spark.pathseq.PathSeqPipelineSpark done. Elapsed time: 1.75 minutes.
Runtime.totalMemory()=821559296
org.apache.spark.SparkException: Job aborted due to stage failure: Task 1 in stage 2.0 failed 4 times, most recent failure: Lost task 1.3 in stage 2.0 (TID 10, xx.xx.xx.16, executor 3): org.broadinstitute.hellbender.exceptions.UserException$CouldNotReadInputFile: Couldn't read file. Error was: hg19mini.hss with exception: hg19mini.hss (No such file or directory)
        at org.broadinstitute.hellbender.utils.gcs.BucketUtils.openFile(BucketUtils.java:112)
        at org.broadinstitute.hellbender.tools.spark.pathseq.PSKmerUtils.readKmerFilter(PSKmerUtils.java:131)
        at org.broadinstitute.hellbender.tools.spark.pathseq.ContainsKmerReadFilter.<init>(ContainsKmerReadFilter.java:27)
        at org.broadinstitute.hellbender.tools.spark.pathseq.ContainsKmerReadFilterSpark.call(ContainsKmerReadFilterSpark.java:35)
        at org.broadinstitute.hellbender.tools.spark.pathseq.ContainsKmerReadFilterSpark.call(ContainsKmerReadFilterSpark.java:15)
        at org.apache.spark.api.java.JavaRDD$$anonfun$filter$1.apply(JavaRDD.scala:78)
        at org.apache.spark.api.java.JavaRDD$$anonfun$filter$1.apply(JavaRDD.scala:78)
        at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:463)
        at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
        at org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.write(BypassMergeSortShuffleWriter.java:125)
        at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:96)
        at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:53)
        at org.apache.spark.scheduler.Task.run(Task.scala:108)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:335)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.FileNotFoundException: hg19mini.hss (No such file or directory)
        at java.io.FileInputStream.open0(Native Method)
        at java.io.FileInputStream.open(FileInputStream.java:195)
        at java.io.FileInputStream.<init>(FileInputStream.java:138)
        at java.io.FileInputStream.<init>(FileInputStream.java:93)
        at org.broadinstitute.hellbender.utils.gcs.BucketUtils.openFile(BucketUtils.java:103)
        ... 16 more

Driver stacktrace:
        at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1499)
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1487)
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1486)
        at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
        at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
        at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1486)
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:814)
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:814)
        at scala.Option.foreach(Option.scala:257)
        at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:814)
        at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1714)
        at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1669)
        at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1658)
        at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
        at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:630)
        at org.apache.spark.SparkContext.runJob(SparkContext.scala:2022)
        at org.apache.spark.SparkContext.runJob(SparkContext.scala:2043)
        at org.apache.spark.SparkContext.runJob(SparkContext.scala:2062)
        at org.apache.spark.SparkContext.runJob(SparkContext.scala:2087)
        at org.apache.spark.rdd.RDD.count(RDD.scala:1158)
        at org.apache.spark.api.java.JavaRDDLike$class.count(JavaRDDLike.scala:455)
        at org.apache.spark.api.java.AbstractJavaRDDLike.count(JavaRDDLike.scala:45)
        at org.broadinstitute.hellbender.tools.spark.pathseq.PathSeqPipelineSpark.runTool(PathSeqPipelineSpark.java:245)
        at org.broadinstitute.hellbender.engine.spark.GATKSparkTool.runPipeline(GATKSparkTool.java:387)
        at org.broadinstitute.hellbender.engine.spark.SparkCommandLineProgram.doWork(SparkCommandLineProgram.java:30)
        at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:134)
        at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:179)
        at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:198)
        at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:160)
        at org.broadinstitute.hellbender.Main.mainEntry(Main.java:203)
        at org.broadinstitute.hellbender.Main.main(Main.java:289)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:755)
        at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180)
        at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:119)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: org.broadinstitute.hellbender.exceptions.UserException$CouldNotReadInputFile: Couldn't read file. Error was: hg19mini.hss with exception: hg19mini.hss (No such file or directory)
        at org.broadinstitute.hellbender.utils.gcs.BucketUtils.openFile(BucketUtils.java:112)
        at org.broadinstitute.hellbender.tools.spark.pathseq.PSKmerUtils.readKmerFilter(PSKmerUtils.java:131)
        at org.broadinstitute.hellbender.tools.spark.pathseq.ContainsKmerReadFilter.<init>(ContainsKmerReadFilter.java:27)
        at org.broadinstitute.hellbender.tools.spark.pathseq.ContainsKmerReadFilterSpark.call(ContainsKmerReadFilterSpark.java:35)
        at org.broadinstitute.hellbender.tools.spark.pathseq.ContainsKmerReadFilterSpark.call(ContainsKmerReadFilterSpark.java:15)
        at org.apache.spark.api.java.JavaRDD$$anonfun$filter$1.apply(JavaRDD.scala:78)
        at org.apache.spark.api.java.JavaRDD$$anonfun$filter$1.apply(JavaRDD.scala:78)
        at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:463)
        at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
        at org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.write(BypassMergeSortShuffleWriter.java:125)
        at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:96)
        at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:53)
        at org.apache.spark.scheduler.Task.run(Task.scala:108)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:335)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.FileNotFoundException: hg19mini.hss (No such file or directory)
        at java.io.FileInputStream.open0(Native Method)
        at java.io.FileInputStream.open(FileInputStream.java:195)
        at java.io.FileInputStream.<init>(FileInputStream.java:138)
        at java.io.FileInputStream.<init>(FileInputStream.java:93)
        at org.broadinstitute.hellbender.utils.gcs.BucketUtils.openFile(BucketUtils.java:103)


SZLux commented 6 years ago

Thanks to a suggestion from mwalker174, I solved the issue: I had to specify the full path from the root for each input and output file. All the HPC nodes share the same HDFS, so it worked.
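
For reference, a minimal sketch of the corrected invocation. The /shared/pathseq_test/ prefix below is a hypothetical placeholder; the point is simply that every input and output file is given as an absolute path that resolves identically on the driver and on every worker:

```bash
# Sketch: same arguments as before, but with absolute paths on the shared
# filesystem (the /shared/pathseq_test/ prefix is a placeholder).
gatk PathSeqPipelineSpark \
    --spark-master spark://XX.XX.XX.XX:7077 \
    --input /shared/pathseq_test/test_sample.bam \
    --filter-bwa-image /shared/pathseq_test/hg19mini.fasta.img \
    --kmer-file /shared/pathseq_test/hg19mini.hss \
    --min-clipped-read-length 70 \
    --microbe-fasta /shared/pathseq_test/e_coli_k12.fasta \
    --microbe-bwa-image /shared/pathseq_test/e_coli_k12.fasta.img \
    --taxonomy-file /shared/pathseq_test/e_coli_k12.db \
    --output /shared/pathseq_test/output.pathseq.bam \
    --scores-output /shared/pathseq_test/output.pathseq.txt \
    --verbosity DEBUG \
    -- \
    --spark-runner SPARK
```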

What leaves me a bit perplexed is that the task took 5 minutes to run on a single 16-core node, and exactly the same on a master-worker setup with 5 workers of 16 cores each (the log confirms that the tasks were distributed to the workers' IP addresses). Is this expected? Should I try a larger input to see the difference in performance? Thank you!

mwalker174 commented 6 years ago

Thanks for posting this in a new ticket @SZLux.

I'm not surprised that 1 vs 5 nodes did not make a difference on the tutorial files. The 5 minutes is likely overhead with setting up the Spark cluster (though that sounds a bit long). I would expect you to see better scaling with larger microbe/host references and samples.
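
One thing that may be worth checking before benchmarking with a larger input: the log above shows each executor was granted only 1024 MB of RAM (about 366 MB of storage memory), which is the Spark default. If the nodes have memory to spare, spark-submit resource options can be appended after the `--` separator. A rough sketch with placeholder values, not a tuning recommendation:

```bash
# Sketch: placeholder executor resources passed through to spark-submit
# after the "--" separator (values are illustrative only; paths as in the
# hypothetical /shared/pathseq_test/ example above).
gatk PathSeqPipelineSpark \
    --spark-master spark://XX.XX.XX.XX:7077 \
    --input /shared/pathseq_test/test_sample.bam \
    --filter-bwa-image /shared/pathseq_test/hg19mini.fasta.img \
    --kmer-file /shared/pathseq_test/hg19mini.hss \
    --min-clipped-read-length 70 \
    --microbe-fasta /shared/pathseq_test/e_coli_k12.fasta \
    --microbe-bwa-image /shared/pathseq_test/e_coli_k12.fasta.img \
    --taxonomy-file /shared/pathseq_test/e_coli_k12.db \
    --output /shared/pathseq_test/output.pathseq.bam \
    --scores-output /shared/pathseq_test/output.pathseq.txt \
    -- \
    --spark-runner SPARK \
    --executor-memory 4g \
    --total-executor-cores 32
```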