Hi Brad,
I tried to use your base image to create my own Docker image, and while doing so I am getting the error below. Can you help?
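From the failing command in the log below, bcbio ends up passing --reference None to BaseRecalibratorSpark, so I suspect the GRCh37 reference genome isn't resolving inside my image. For context, my sample config looks roughly like this; the sample name and genome build match the log, but the files, aligner, variantcaller, and upload values here are placeholders from memory, not my exact settings:

    details:
      - analysis: variant2
        description: platinum
        files: [platinum_1.fq.gz, platinum_2.fq.gz]
        genome_build: GRCh37
        algorithm:
          aligner: bwa
          variantcaller: gatk-haplotype
    upload:
      dir: ../final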
[2019-01-20T02:57Z] Calculating coverage: platinum sv_regions
[2019-01-20T02:58Z] Calculating coverage: platinum coverage
[2019-01-20T03:00Z] samtools stats : platinum
[2019-01-20T03:02Z] samtools index stats : platinum
[2019-01-20T03:02Z] Prepare BQSR tables with GATK: platinum
[2019-01-20T03:02Z] GATK: BaseRecalibratorSpark
[2019-01-20T03:02Z] Using GATK jar /usr/local/share/bcbio-nextgen/anaconda/share/gatk4-4.0.2.1-0/gatk-package-4.0.2.1-local.jar
[2019-01-20T03:02Z] Running:
[2019-01-20T03:02Z] java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=1 -Xms500m -Xmx45864m -Djava.io.tmpdir=/mnt/scratch/d21a4131-e111-4644-9d0d-7f88561e326c/bcbiotx/tmp6L8ojf -jar /usr/local/share/bcbio-nextgen/anaconda/share/gatk4-4.0.2.1-0/gatk-package-4.0.2.1-local.jar BaseRecalibratorSpark -I /mnt/scratch/d21a4131-e111-4644-9d0d-7f88561e326c/align/platinum/platinum-sort.bam --spark-master local[16] --output /mnt/scratch/d21a4131-e111-4644-9d0d-7f88561e326c/bcbiotx/tmp6L8ojf/platinum-sort-recal.grp --reference None --conf spark.driver.host=localhost --conf spark.network.timeout=800 --conf spark.local.dir=/mnt/scratch/d21a4131-e111-4644-9d0d-7f88561e326c/bcbiotx/tmp6L8ojf --known-sites /mnt/scratch/d21a4131-e111-4644-9d0d-7f88561e326c/inputs/data/genomes/GRCh37/variation/dbsnp_138.vcf.gz -L /mnt/scratch/d21a4131-e111-4644-9d0d-7f88561e326c/bedprep/cleaned-xgen-exome-research-panel-targets_6bpexpanded.bed --interval-set-rule INTERSECTION
[2019-01-20T03:02Z] 03:02:13.807 WARN SparkContextFactory - Environment variables HELLBENDER_TEST_PROJECT and HELLBENDER_JSON_SERVICE_ACCOUNT_KEY must be set or the GCS hadoop connector will not be configured properly
[2019-01-20T03:02Z] 03:02:14.025 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/usr/local/share/bcbio-nextgen/anaconda/share/gatk4-4.0.2.1-0/gatk-package-4.0.2.1-local.jar!/com/intel/gkl/native/libgkl_compression.so
[2019-01-20T03:02Z] 03:02:14.236 INFO BaseRecalibratorSpark - ------------------------------------------------------------
[2019-01-20T03:02Z] 03:02:14.237 INFO BaseRecalibratorSpark - The Genome Analysis Toolkit (GATK) v4.0.2.1
[2019-01-20T03:02Z] 03:02:14.237 INFO BaseRecalibratorSpark - For support and documentation go to https://software.broadinstitute.org/gatk/
[2019-01-20T03:02Z] 03:02:14.237 INFO BaseRecalibratorSpark - Executing as root@c35737b5c33a on Linux v4.14.88-72.73.amzn1.x86_64 amd64
[2019-01-20T03:02Z] 03:02:14.238 INFO BaseRecalibratorSpark - Java runtime: OpenJDK 64-Bit Server VM v1.8.0_121-b15
[2019-01-20T03:02Z] 03:02:14.238 INFO BaseRecalibratorSpark - Start Date/Time: January 20, 2019 3:02:13 AM UTC
[2019-01-20T03:02Z] 03:02:14.238 INFO BaseRecalibratorSpark - ------------------------------------------------------------
[2019-01-20T03:02Z] 03:02:14.238 INFO BaseRecalibratorSpark - ------------------------------------------------------------
[2019-01-20T03:02Z] 03:02:14.238 INFO BaseRecalibratorSpark - HTSJDK Version: 2.14.3
[2019-01-20T03:02Z] 03:02:14.239 INFO BaseRecalibratorSpark - Picard Version: 2.17.2
[2019-01-20T03:02Z] 03:02:14.239 INFO BaseRecalibratorSpark - HTSJDK Defaults.COMPRESSION_LEVEL : 1
[2019-01-20T03:02Z] 03:02:14.239 INFO BaseRecalibratorSpark - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
[2019-01-20T03:02Z] 03:02:14.239 INFO BaseRecalibratorSpark - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
[2019-01-20T03:02Z] 03:02:14.239 INFO BaseRecalibratorSpark - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
[2019-01-20T03:02Z] 03:02:14.239 INFO BaseRecalibratorSpark - Deflater: IntelDeflater
[2019-01-20T03:02Z] 03:02:14.239 INFO BaseRecalibratorSpark - Inflater: IntelInflater
[2019-01-20T03:02Z] 03:02:14.239 INFO BaseRecalibratorSpark - GCS max retries/reopens: 20
[2019-01-20T03:02Z] 03:02:14.239 INFO BaseRecalibratorSpark - Using google-cloud-java patch 6d11bef1c81f885c26b2b56c8616b7a705171e4f from https://github.com/droazen/google-cloud-java/tree/dr_all_nio_fixes
[2019-01-20T03:02Z] 03:02:14.239 WARN BaseRecalibratorSpark -
[2019-01-20T03:02Z] !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
[2019-01-20T03:02Z] Warning: BaseRecalibratorSpark is a BETA tool and is not yet ready for use in production
[2019-01-20T03:02Z] !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
[2019-01-20T03:02Z] 03:02:14.239 INFO BaseRecalibratorSpark - Initializing engine
[2019-01-20T03:02Z] 03:02:14.239 INFO BaseRecalibratorSpark - Done initializing engine
[2019-01-20T03:02Z] Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
[2019-01-20T03:02Z] 19/01/20 03:02:14 INFO SparkContext: Running Spark version 2.0.2
[2019-01-20T03:02Z] 19/01/20 03:02:14 WARN SparkConf: In Spark 1.0 and later spark.local.dir will be overridden by the value set by the cluster manager (via SPARK_LOCAL_DIRS in mesos/standalone and LOCAL_DIRS in YARN).
[2019-01-20T03:02Z] 19/01/20 03:02:14 INFO SecurityManager: Changing view acls to: root
[2019-01-20T03:02Z] 19/01/20 03:02:14 INFO SecurityManager: Changing modify acls to: root
[2019-01-20T03:02Z] 19/01/20 03:02:14 INFO SecurityManager: Changing view acls groups to:
[2019-01-20T03:02Z] 19/01/20 03:02:14 INFO SecurityManager: Changing modify acls groups to:
[2019-01-20T03:02Z] 19/01/20 03:02:14 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); groups with view permissions: Set(); users with modify permissions: Set(root); groups with modify permissions: Set()
[2019-01-20T03:02Z] 19/01/20 03:02:15 INFO Utils: Successfully started service 'sparkDriver' on port 34655.
[2019-01-20T03:02Z] 19/01/20 03:02:15 INFO SparkEnv: Registering MapOutputTracker
[2019-01-20T03:02Z] 19/01/20 03:02:15 INFO SparkEnv: Registering BlockManagerMaster
[2019-01-20T03:02Z] 19/01/20 03:02:15 INFO DiskBlockManager: Created local directory at /mnt/scratch/d21a4131-e111-4644-9d0d-7f88561e326c/bcbiotx/tmp6L8ojf/blockmgr-18c9b4a8-4b85-4f12-8410-681f8fced403
[2019-01-20T03:02Z] 19/01/20 03:02:15 INFO MemoryStore: MemoryStore started with capacity 23.7 GB
[2019-01-20T03:02Z] 19/01/20 03:02:15 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
[2019-01-20T03:02Z] 19/01/20 03:02:15 INFO SparkEnv: Registering OutputCommitCoordinator
[2019-01-20T03:02Z] 19/01/20 03:02:15 INFO Utils: Successfully started service 'SparkUI' on port 4040.
[2019-01-20T03:02Z] 19/01/20 03:02:15 INFO SparkUI: Bound SparkUI to 0.0.0.0, and started at http://172.17.0.2:4040
[2019-01-20T03:02Z] 19/01/20 03:02:15 INFO Executor: Starting executor ID driver on host localhost
[2019-01-20T03:02Z] 19/01/20 03:02:15 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 41089.
[2019-01-20T03:02Z] 19/01/20 03:02:15 INFO NettyBlockTransferService: Server created on localhost:41089
[2019-01-20T03:02Z] 19/01/20 03:02:15 INFO BlockManagerMaster: Registering BlockManager BlockManagerId(driver, localhost, 41089)
[2019-01-20T03:02Z] 19/01/20 03:02:15 INFO BlockManagerMasterEndpoint: Registering block manager localhost:41089 with 23.7 GB RAM, BlockManagerId(driver, localhost, 41089)
[2019-01-20T03:02Z] 19/01/20 03:02:15 INFO BlockManagerMaster: Registered BlockManager BlockManagerId(driver, localhost, 41089)
[2019-01-20T03:02Z] 19/01/20 03:02:15 INFO SparkUI: Stopped Spark web UI at http://172.17.0.2:4040
[2019-01-20T03:02Z] 19/01/20 03:02:15 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
[2019-01-20T03:02Z] 19/01/20 03:02:15 INFO MemoryStore: MemoryStore cleared
[2019-01-20T03:02Z] 19/01/20 03:02:15 INFO BlockManager: BlockManager stopped
[2019-01-20T03:02Z] 19/01/20 03:02:15 INFO BlockManagerMaster: BlockManagerMaster stopped
[2019-01-20T03:02Z] 19/01/20 03:02:15 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
[2019-01-20T03:02Z] 19/01/20 03:02:15 INFO SparkContext: Successfully stopped SparkContext
[2019-01-20T03:02Z] 03:02:15.985 INFO BaseRecalibratorSpark - Shutting down engine
[2019-01-20T03:02Z] [January 20, 2019 3:02:15 AM UTC] org.broadinstitute.hellbender.tools.spark.BaseRecalibratorSpark done. Elapsed time: 0.03 minutes.
[2019-01-20T03:02Z] Runtime.totalMemory()=569901056
[2019-01-20T03:02Z] ***
[2019-01-20T03:02Z] A USER ERROR has occurred: Couldn't read the given reference, reference must be a .fasta or .2bit file.
[2019-01-20T03:02Z] Reference provided was: None
[2019-01-20T03:02Z] ***
[2019-01-20T03:02Z] Set the system property GATK_STACKTRACE_ON_USER_EXCEPTION (--java-options '-DGATK_STACKTRACE_ON_USER_EXCEPTION=true') to print the stack trace.
[2019-01-20T03:02Z] 19/01/20 03:02:15 INFO ShutdownHookManager: Shutdown hook called
[2019-01-20T03:02Z] 19/01/20 03:02:15 INFO ShutdownHookManager: Deleting directory /mnt/scratch/d21a4131-e111-4644-9d0d-7f88561e326c/bcbiotx/tmp6L8ojf/spark-5f80c1d9-0db5-41d8-9831-3672ad0362f3
[2019-01-20T03:02Z] Uncaught exception occurred
Traceback (most recent call last):
  File "/usr/local/share/bcbio-nextgen/anaconda/lib/python2.7/site-packages/bcbio/provenance/do.py", line 23, in run
    _do_run(cmd, checks, log_stdout, env=env)
  File "/usr/local/share/bcbio-nextgen/anaconda/lib/python2.7/site-packages/bcbio/provenance/do.py", line 103, in _do_run
CalledProcessError: Command 'set -o pipefail; export SPARK_USER=root && unset JAVA_HOME && export PATH=/usr/local/share/bcbio-nextgen/anaconda/bin:$PATH && gatk-launch --java-options '-Xms500m -Xmx45864m -Djava.io.tmpdir=/mnt/scratch/d21a4131-e111-4644-9d0d-7f88561e326c/bcbiotx/tmp6L8ojf' BaseRecalibratorSpark -I /mnt/scratch/d21a4131-e111-4644-9d0d-7f88561e326c/align/platinum/platinum-sort.bam --spark-master local[16] --output /mnt/scratch/d21a4131-e111-4644-9d0d-7f88561e326c/bcbiotx/tmp6L8ojf/platinum-sort-recal.grp --reference None --conf spark.driver.host=localhost --conf spark.network.timeout=800 --conf spark.local.dir=/mnt/scratch/d21a4131-e111-4644-9d0d-7f88561e326c/bcbiotx/tmp6L8ojf --known-sites /mnt/scratch/d21a4131-e111-4644-9d0d-7f88561e326c/inputs/data/genomes/GRCh37/variation/dbsnp_138.vcf.gz -L /mnt/scratch/d21a4131-e111-4644-9d0d-7f88561e326c/bedprep/cleaned-xgen-exome-research-panel-targets_6bpexpanded.bed --interval-set-rule INTERSECTION
Using GATK jar /usr/local/share/bcbio-nextgen/anaconda/share/gatk4-4.0.2.1-0/gatk-package-4.0.2.1-local.jar
Running:
03:02:13.807 WARN SparkContextFactory - Environment variables HELLBENDER_TEST_PROJECT and HELLBENDER_JSON_SERVICE_ACCOUNT_KEY must be set or the GCS hadoop connector will not be configured properly
03:02:14.025 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/usr/local/share/bcbio-nextgen/anaconda/share/gatk4-4.0.2.1-0/gatk-package-4.0.2.1-local.jar!/com/intel/gkl/native/libgkl_compression.so
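To double-check on my side, I also looked for the sequence files bcbio should be using for the reference. Assuming the standard bcbio layout of genomes/<build>/seq/ under the data directory (the same tree the dbsnp_138.vcf.gz path in the log points into), something like this should list a GRCh37 fasta if the genome data installed correctly:

    # inside the container, alongside the dbsnp file referenced in the log
    ls -l /mnt/scratch/d21a4131-e111-4644-9d0d-7f88561e326c/inputs/data/genomes/GRCh37/seq/

If that directory were empty or missing, I guess it would explain why the reference comes through as None, but I'm not sure where in the image build the genome data is supposed to come from.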