broadinstitute / gatk

Official code repository for GATK versions 4 and up
https://software.broadinstitute.org/gatk

Request created from: Issue when running BaseRecalibrator #8005

Open GATKSupportTeam opened 2 years ago

GATKSupportTeam commented 2 years ago

This request was created from a contribution made by Duo Xie on August 20, 2022 16:16 UTC.

Link: https://gatk.broadinstitute.org/hc/en-us/community/posts/8235601014427-Issue-when-running-BaseRecalibrator

--

REQUIRED for all errors and issues:

a) GATK version used: v4.2.6.1

b) Exact command used: see below

c) Entire program log: see below

How can I assign a temp directory without triggering this bug?

I always get an error when I assign the temp directory:

/data/xieduo/WES_pipe/pipeline/bin/gatk-4.2.6.1/gatk --java-options "-Xmx8G -Djava.io.tmpdir=/data/xieduo/Immun_genomics/data/Łuksza_2022_Nature/shell/temp" BaseRecalibrator -R /data/reference/gatk_resource/Homo_sapiens_assembly38.fasta -I /data/xieduo/Immun_genomics/data/Łuksza_2022_Nature/bam/PAAD11N.rmdup.bam --known-sites /data/xieduo/WES_pipe/pipeline/gatk_resource/dbsnp_146.hg38.vcf.gz --known-sites /data/reference/gatk_resource/1000G_phase1.snps.high_confidence.hg38.vcf.gz --known-sites /data/reference/gatk_resource/Mills_and_1000G_gold_standard.indels.hg38.vcf.gz  -O /data/xieduo/Immun_genomics/data/Łuksza_2022_Nature/bam/PAAD11N.recal_data.test.table

Using GATK jar /data/xieduo/WES_pipe/pipeline/bin/gatk-4.2.6.1/gatk-package-4.2.6.1-local.jar

Running:

    java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -Xmx8G -Djava.io.tmpdir=/data/xieduo/Immun_genomics/data/Łuksza_2022_Nature/shell/temp -jar /data/xieduo/WES_pipe/pipeline/bin/gatk-4.2.6.1/gatk-package-4.2.6.1-local.jar BaseRecalibrator -R /data/reference/gatk_resource/Homo_sapiens_assembly38.fasta -I /data/xieduo/Immun_genomics/data/Łuksza_2022_Nature/bam/PAAD11N.rmdup.bam --known-sites /data/xieduo/WES_pipe/pipeline/gatk_resource/dbsnp_146.hg38.vcf.gz --known-sites /data/reference/gatk_resource/1000G_phase1.snps.high_confidence.hg38.vcf.gz --known-sites /data/reference/gatk_resource/Mills_and_1000G_gold_standard.indels.hg38.vcf.gz -O /data/xieduo/Immun_genomics/data/Łuksza_2022_Nature/bam/PAAD11N.recal_data.test.table

00:09:41.541 INFO  NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/data/xieduo/WES_pipe/pipeline/bin/gatk-4.2.6.1/gatk-package-4.2.6.1-local.jar!/com/intel/gkl/native/libgkl_compression.so

00:09:41.554 WARN  NativeLibraryLoader - Unable to load libgkl_compression.so from native/libgkl_compression.so (No such file or directory)

00:09:41.557 INFO  NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/data/xieduo/WES_pipe/pipeline/bin/gatk-4.2.6.1/gatk-package-4.2.6.1-local.jar!/com/intel/gkl/native/libgkl_compression.so

00:09:41.558 WARN  NativeLibraryLoader - Unable to load libgkl_compression.so from native/libgkl_compression.so (No such file or directory)

00:09:41.678 INFO  BaseRecalibrator - ------------------------------------------------------------

00:09:41.679 INFO  BaseRecalibrator - The Genome Analysis Toolkit (GATK) v4.2.6.1

00:09:41.679 INFO  BaseRecalibrator - For support and documentation go to https://software.broadinstitute.org/gatk/

00:09:41.679 INFO  BaseRecalibrator - Executing as xieduo@pbs-master on Linux v3.10.0-1160.41.1.el7.x86_64 amd64

00:09:41.679 INFO  BaseRecalibrator - Java runtime: Java HotSpot(TM) 64-Bit Server VM v18+36-2087

00:09:41.680 INFO  BaseRecalibrator - Start Date/Time: August 21, 2022 at 12:09:41 AM CST

00:09:41.680 INFO  BaseRecalibrator - ------------------------------------------------------------

00:09:41.680 INFO  BaseRecalibrator - ------------------------------------------------------------

00:09:41.681 INFO  BaseRecalibrator - HTSJDK Version: 2.24.1

00:09:41.681 INFO  BaseRecalibrator - Picard Version: 2.27.1

00:09:41.681 INFO  BaseRecalibrator - Built for Spark Version: 2.4.5

00:09:41.681 INFO  BaseRecalibrator - HTSJDK Defaults.COMPRESSION_LEVEL : 2

00:09:41.681 INFO  BaseRecalibrator - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false

00:09:41.681 INFO  BaseRecalibrator - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true

00:09:41.681 INFO  BaseRecalibrator - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false

00:09:41.682 INFO  BaseRecalibrator - Deflater: JdkDeflater

00:09:41.682 INFO  BaseRecalibrator - Inflater: JdkInflater

00:09:41.682 INFO  BaseRecalibrator - GCS max retries/reopens: 20

00:09:41.682 INFO  BaseRecalibrator - Requester pays: disabled

00:09:41.682 INFO  BaseRecalibrator - Initializing engine

00:09:41.884 WARN  IntelInflaterFactory - IntelInflater is not supported, using Java.util.zip.Inflater

00:09:41.888 WARN  IntelInflaterFactory - IntelInflater is not supported, using Java.util.zip.Inflater

00:09:42.030 WARN  IntelInflaterFactory - IntelInflater is not supported, using Java.util.zip.Inflater

00:09:42.036 INFO  BaseRecalibrator - Shutting down engine

[August 21, 2022 at 12:09:42 AM CST] org.broadinstitute.hellbender.tools.walkers.bqsr.BaseRecalibrator done. Elapsed time: 0.01 minutes.

Runtime.totalMemory()=1140850688

org.broadinstitute.hellbender.exceptions.GATKException: Unable to automatically instantiate codec org.broadinstitute.hellbender.utils.codecs.AnnotatedIntervalCodec

    at org.broadinstitute.hellbender.engine.FeatureManager.getCandidateCodecsForFile(FeatureManager.java:535)

    at org.broadinstitute.hellbender.engine.FeatureManager.getCodecForFile(FeatureManager.java:482)

    at org.broadinstitute.hellbender.engine.FeatureDataSource.getCodecForFeatureInput(FeatureDataSource.java:397)

    at org.broadinstitute.hellbender.engine.FeatureDataSource.getFeatureReader(FeatureDataSource.java:373)

    at org.broadinstitute.hellbender.engine.FeatureDataSource.<init>(FeatureDataSource.java:319)

    at org.broadinstitute.hellbender.engine.FeatureDataSource.<init>(FeatureDataSource.java:291)

    at org.broadinstitute.hellbender.engine.FeatureManager.addToFeatureSources(FeatureManager.java:245)

    at org.broadinstitute.hellbender.engine.FeatureManager.initializeFeatureSources(FeatureManager.java:208)

    at org.broadinstitute.hellbender.engine.FeatureManager.<init>(FeatureManager.java:155)

    at org.broadinstitute.hellbender.engine.ReadWalker.initializeFeatures(ReadWalker.java:72)

    at org.broadinstitute.hellbender.engine.GATKTool.onStartup(GATKTool.java:726)

    at org.broadinstitute.hellbender.engine.ReadWalker.onStartup(ReadWalker.java:51)

    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:138)

    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:192)

    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:211)

    at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:160)

    at org.broadinstitute.hellbender.Main.mainEntry(Main.java:203)

    at org.broadinstitute.hellbender.Main.main(Main.java:289)

I get the same error when I assign the temp directory another way, using --tmp-dir:

/data/xieduo/WES_pipe/pipeline/bin/gatk-4.2.6.1/gatk --java-options "-Xmx30G" BaseRecalibrator -R /data/reference/gatk_resource/Homo_sapiens_assembly38.fasta -I /data/xieduo/Immun_genomics/data/Łuksza_2022_Nature/bam/PAAD11N.rmdup.bam --known-sites /data/xieduo/WES_pipe/pipeline/gatk_resource/dbsnp_146.hg38.vcf.gz --known-sites /data/reference/gatk_resource/1000G_phase1.snps.high_confidence.hg38.vcf.gz --known-sites /data/reference/gatk_resource/Mills_and_1000G_gold_standard.indels.hg38.vcf.gz  -O /data/xieduo/Immun_genomics/data/Łuksza_2022_Nature/bam/PAAD11N.recal_data.table --tmp-dir /data/xieduo/Immun_genomics/data/Łuksza_2022_Nature/bam

Using GATK jar /data/xieduo/WES_pipe/pipeline/bin/gatk-4.2.6.1/gatk-package-4.2.6.1-local.jar

Running:

    java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -Xmx30G -jar /data/xieduo/WES_pipe/pipeline/bin/gatk-4.2.6.1/gatk-package-4.2.6.1-local.jar BaseRecalibrator -R /data/reference/gatk_resource/Homo_sapiens_assembly38.fasta -I /data/xieduo/Immun_genomics/data/Łuksza_2022_Nature/bam/PAAD11N.rmdup.bam --known-sites /data/xieduo/WES_pipe/pipeline/gatk_resource/dbsnp_146.hg38.vcf.gz --known-sites /data/reference/gatk_resource/1000G_phase1.snps.high_confidence.hg38.vcf.gz --known-sites /data/reference/gatk_resource/Mills_and_1000G_gold_standard.indels.hg38.vcf.gz -O /data/xieduo/Immun_genomics/data/Łuksza_2022_Nature/bam/PAAD11N.recal_data.table --tmp-dir /data/xieduo/Immun_genomics/data/Łuksza_2022_Nature/bam

00:11:11.683 INFO  NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/data/xieduo/WES_pipe/pipeline/bin/gatk-4.2.6.1/gatk-package-4.2.6.1-local.jar!/com/intel/gkl/native/libgkl_compression.so

00:11:11.697 WARN  NativeLibraryLoader - Unable to load libgkl_compression.so from native/libgkl_compression.so (No such file or directory)

00:11:11.700 INFO  NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/data/xieduo/WES_pipe/pipeline/bin/gatk-4.2.6.1/gatk-package-4.2.6.1-local.jar!/com/intel/gkl/native/libgkl_compression.so

00:11:11.700 WARN  NativeLibraryLoader - Unable to load libgkl_compression.so from native/libgkl_compression.so (No such file or directory)

00:11:11.812 INFO  BaseRecalibrator - ------------------------------------------------------------

00:11:11.813 INFO  BaseRecalibrator - The Genome Analysis Toolkit (GATK) v4.2.6.1

00:11:11.813 INFO  BaseRecalibrator - For support and documentation go to https://software.broadinstitute.org/gatk/

00:11:11.813 INFO  BaseRecalibrator - Executing as xieduo@pbs-master on Linux v3.10.0-1160.41.1.el7.x86_64 amd64

00:11:11.813 INFO  BaseRecalibrator - Java runtime: Java HotSpot(TM) 64-Bit Server VM v18+36-2087

00:11:11.813 INFO  BaseRecalibrator - Start Date/Time: August 21, 2022 at 12:11:11 AM CST

00:11:11.813 INFO  BaseRecalibrator - ------------------------------------------------------------

00:11:11.813 INFO  BaseRecalibrator - ------------------------------------------------------------

00:11:11.814 INFO  BaseRecalibrator - HTSJDK Version: 2.24.1

00:11:11.814 INFO  BaseRecalibrator - Picard Version: 2.27.1

00:11:11.814 INFO  BaseRecalibrator - Built for Spark Version: 2.4.5

00:11:11.814 INFO  BaseRecalibrator - HTSJDK Defaults.COMPRESSION_LEVEL : 2

00:11:11.814 INFO  BaseRecalibrator - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false

00:11:11.814 INFO  BaseRecalibrator - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true

00:11:11.814 INFO  BaseRecalibrator - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false

00:11:11.814 INFO  BaseRecalibrator - Deflater: JdkDeflater

00:11:11.815 INFO  BaseRecalibrator - Inflater: JdkInflater

00:11:11.815 INFO  BaseRecalibrator - GCS max retries/reopens: 20

00:11:11.815 INFO  BaseRecalibrator - Requester pays: disabled

00:11:11.815 INFO  BaseRecalibrator - Initializing engine

00:11:12.005 WARN  IntelInflaterFactory - IntelInflater is not supported, using Java.util.zip.Inflater

00:11:12.009 WARN  IntelInflaterFactory - IntelInflater is not supported, using Java.util.zip.Inflater

00:11:12.127 WARN  IntelInflaterFactory - IntelInflater is not supported, using Java.util.zip.Inflater

00:11:12.134 INFO  BaseRecalibrator - Shutting down engine

[August 21, 2022 at 12:11:12 AM CST] org.broadinstitute.hellbender.tools.walkers.bqsr.BaseRecalibrator done. Elapsed time: 0.01 minutes.

Runtime.totalMemory()=285212672

org.broadinstitute.hellbender.exceptions.GATKException: Unable to automatically instantiate codec org.broadinstitute.hellbender.utils.codecs.AnnotatedIntervalCodec

    at org.broadinstitute.hellbender.engine.FeatureManager.getCandidateCodecsForFile(FeatureManager.java:535)

    at org.broadinstitute.hellbender.engine.FeatureManager.getCodecForFile(FeatureManager.java:482)

    at org.broadinstitute.hellbender.engine.FeatureDataSource.getCodecForFeatureInput(FeatureDataSource.java:397)

    at org.broadinstitute.hellbender.engine.FeatureDataSource.getFeatureReader(FeatureDataSource.java:373)

    at org.broadinstitute.hellbender.engine.FeatureDataSource.<init>(FeatureDataSource.java:319)

    at org.broadinstitute.hellbender.engine.FeatureDataSource.<init>(FeatureDataSource.java:291)

    at org.broadinstitute.hellbender.engine.FeatureManager.addToFeatureSources(FeatureManager.java:245)

    at org.broadinstitute.hellbender.engine.FeatureManager.initializeFeatureSources(FeatureManager.java:208)

    at org.broadinstitute.hellbender.engine.FeatureManager.<init>(FeatureManager.java:155)

    at org.broadinstitute.hellbender.engine.ReadWalker.initializeFeatures(ReadWalker.java:72)

    at org.broadinstitute.hellbender.engine.GATKTool.onStartup(GATKTool.java:726)

    at org.broadinstitute.hellbender.engine.ReadWalker.onStartup(ReadWalker.java:51)

    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:138)

    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:192)

    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:211)

    at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:160)

    at org.broadinstitute.hellbender.Main.mainEntry(Main.java:203)

    at org.broadinstitute.hellbender.Main.main(Main.java:289)

However, the error does not occur when I don't assign a temp directory:

/data/xieduo/WES_pipe/pipeline/bin/gatk-4.2.6.1/gatk --java-options "-Xmx30G" BaseRecalibrator -R /data/reference/gatk_resource/Homo_sapiens_assembly38.fasta -I /data/xieduo/Immun_genomics/data/Łuksza_2022_Nature/bam/PAAD11N.rmdup.bam --known-sites /data/xieduo/WES_pipe/pipeline/gatk_resource/dbsnp_146.hg38.vcf.gz --known-sites /data/reference/gatk_resource/1000G_phase1.snps.high_confidence.hg38.vcf.gz --known-sites /data/reference/gatk_resource/Mills_and_1000G_gold_standard.indels.hg38.vcf.gz  -O /data/xieduo/Immun_genomics/data/Łuksza_2022_Nature/bam/PAAD11N.recal_data.test.table

Using GATK jar /data/xieduo/WES_pipe/pipeline/bin/gatk-4.2.6.1/gatk-package-4.2.6.1-local.jar

Running:

    java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -Xmx30G -jar /data/xieduo/WES_pipe/pipeline/bin/gatk-4.2.6.1/gatk-package-4.2.6.1-local.jar BaseRecalibrator -R /data/reference/gatk_resource/Homo_sapiens_assembly38.fasta -I /data/xieduo/Immun_genomics/data/Łuksza_2022_Nature/bam/PAAD11N.rmdup.bam --known-sites /data/xieduo/WES_pipe/pipeline/gatk_resource/dbsnp_146.hg38.vcf.gz --known-sites /data/reference/gatk_resource/1000G_phase1.snps.high_confidence.hg38.vcf.gz --known-sites /data/reference/gatk_resource/Mills_and_1000G_gold_standard.indels.hg38.vcf.gz -O /data/xieduo/Immun_genomics/data/Łuksza_2022_Nature/bam/PAAD11N.recal_data.test.table

00:12:20.992 INFO  NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/data/xieduo/WES_pipe/pipeline/bin/gatk-4.2.6.1/gatk-package-4.2.6.1-local.jar!/com/intel/gkl/native/libgkl_compression.so

00:12:21.140 INFO  BaseRecalibrator - ------------------------------------------------------------

00:12:21.141 INFO  BaseRecalibrator - The Genome Analysis Toolkit (GATK) v4.2.6.1

00:12:21.141 INFO  BaseRecalibrator - For support and documentation go to https://software.broadinstitute.org/gatk/

00:12:21.141 INFO  BaseRecalibrator - Executing as xieduo@pbs-master on Linux v3.10.0-1160.41.1.el7.x86_64 amd64

00:12:21.141 INFO  BaseRecalibrator - Java runtime: Java HotSpot(TM) 64-Bit Server VM v18+36-2087

00:12:21.142 INFO  BaseRecalibrator - Start Date/Time: August 21, 2022 at 12:12:20 AM CST

00:12:21.142 INFO  BaseRecalibrator - ------------------------------------------------------------

00:12:21.142 INFO  BaseRecalibrator - ------------------------------------------------------------

00:12:21.142 INFO  BaseRecalibrator - HTSJDK Version: 2.24.1

00:12:21.143 INFO  BaseRecalibrator - Picard Version: 2.27.1

00:12:21.143 INFO  BaseRecalibrator - Built for Spark Version: 2.4.5

00:12:21.143 INFO  BaseRecalibrator - HTSJDK Defaults.COMPRESSION_LEVEL : 2

00:12:21.143 INFO  BaseRecalibrator - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false

00:12:21.143 INFO  BaseRecalibrator - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true

00:12:21.143 INFO  BaseRecalibrator - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false

00:12:21.143 INFO  BaseRecalibrator - Deflater: IntelDeflater

00:12:21.144 INFO  BaseRecalibrator - Inflater: IntelInflater

00:12:21.144 INFO  BaseRecalibrator - GCS max retries/reopens: 20

00:12:21.144 INFO  BaseRecalibrator - Requester pays: disabled

00:12:21.144 INFO  BaseRecalibrator - Initializing engine

00:12:21.485 INFO  FeatureManager - Using codec VCFCodec to read file file:///data/xieduo/WES_pipe/pipeline/gatk_resource/dbsnp_146.hg38.vcf.gz

00:12:21.565 INFO  FeatureManager - Using codec VCFCodec to read file file:///data/reference/gatk_resource/1000G_phase1.snps.high_confidence.hg38.vcf.gz

00:12:21.688 INFO  FeatureManager - Using codec VCFCodec to read file file:///data/reference/gatk_resource/Mills_and_1000G_gold_standard.indels.hg38.vcf.gz

00:12:21.797 WARN  IndexUtils - Feature file "file:///data/xieduo/WES_pipe/pipeline/gatk_resource/dbsnp_146.hg38.vcf.gz" appears to contain no sequence dictionary. Attempting to retrieve a sequence dictionary from the associated index file

00:12:21.895 WARN  IntelInflater - Zero Bytes Written : 0

00:12:21.966 INFO  BaseRecalibrator - Done initializing engine

00:12:21.969 INFO  BaseRecalibrationEngine - The covariates being used here:

00:12:21.969 INFO  BaseRecalibrationEngine -     ReadGroupCovariate

00:12:21.969 INFO  BaseRecalibrationEngine -     QualityScoreCovariate

00:12:21.969 INFO  BaseRecalibrationEngine -     ContextCovariate

00:12:21.969 INFO  BaseRecalibrationEngine -     CycleCovariate

00:12:22.016 INFO  ProgressMeter - Starting traversal

00:12:22.017 INFO  ProgressMeter -        Current Locus  Elapsed Minutes       Reads Processed     Reads/Minute

How can I assign a temp directory without triggering this bug?

I set up the GATK environment using conda:

/data/xieduo/WES_pipe/pipeline/bin/Miniconda3/bin/conda env create -n gatk_4.2.6.1 -f gatkcondaenv.yml

Thank you!

Best,

Duo

(created from Zendesk ticket #293634)
gz#293634

lbergelson commented 2 years ago

@AJDCiarla We suspect the issue might be related to the non-ASCII character in the tmp file path (the Ł in Łuksza_2022_Nature).

Could you ask them to try a temp folder and output folders whose paths contain only ASCII characters? We've run into various problems with Java mishandling non-ASCII characters before, so it seems like a good thing to try.
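
As a rough sketch of that check (not something GATK provides; the function names `has_non_ascii` and `pick_tmp_dir` and the fallback path `/tmp/gatk_tmp` are made up for illustration), a small bash wrapper could test the temp path for non-ASCII bytes before launching the tool:

```shell
#!/usr/bin/env bash

# Succeeds (exit 0) if the given path contains any byte outside printable ASCII.
has_non_ascii() {
    # LC_ALL=C makes grep work on raw bytes, so a multibyte character such
    # as "Ł" appears as bytes outside the range space..tilde.
    printf '%s' "$1" | LC_ALL=C grep -q '[^ -~]'
}

# Prints the candidate temp dir if it is ASCII-only, otherwise the fallback.
pick_tmp_dir() {
    local candidate=$1 fallback=$2
    if has_non_ascii "$candidate"; then
        printf '%s\n' "$fallback"
    else
        printf '%s\n' "$candidate"
    fi
}

# Example: the temp path from the report versus a hypothetical ASCII-only fallback.
tmp=$(pick_tmp_dir "/data/xieduo/Immun_genomics/data/Łuksza_2022_Nature/shell/temp" "/tmp/gatk_tmp")
echo "temp dir to use: $tmp"

# The chosen directory would then be passed the same way as in the report, e.g.
#   gatk --java-options "-Xmx8G -Djava.io.tmpdir=$tmp" BaseRecalibrator ...
```

This only detects the suspected trigger and swaps in another directory; it does not confirm that the non-ASCII path is the actual cause.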

xieduo7 commented 2 years ago

Hi @lbergelson and @AJDCiarla,

Thank you for your work!

I am the one who reported this bug, and I tried what @lbergelson suggested. I ran tests in three different scenarios:

  1. Using a full path without any non-ASCII characters as the tmp path succeeded:

    /data/xieduo/WES_pipe/pipeline/bin/gatk-4.2.6.1/gatk --java-options "-Xmx8G -Djava.io.tmpdir=/data/xieduo/gatktest" BaseRecalibrator -R /data/reference/gatk_resource/Homo_sapiens_assembly38.fasta -I /data/xieduo/Immun_genomics/data/Łuksza_2022_Nature/bam/PAAD11N.bam --known-sites /data/xieduo/WES_pipe/pipeline/gatk_resource/dbsnp_146.hg38.vcf.gz --known-sites /data/reference/gatk_resource/1000G_phase1.snps.high_confidence.hg38.vcf.gz --known-sites /data/reference/gatk_resource/Mills_and_1000G_gold_standard.indels.hg38.vcf.gz  -O  PAAD11N.recal_data.test.table
    Using GATK jar /data/xieduo/WES_pipe/pipeline/bin/gatk-4.2.6.1/gatk-package-4.2.6.1-local.jar
    Running:
    java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -Xmx8G -Djava.io.tmpdir=/data/xieduo/gatktest -jar /data/xieduo/WES_pipe/pipeline/bin/gatk-4.2.6.1/gatk-package-4.2.6.1-local.jar BaseRecalibrator -R /data/reference/gatk_resource/Homo_sapiens_assembly38.fasta -I /data/xieduo/Immun_genomics/data/Łuksza_2022_Nature/bam/PAAD11N.bam --known-sites /data/xieduo/WES_pipe/pipeline/gatk_resource/dbsnp_146.hg38.vcf.gz --known-sites /data/reference/gatk_resource/1000G_phase1.snps.high_confidence.hg38.vcf.gz --known-sites /data/reference/gatk_resource/Mills_and_1000G_gold_standard.indels.hg38.vcf.gz -O PAAD11N.recal_data.test.table
    13:35:32.710 INFO  NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/data/xieduo/WES_pipe/pipeline/bin/gatk-4.2.6.1/gatk-package-4.2.6.1-local.jar!/com/intel/gkl/native/libgkl_compression.so
    13:35:32.890 INFO  BaseRecalibrator - ------------------------------------------------------------
    13:35:32.891 INFO  BaseRecalibrator - The Genome Analysis Toolkit (GATK) v4.2.6.1
    13:35:32.891 INFO  BaseRecalibrator - For support and documentation go to https://software.broadinstitute.org/gatk/
    13:35:32.891 INFO  BaseRecalibrator - Executing as xieduo@pbs-master on Linux v3.10.0-1160.41.1.el7.x86_64 amd64
    13:35:32.891 INFO  BaseRecalibrator - Java runtime: Java HotSpot(TM) 64-Bit Server VM v18+36-2087
    13:35:32.891 INFO  BaseRecalibrator - Start Date/Time: September 22, 2022 at 1:35:32 PM CST
    13:35:32.891 INFO  BaseRecalibrator - ------------------------------------------------------------
    13:35:32.892 INFO  BaseRecalibrator - ------------------------------------------------------------
    13:35:32.892 INFO  BaseRecalibrator - HTSJDK Version: 2.24.1
    13:35:32.892 INFO  BaseRecalibrator - Picard Version: 2.27.1
    13:35:32.893 INFO  BaseRecalibrator - Built for Spark Version: 2.4.5
    13:35:32.893 INFO  BaseRecalibrator - HTSJDK Defaults.COMPRESSION_LEVEL : 2
    13:35:32.893 INFO  BaseRecalibrator - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
    13:35:32.893 INFO  BaseRecalibrator - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
    13:35:32.893 INFO  BaseRecalibrator - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
    13:35:32.893 INFO  BaseRecalibrator - Deflater: IntelDeflater
    13:35:32.893 INFO  BaseRecalibrator - Inflater: IntelInflater
    13:35:32.894 INFO  BaseRecalibrator - GCS max retries/reopens: 20
    13:35:32.894 INFO  BaseRecalibrator - Requester pays: disabled
    13:35:32.894 INFO  BaseRecalibrator - Initializing engine
    13:35:33.276 INFO  FeatureManager - Using codec VCFCodec to read file file:///data/xieduo/WES_pipe/pipeline/gatk_resource/dbsnp_146.hg38.vcf.gz
    13:35:33.545 INFO  FeatureManager - Using codec VCFCodec to read file file:///data/reference/gatk_resource/1000G_phase1.snps.high_confidence.hg38.vcf.gz
    13:35:33.884 INFO  FeatureManager - Using codec VCFCodec to read file file:///data/reference/gatk_resource/Mills_and_1000G_gold_standard.indels.hg38.vcf.gz
    13:35:34.129 WARN  IndexUtils - Feature file "file:///data/xieduo/WES_pipe/pipeline/gatk_resource/dbsnp_146.hg38.vcf.gz" appears to contain no sequence dictionary. Attempting to retrieve a sequence dictionary from the associated index file
    13:35:34.232 WARN  IntelInflater - Zero Bytes Written : 0
    13:35:34.282 INFO  BaseRecalibrator - Done initializing engine
    13:35:34.285 INFO  BaseRecalibrationEngine - The covariates being used here:
    13:35:34.285 INFO  BaseRecalibrationEngine -    ReadGroupCovariate
    13:35:34.285 INFO  BaseRecalibrationEngine -    QualityScoreCovariate
    13:35:34.285 INFO  BaseRecalibrationEngine -    ContextCovariate
    13:35:34.285 INFO  BaseRecalibrationEngine -    CycleCovariate
    13:35:34.344 INFO  ProgressMeter - Starting traversal
    13:35:34.344 INFO  ProgressMeter -        Current Locus  Elapsed Minutes       Reads Processed     Reads/Minute
    13:35:44.363 INFO  ProgressMeter -         chr1:5384544              0.2                214000        1281820.9
  2. Using a full path with non-ASCII characters in the base directory as the tmp path failed:

    /data/xieduo/WES_pipe/pipeline/bin/gatk-4.2.6.1/gatk --java-options "-Xmx8G -Djava.io.tmpdir=/data/xieduo/Łuksza_2022_Nature" BaseRecalibrator -R /data/reference/gatk_resource/Homo_sapiens_assembly38.fasta -I /data/xieduo/Immun_genomics/data/Łuksza_2022_Nature/bam/PAAD11N.bam --known-sites /data/xieduo/WES_pipe/pipeline/gatk_resource/dbsnp_146.hg38.vcf.gz --known-sites /data/reference/gatk_resource/1000G_phase1.snps.high_confidence.hg38.vcf.gz --known-sites /data/reference/gatk_resource/Mills_and_1000G_gold_standard.indels.hg38.vcf.gz  -O  PAAD11N.recal_data.test.table
    Using GATK jar /data/xieduo/WES_pipe/pipeline/bin/gatk-4.2.6.1/gatk-package-4.2.6.1-local.jar
    Running:
    java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -Xmx8G -Djava.io.tmpdir=/data/xieduo/Łuksza_2022_Nature -jar /data/xieduo/WES_pipe/pipeline/bin/gatk-4.2.6.1/gatk-package-4.2.6.1-local.jar BaseRecalibrator -R /data/reference/gatk_resource/Homo_sapiens_assembly38.fasta -I /data/xieduo/Immun_genomics/data/Łuksza_2022_Nature/bam/PAAD11N.bam --known-sites /data/xieduo/WES_pipe/pipeline/gatk_resource/dbsnp_146.hg38.vcf.gz --known-sites /data/reference/gatk_resource/1000G_phase1.snps.high_confidence.hg38.vcf.gz --known-sites /data/reference/gatk_resource/Mills_and_1000G_gold_standard.indels.hg38.vcf.gz -O PAAD11N.recal_data.test.table
    13:36:33.528 INFO  NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/data/xieduo/WES_pipe/pipeline/bin/gatk-4.2.6.1/gatk-package-4.2.6.1-local.jar!/com/intel/gkl/native/libgkl_compression.so
    13:36:33.547 WARN  NativeLibraryLoader - Unable to load libgkl_compression.so from native/libgkl_compression.so (No such file or directory)
    13:36:33.550 INFO  NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/data/xieduo/WES_pipe/pipeline/bin/gatk-4.2.6.1/gatk-package-4.2.6.1-local.jar!/com/intel/gkl/native/libgkl_compression.so
    13:36:33.551 WARN  NativeLibraryLoader - Unable to load libgkl_compression.so from native/libgkl_compression.so (No such file or directory)
    13:36:33.669 INFO  BaseRecalibrator - ------------------------------------------------------------
    13:36:33.670 INFO  BaseRecalibrator - The Genome Analysis Toolkit (GATK) v4.2.6.1
    13:36:33.670 INFO  BaseRecalibrator - For support and documentation go to https://software.broadinstitute.org/gatk/
    13:36:33.670 INFO  BaseRecalibrator - Executing as xieduo@pbs-master on Linux v3.10.0-1160.41.1.el7.x86_64 amd64
    13:36:33.670 INFO  BaseRecalibrator - Java runtime: Java HotSpot(TM) 64-Bit Server VM v18+36-2087
    13:36:33.671 INFO  BaseRecalibrator - Start Date/Time: September 22, 2022 at 1:36:33 PM CST
    13:36:33.671 INFO  BaseRecalibrator - ------------------------------------------------------------
    13:36:33.671 INFO  BaseRecalibrator - ------------------------------------------------------------
    13:36:33.672 INFO  BaseRecalibrator - HTSJDK Version: 2.24.1
    13:36:33.672 INFO  BaseRecalibrator - Picard Version: 2.27.1
    13:36:33.672 INFO  BaseRecalibrator - Built for Spark Version: 2.4.5
    13:36:33.672 INFO  BaseRecalibrator - HTSJDK Defaults.COMPRESSION_LEVEL : 2
    13:36:33.672 INFO  BaseRecalibrator - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
    13:36:33.672 INFO  BaseRecalibrator - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
    13:36:33.673 INFO  BaseRecalibrator - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
    13:36:33.673 INFO  BaseRecalibrator - Deflater: JdkDeflater
    13:36:33.673 INFO  BaseRecalibrator - Inflater: JdkInflater
    13:36:33.673 INFO  BaseRecalibrator - GCS max retries/reopens: 20
    13:36:33.673 INFO  BaseRecalibrator - Requester pays: disabled
    13:36:33.673 INFO  BaseRecalibrator - Initializing engine
    13:36:33.867 WARN  IntelInflaterFactory - IntelInflater is not supported, using Java.util.zip.Inflater
    13:36:33.870 WARN  IntelInflaterFactory - IntelInflater is not supported, using Java.util.zip.Inflater
    13:36:33.995 WARN  IntelInflaterFactory - IntelInflater is not supported, using Java.util.zip.Inflater
    13:36:34.002 INFO  BaseRecalibrator - Shutting down engine
    [September 22, 2022 at 1:36:34 PM CST] org.broadinstitute.hellbender.tools.walkers.bqsr.BaseRecalibrator done. Elapsed time: 0.01 minutes.
    Runtime.totalMemory()=1140850688
    org.broadinstitute.hellbender.exceptions.GATKException: Unable to automatically instantiate codec org.broadinstitute.hellbender.utils.codecs.AnnotatedIntervalCodec
    at org.broadinstitute.hellbender.engine.FeatureManager.getCandidateCodecsForFile(FeatureManager.java:535)
    at org.broadinstitute.hellbender.engine.FeatureManager.getCodecForFile(FeatureManager.java:482)
    at org.broadinstitute.hellbender.engine.FeatureDataSource.getCodecForFeatureInput(FeatureDataSource.java:397)
    at org.broadinstitute.hellbender.engine.FeatureDataSource.getFeatureReader(FeatureDataSource.java:373)
    at org.broadinstitute.hellbender.engine.FeatureDataSource.<init>(FeatureDataSource.java:319)
    at org.broadinstitute.hellbender.engine.FeatureDataSource.<init>(FeatureDataSource.java:291)
    at org.broadinstitute.hellbender.engine.FeatureManager.addToFeatureSources(FeatureManager.java:245)
    at org.broadinstitute.hellbender.engine.FeatureManager.initializeFeatureSources(FeatureManager.java:208)
    at org.broadinstitute.hellbender.engine.FeatureManager.<init>(FeatureManager.java:155)
    at org.broadinstitute.hellbender.engine.ReadWalker.initializeFeatures(ReadWalker.java:72)
    at org.broadinstitute.hellbender.engine.GATKTool.onStartup(GATKTool.java:726)
    at org.broadinstitute.hellbender.engine.ReadWalker.onStartup(ReadWalker.java:51)
    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:138)
    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:192)
    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:211)
    at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:160)
    at org.broadinstitute.hellbender.Main.mainEntry(Main.java:203)
    at org.broadinstitute.hellbender.Main.main(Main.java:289)
  3. Changing the working directory to /data/xieduo/Immun_genomics/data/Łuksza2022Nature and using ./ as the tmp directory also failed:

    
    cd /data/xieduo/Immun_genomics/data/Łuksza2022Nature
    /data/xieduo/WES_pipe/pipeline/bin/gatk-4.2.6.1/gatk --java-options "-Xmx8G -Djava.io.tmpdir=./" BaseRecalibrator -R /data/reference/gatk_resource/Homo_sapiens_assembly38.fasta -I /data/xieduo/Immun_genomics/data/Łuksza_2022_Nature/bam/PAAD11N.bam --known-sites /data/xieduo/WES_pipe/pipeline/gatk_resource/dbsnp_146.hg38.vcf.gz --known-sites /data/reference/gatk_resource/1000G_phase1.snps.high_confidence.hg38.vcf.gz --known-sites /data/reference/gatk_resource/Mills_and_1000G_gold_standard.indels.hg38.vcf.gz  -O  PAAD11N.recal_data.test.table
    Using GATK jar /data/xieduo/WES_pipe/pipeline/bin/gatk-4.2.6.1/gatk-package-4.2.6.1-local.jar
    Running:
    java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -Xmx8G -Djava.io.tmpdir=./ -jar /data/xieduo/WES_pipe/pipeline/bin/gatk-4.2.6.1/gatk-package-4.2.6.1-local.jar BaseRecalibrator -R /data/reference/gatk_resource/Homo_sapiens_assembly38.fasta -I /data/xieduo/Immun_genomics/data/Łuksza_2022_Nature/bam/PAAD11N.bam --known-sites /data/xieduo/WES_pipe/pipeline/gatk_resource/dbsnp_146.hg38.vcf.gz --known-sites /data/reference/gatk_resource/1000G_phase1.snps.high_confidence.hg38.vcf.gz --known-sites /data/reference/gatk_resource/Mills_and_1000G_gold_standard.indels.hg38.vcf.gz -O PAAD11N.recal_data.test.table
    13:46:24.742 INFO  NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/data/xieduo/WES_pipe/pipeline/bin/gatk-4.2.6.1/gatk-package-4.2.6.1-local.jar!/com/intel/gkl/native/libgkl_compression.so
    13:46:24.761 WARN  NativeLibraryLoader - Unable to load libgkl_compression.so from native/libgkl_compression.so (No such file or directory)
    13:46:24.764 INFO  NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/data/xieduo/WES_pipe/pipeline/bin/gatk-4.2.6.1/gatk-package-4.2.6.1-local.jar!/com/intel/gkl/native/libgkl_compression.so
    13:46:24.764 WARN  NativeLibraryLoader - Unable to load libgkl_compression.so from native/libgkl_compression.so (No such file or directory)
    13:46:24.884 INFO  BaseRecalibrator - ------------------------------------------------------------
    13:46:24.884 INFO  BaseRecalibrator - The Genome Analysis Toolkit (GATK) v4.2.6.1
    13:46:24.885 INFO  BaseRecalibrator - For support and documentation go to https://software.broadinstitute.org/gatk/
    13:46:24.885 INFO  BaseRecalibrator - Executing as xieduo@pbs-master on Linux v3.10.0-1160.41.1.el7.x86_64 amd64
    13:46:24.885 INFO  BaseRecalibrator - Java runtime: Java HotSpot(TM) 64-Bit Server VM v18+36-2087
    13:46:24.885 INFO  BaseRecalibrator - Start Date/Time: September 22, 2022 at 1:46:24 PM CST
    13:46:24.885 INFO  BaseRecalibrator - ------------------------------------------------------------
    13:46:24.885 INFO  BaseRecalibrator - ------------------------------------------------------------
    13:46:24.886 INFO  BaseRecalibrator - HTSJDK Version: 2.24.1
    13:46:24.886 INFO  BaseRecalibrator - Picard Version: 2.27.1
    13:46:24.886 INFO  BaseRecalibrator - Built for Spark Version: 2.4.5
    13:46:24.886 INFO  BaseRecalibrator - HTSJDK Defaults.COMPRESSION_LEVEL : 2
    13:46:24.887 INFO  BaseRecalibrator - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
    13:46:24.887 INFO  BaseRecalibrator - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
    13:46:24.887 INFO  BaseRecalibrator - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
    13:46:24.887 INFO  BaseRecalibrator - Deflater: JdkDeflater
    13:46:24.887 INFO  BaseRecalibrator - Inflater: JdkInflater
    13:46:24.887 INFO  BaseRecalibrator - GCS max retries/reopens: 20
    13:46:24.887 INFO  BaseRecalibrator - Requester pays: disabled
    13:46:24.888 INFO  BaseRecalibrator - Initializing engine
    13:46:25.095 WARN  IntelInflaterFactory - IntelInflater is not supported, using Java.util.zip.Inflater
    13:46:25.099 WARN  IntelInflaterFactory - IntelInflater is not supported, using Java.util.zip.Inflater
    13:46:25.216 WARN  IntelInflaterFactory - IntelInflater is not supported, using Java.util.zip.Inflater
    13:46:25.222 INFO  BaseRecalibrator - Shutting down engine
    [September 22, 2022 at 1:46:25 PM CST] org.broadinstitute.hellbender.tools.walkers.bqsr.BaseRecalibrator done. Elapsed time: 0.01 minutes.
    Runtime.totalMemory()=1140850688
    org.broadinstitute.hellbender.exceptions.GATKException: Unable to automatically instantiate codec org.broadinstitute.hellbender.utils.codecs.AnnotatedIntervalCodec
    at org.broadinstitute.hellbender.engine.FeatureManager.getCandidateCodecsForFile(FeatureManager.java:535)
    at org.broadinstitute.hellbender.engine.FeatureManager.getCodecForFile(FeatureManager.java:482)
    at org.broadinstitute.hellbender.engine.FeatureDataSource.getCodecForFeatureInput(FeatureDataSource.java:397)
    at org.broadinstitute.hellbender.engine.FeatureDataSource.getFeatureReader(FeatureDataSource.java:373)
    at org.broadinstitute.hellbender.engine.FeatureDataSource.<init>(FeatureDataSource.java:319)
    at org.broadinstitute.hellbender.engine.FeatureDataSource.<init>(FeatureDataSource.java:291)
    at org.broadinstitute.hellbender.engine.FeatureManager.addToFeatureSources(FeatureManager.java:245)
    at org.broadinstitute.hellbender.engine.FeatureManager.initializeFeatureSources(FeatureManager.java:208)
    at org.broadinstitute.hellbender.engine.FeatureManager.<init>(FeatureManager.java:155)
    at org.broadinstitute.hellbender.engine.ReadWalker.initializeFeatures(ReadWalker.java:72)
    at org.broadinstitute.hellbender.engine.GATKTool.onStartup(GATKTool.java:726)
    at org.broadinstitute.hellbender.engine.ReadWalker.onStartup(ReadWalker.java:51)
    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:138)
    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:192)
    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:211)
    at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:160)
    at org.broadinstitute.hellbender.Main.mainEntry(Main.java:203)
    at org.broadinstitute.hellbender.Main.main(Main.java:289)

However, I previously ran `BaseRecalibrator` the same way as in the 3rd scenario, using `GATK v4.2.0.0`, and it worked well:

    cd /data/public/meta_mrs/Sharma_2019_CellRep/bam
    gatk --java-options "-Xmx8G -Djava.io.tmpdir=./" BaseRecalibrator -R $2 -I $SAMPLE.rmdup.bam --known-sites $KNOWNSITE1 --known-sites $KNOWNSITE2 --known-sites $KNOWNSITE3 -O $SAMPLE.recal_data.table


I am confused about this issue.

Thank you!

Best,
Duo