bcbio / bcbio-nextgen

Validated, scalable, community developed variant calling, RNA-seq and small RNA analysis
https://bcbio-nextgen.readthedocs.io
MIT License
985 stars 353 forks source link

Errors in joint squaring off/backfilling step when joint calling with GATK #3621

Open DrMcStrange opened 2 years ago

DrMcStrange commented 2 years ago

Hi,

I've been hitting errors when running some tests of germline small variant joint calling with GATK (trying to optimise some parameters for running on our cluster).

First there were I/O errors that looked like this:

[TileDB::FileSystem] Error: (write_to_file) Cannot write to file; File closing error path=/scratch/bennetm/bcb-test/cataract/bcb-test-gatk-gvcf-28core/work/joint/gatk-haplotype-joint/cataract/chr1/cataract-chr1_0_16564279_genomicsdb/chr1$1$16564279/.__36a5c44a-8e42-44f1-9c43-cf447a6b9b03140534314284800_1645573900915/ALT.tdb errno=5(Input/output error)
[TileDB::WriteState] Error: Cannot write segment to file.
11:30:40.682 erro  NativeGenomicsDB - pid=1788625 tid=1790223 VariantStorageManagerException exception : Error while writing to TileDB array
TileDB error message : [TileDB::WriteState] Error: Cannot write segment to file
[TileDB::FileSystem] Error: (write_to_file) Cannot write to file; File closing error path=/scratch/bennetm/bcb-test/cataract/bcb-test-gatk-gvcf-28core/work/joint/gatk-haplotype-joint/cataract/chr1/cataract-chr1_0_16564279_genomicsdb/chr1$1$16564279/.__36a5c44a-8e42-44f1-9c43-cf447a6b9b03140534314284800_1645573900915/ALT.tdb errno=5(Input/output error)
terminate called after throwing an instance of 'VariantStorageManagerException'
  what():  VariantStorageManagerException exception : Error while writing to TileDB array
TileDB error message : [TileDB::WriteState] Error: Cannot write segment to file
[TileDB::FileSystem] Error: (write_to_file) Cannot write to file; File closing error path=/scratch/bennetm/bcb-test/cataract/bcb-test-gatk-gvcf-28core/work/joint/gatk-haplotype-joint/cataract/chr1/cataract-chr1_0_16564279_genomicsdb/chr1$1$16564279/.__36a5c44a-8e42-44f1-9c43-cf447a6b9b03140534314284800_1645573900915/ALT.tdb errno=5(Input/output error)
Using GATK jar /share/apps/rosalind/bcbio-nextgen/1.2.9/anaconda/envs/java/share/gatk4-4.2.4.1-0/gatk-package-4.2.4.1-local.jar
Running:
    java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -Xms681m -Xmx3181m -XX:+UseSerialGC -XX:+UseSerialGC -XX:+UseSerialGC -Djava.io.tmpdir=/scratch/bennetm/bcb-test/cataract/bcb-test-gatk-gvcf-28core/work/joint/gatk-haplotype-joint/cataract/chr1/bcbiotx/tmpjfr4ec3p -jar /share/apps/rosalind/bcbio-nextgen/1.2.9/anaconda/envs/java/share/gatk4-4.2.4.1-0/gatk-package-4.2.4.1-local.jar GenomicsDBImport --reader-threads 1 --genomicsdb-workspace-path cataract-chr1_0_16564279_genomicsdb -L chr1:1-16564279 --sample-name-map /scratch/bennetm/bcb-test/cataract/bcb-test-gatk-gvcf-28core/work/bcbiotx/tmpzpnbpvgy/tmp3m16ha88.tsv
' returned non-zero exit status 250.

I deleted the contents of /scratch/bennetm/bcb-test/cataract/bcb-test-gatk-gvcf-28core/work/joint/gatk-haplotype-joint/cataract/ and restarted the run in place, and it then hit different errors, like so:

java.lang.IllegalStateException: Genotype [CRCH13_73 CA*/CAAAAA GQ 99 DP 47 AD 32,2,4,9,0,0,0,0 {SB=13,19,15,0}] does not contain likelihoods necessary to calculate posteriors.
    at org.broadinstitute.hellbender.tools.walkers.genotyper.afcalc.AlleleFrequencyCalculator.log10NormalizedGenotypePosteriors(AlleleFrequencyCalculator.java:89)
    at org.broadinstitute.hellbender.tools.walkers.genotyper.afcalc.AlleleFrequencyCalculator.effectiveAlleleCounts(AlleleFrequencyCalculator.java:258)
    at org.broadinstitute.hellbender.tools.walkers.genotyper.afcalc.AlleleFrequencyCalculator.calculate(AlleleFrequencyCalculator.java:141)
    at org.broadinstitute.hellbender.tools.walkers.genotyper.GenotypingEngine.calculateGenotypes(GenotypingEngine.java:147)
    at org.broadinstitute.hellbender.tools.walkers.GenotypeGVCFsEngine.calculateGenotypes(GenotypeGVCFsEngine.java:244)
    at org.broadinstitute.hellbender.tools.walkers.GenotypeGVCFsEngine.regenotypeVC(GenotypeGVCFsEngine.java:152)
    at org.broadinstitute.hellbender.tools.walkers.GenotypeGVCFsEngine.callRegion(GenotypeGVCFsEngine.java:135)
    at org.broadinstitute.hellbender.tools.walkers.GenotypeGVCFs.apply(GenotypeGVCFs.java:283)
    at org.broadinstitute.hellbender.engine.VariantLocusWalker.lambda$traverse$0(VariantLocusWalker.java:135)
    at java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183)
    at java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)
    at java.base/java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:177)
    at java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)
    at java.base/java.util.Iterator.forEachRemaining(Iterator.java:133)
    at java.base/java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801)
    at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:484)
    at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:474)
    at java.base/java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150)
    at java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173)
    at java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
    at java.base/java.util.stream.ReferencePipeline.forEachOrdered(ReferencePipeline.java:502)
    at org.broadinstitute.hellbender.engine.VariantLocusWalker.traverse(VariantLocusWalker.java:132)
    at org.broadinstitute.hellbender.engine.GATKTool.doWork(GATKTool.java:1085)
    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:140)
    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:192)
    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:211)
    at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:160)
    at org.broadinstitute.hellbender.Main.mainEntry(Main.java:203)
    at org.broadinstitute.hellbender.Main.main(Main.java:289)
Using GATK jar /share/apps/rosalind/bcbio-nextgen/1.2.9/anaconda/envs/java/share/gatk4-4.2.4.1-0/gatk-package-4.2.4.1-local.jar
Running:
    java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -Xms681m -Xmx3181m -XX:+UseSerialGC -Djava.io.tmpdir=/scratch/bennetm/bcb-test/cataract/bcb-test-gatk-gvcf-28core/work/bcbiotx/tmpdk0ngdfk -jar /share/apps/rosalind/bcbio-nextgen/1.2.9/anaconda/envs/java/share/gatk4-4.2.4.1-0/gatk-package-4.2.4.1-local.jar GenotypeGVCFs --variant gendb:///scratch/bennetm/bcb-test/cataract/bcb-test-gatk-gvcf-28core/work/joint/gatk-haplotype-joint/cataract/chr1/cataract-chr1_0_16564279_genomicsdb -R /share/apps/rosalind/bcbio-nextgen/1.2.9/genomes/Hsapiens/hg38/seq/hg38.fa --output /scratch/bennetm/bcb-test/cataract/bcb-test-gatk-gvcf-28core/work/bcbiotx/tmp9k9tijkp/cataract-chr1_0_16564279.vcf.gz -L chr1:1-16564279 -ploidy 2
' returned non-zero exit status 3.

Full log files are attached, and here's the yaml for this run:

details:
- algorithm:
    aligner: bwa
    jointcaller: gatk-haplotype-joint
    tools_on:
    - gvcf
    - gemini
    variantcaller: gatk-haplotype
  analysis: variant2
  description: CRCH13_73
  files:
  - /scratch/bennetm/bcb-test/cataract/input/CRCH13_73_R1.fastq.gz
  - /scratch/bennetm/bcb-test/cataract/input/CRCH13_73_R2.fastq.gz
  genome_build: hg38
  metadata:
    batch: cataract
    sex: female
- algorithm:
    aligner: bwa
    jointcaller: gatk-haplotype-joint
    tools_on:
    - gvcf
    - gemini
    variant_regions: /u/bennetm/Scratch/bcb-test/cataract/S07604514_Padded.bed
    variantcaller: gatk-haplotype
  analysis: variant2
  description: CRCH14_05
  files:
  - /scratch/bennetm/bcb-test/cataract/input/CRCH14_05_R1.fastq.gz
  - /scratch/bennetm/bcb-test/cataract/input/CRCH14_05_R2.fastq.gz
  genome_build: hg38
  metadata:
    batch: cataract
    sex: male
- algorithm:
    aligner: bwa
    jointcaller: gatk-haplotype-joint
    tools_on:
    - gvcf
    - gemini
    variantcaller: gatk-haplotype
  analysis: variant2
  description: CSA69_03
  files:
  - /scratch/bennetm/bcb-test/cataract/input/CSA69_03_R1.fastq.gz
  - /scratch/bennetm/bcb-test/cataract/input/CSA69_03_R2.fastq.gz
  genome_build: hg38
  metadata:
    batch: cataract
    sex: male
fc_name: bcb-test-gatk-gvcf-28core
upload:
  dir: ../final

Does anyone have any ideas on how to resolve these?

bcbio-nextgen-commands.log bcbio-nextgen-debug.log

naumenko-sa commented 2 years ago

Hi @DrMcStrange !

Are you sure you need joint genotyping for 3 samples? I thought it was a use case for population calling with >50 samples, with 3 samples you could just use batch calling.

Sergey

DrMcStrange commented 2 years ago

Hi @naumenko-sa

This was just a test run! I would normally use batch calling, but I wanted to compare calls between joint and batch calling.

In the meantime a colleague has hit the same IllegalStateException error on a run with ~60 samples. (The second of the two errors my run hit.) She was only calling a few small regions, so we got it to complete by switching to batch calling, but it would still be good if we could find a fix for this.

Bennet