Fazulur closed this issue 4 years ago
Hi Fazulur @Fazulur !
Sorry about the issue. I think we have just recently fixed it here: https://github.com/bcbio/bcbio-nextgen/issues/3160
Please upgrade with bcbio_nextgen.py upgrade -u skip --genomes hg38
and try again.
Sergey
Hi Sergey,
Thanks a lot for your quick response. I upgraded bcbio using the above command, but it still gives the same error.
Could you please suggest how I can proceed further?
Thanks in advance, Fazulur Rehaman
Hi @Fazulur !
Please check your data/genomes/Hsapiens/hg38/seq/hg38-resources.yaml
If the update was successful, it should have the line: dbsnp: ../variation/dbsnp-153.vcf.gz
Also check whether the dbSNP file is installed in:
genomes/Hsapiens/hg38/variation/dbsnp-153.vcf.gz
Update bcbio to 1.2.3 just in case:
bcbio_nextgen.py upgrade -u stable --tools
Sergey
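The two checks above can also be scripted. The following is a minimal Python sketch, not part of bcbio: the function names are hypothetical, and it assumes that the dbsnp entry sits on a single `dbsnp:` line and that a relative path is resolved against the directory containing hg38-resources.yaml.

```python
import os
import re


def find_dbsnp_entry(resources_text):
    """Return the value of the first 'dbsnp:' key in the YAML text, or None."""
    for line in resources_text.splitlines():
        match = re.match(r"\s*dbsnp:\s*(\S+)", line)
        if match:
            return match.group(1)
    return None


def check_dbsnp(resources_yaml):
    """Return (entry, resolved_path, exists) for the dbsnp entry of a resources file."""
    with open(resources_yaml) as handle:
        entry = find_dbsnp_entry(handle.read())
    if entry is None:
        return None, None, False
    # Assumption: relative paths are relative to the directory of the YAML file.
    base_dir = os.path.dirname(os.path.abspath(resources_yaml))
    resolved = os.path.normpath(os.path.join(base_dir, entry))
    return entry, resolved, os.path.exists(resolved)
```

Calling `check_dbsnp("data/genomes/Hsapiens/hg38/seq/hg38-resources.yaml")` would then report which dbSNP file the resources file points to and whether that file is present on disk.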
Hi @Fazulur @naumenko-sa
I had a similar problem. I found that in my v1.2.0 --genomes hg38 install, genomes/config/vcfanno/gemini.conf references dbsnp-151.vcf.gz, but the dbSNP file in the genomes/variation folder is version 153: genomes/variation/dbsnp-153.vcf.gz.
I modified gemini.conf, changing dbsnp-151 to dbsnp-153, and the problem seems to have gone away.
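A mismatch like this can be spotted before a run by listing the files a vcfanno configuration refers to and checking that each one exists. This is a rough Python sketch under stated assumptions: the helper name is hypothetical, the configuration is assumed to name its inputs on lines of the form `file="..."`, and `base_dir` is whichever genome directory those relative paths are resolved against in your install.

```python
import os
import re

# Matches lines such as: file="variation/dbsnp-151.vcf.gz"
FILE_LINE = re.compile(r'^\s*file\s*=\s*"([^"]+)"')


def missing_annotation_files(conf_text, base_dir):
    """Return the file entries in a vcfanno conf that do not exist under base_dir."""
    missing = []
    for line in conf_text.splitlines():
        match = FILE_LINE.match(line)
        if match and not os.path.exists(os.path.join(base_dir, match.group(1))):
            missing.append(match.group(1))
    return missing
```

With the install described above, reading gemini.conf and passing its text to `missing_annotation_files` would be expected to list variation/dbsnp-151.vcf.gz until the entry is changed to dbsnp-153.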
Hi @erinijapranckeviciene
Thanks a lot. It worked.
Thanks & Regards Fazulur Rehaman
Hi @Sergey,
I have modified gemini.conf as per @erinijapranckeviciene. I tested whole exome & RNA bcbio pipelines and they are working fine.
When I tested one 30X whole genome sample with the gatk-variant pipeline, it gave the error below at the HaplotypeCaller step:
java.lang.ArrayIndexOutOfBoundsException: 3
    at org.broadinstitute.hellbender.utils.GenotypeUtils.computeDiploidGenotypeCounts(GenotypeUtils.java:70)
    at org.broadinstitute.hellbender.tools.walkers.annotator.ExcessHet.calculateEH(ExcessHet.java:86)
    at org.broadinstitute.hellbender.tools.walkers.annotator.ExcessHet.annotate(ExcessHet.java:74)
    at org.broadinstitute.hellbender.tools.walkers.annotator.VariantAnnotatorEngine.annotateContext(VariantAnnotatorEngine.java:293)
    at org.broadinstitute.hellbender.tools.walkers.haplotypecaller.HaplotypeCallerGenotypingEngine.makeAnnotatedCall(HaplotypeCallerGenotypingEngine.java:365)
    at org.broadinstitute.hellbender.tools.walkers.haplotypecaller.HaplotypeCallerGenotypingEngine.assignGenotypeLikelihoods(HaplotypeCallerGenotypingEngine.java:189)
    at org.broadinstitute.hellbender.tools.walkers.haplotypecaller.HaplotypeCallerEngine.callRegion(HaplotypeCallerEngine.java:608)
    at org.broadinstitute.hellbender.tools.walkers.haplotypecaller.HaplotypeCaller.apply(HaplotypeCaller.java:212)
    at org.broadinstitute.hellbender.engine.AssemblyRegionWalker.processReadShard(AssemblyRegionWalker.java:200)
    at org.broadinstitute.hellbender.engine.AssemblyRegionWalker.traverse(AssemblyRegionWalker.java:173)
    at org.broadinstitute.hellbender.engine.GATKTool.doWork(GATKTool.java:1048)
    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:139)
    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:191)
    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:210)
    at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:163)
    at org.broadinstitute.hellbender.Main.mainEntry(Main.java:206)
    at org.broadinstitute.hellbender.Main.main(Main.java:292)
Using GATK jar BCBIO/v1.2.0_updated/anaconda/share/gatk4-4.1.6.0-0/gatk-package-4.1.6.0-local.jar
Running: java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -Xms4g -Xmx13g -XX:+UseSerialGC -Djava.io.tmpdir=/gpfs/ngsdata/scratch/fazulur/testcases/bcbio/test-bcbio-1.2.0-wgs/scratch/bcbiotx/tmpmalwhvqr -jar BCBIO/v1.2.0_updated/anaconda/share/gatk4-4.1.6.0-0/gatk-package-4.1.6.0-local.jar HaplotypeCaller -R BCBIO/v1.2.0_updated/genomes/Hsapiens/hg38/seq/hg38.fa --annotation MappingQualityRankSumTest --annotation MappingQualityZero --annotation QualByDepth --annotation ReadPosRankSumTest --annotation RMSMappingQuality --annotation BaseQualityRankSumTest --annotation FisherStrand --annotation MappingQuality --annotation DepthPerAlleleBySample --annotation Coverage -Iest-bcbio-1.2.0-wgs/scratch/align/0200233341_S5_L008/0200233341_S5_L008-sort-recal.bam -Lest-bcbio-1.2.0-wgs/scratch/gatk-haplotype/chrY/0200233341_S5_L008-joint-chrY_0_16127313-regions.bed --interval-set-rule INTERSECTION --annotation ClippingRankSumTest --annotation DepthPerSampleHC --native-pair-hmm-threads 1 --emit-ref-confidence GVCF -GQB 10 -GQB 20 -GQB 30 -GQB 40 -GQB 60 -GQB 80 -ploidy 1 --outputest-bcbio-1.2.0-wgs/scratch/bcbiotx/tmpmalwhvqr/0200233341_S5_L008-joint-chrY_0_16127313.vcf.gz
' returned non-zero exit status 3.
Could you please help me resolve this error?
Thanks in advance, Fazulur Rehaman
Hi @Fazulur !
It looks like an exception from GATK HaplotypeCaller triggered by the inputs 0200233341_S5_L008/0200233341_S5_L008-sort-recal.bam and 0200233341_S5_L008-joint-chrY_0_16127313-regions.bed.
Can you run this last command outside of bcbio, to check whether it is a bcbio error or a GATK error?
Sergey
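One way to carry out the suggestion above is to rerun the logged command from a small script and inspect its error output. This is only an illustrative Python sketch: both function names are hypothetical, and the check is a heuristic that looks for Java stack frames from the `org.broadinstitute.hellbender` package, which is the package that appears in the stack trace posted in this thread.

```python
import subprocess

# Package prefix of the GATK4 classes seen in the stack trace in this thread.
GATK_PACKAGE = "org.broadinstitute.hellbender"


def run_standalone(cmd):
    """Run a command given as a list of arguments; return (exit_code, stderr_text)."""
    result = subprocess.run(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE,
                            universal_newlines=True)
    return result.returncode, result.stderr


def raised_inside_gatk(stderr_text):
    """Heuristic: True if any Java stack frame in the output belongs to GATK."""
    return any(line.strip().startswith("at " + GATK_PACKAGE)
               for line in stderr_text.splitlines())
```

If the same exception and exit status appear when the command is run this way, with no bcbio process involved, the failure is most likely on the GATK side.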
Hi @Sergey,
You are right. This error is with the new version of GATK, 4.1.6.0.
Do we need to downgrade GATK to proceed? Please suggest how we can proceed further.
Thanks in advance, Fazulur Rehaman
Hi @Fazulur !
If you can reproduce the issue outside of bcbio, could you please raise it with the GATK team and provide them with the two files needed to reproduce it? https://github.com/broadinstitute/gatk/issues
It is possible to downgrade GATK with conda install -c bioconda --force-reinstall gatk4=[version],
but we have already aligned the bcbio wrapper to support the latest GATK, and it is better to solve the issue for everyone, since you were the first to identify it.
Sergey
Thanks @Fazulur ! I see you have raised the issue: https://github.com/broadinstitute/gatk/issues/6552. Linking it here for tracking. SN
upd: modified gemini.conf here: https://github.com/bcbio/bcbio-nextgen/blob/master/config/vcfanno/hg38-gemini.conf
upd: fixed in gatk repo, waiting for new release
Dear @Sergey,
Thanks a lot for the update.
Thanks & Regards Fazulur Rehaman
Dear @Sergey,
GATK has released 4.1.7.0 with a fix for this issue: https://github.com/broadinstitute/gatk/releases
Could you please let us know when bcbio will be ready with GATK 4.1.7.0?
Thanks in advance, Fazulur Rehaman
Thanks @Fazulur !
We are not pinning gatk4.
Just update with bcbio_nextgen.py upgrade -u skip --tools
and run again.
It seems to work for me.
Let us know if you see any other bcbio issues!
Sergey
Dear @Sergey,
Thanks a lot. It is working now without any issues.
Thanks & Regards Fazulur Rehaman
Dear Bcbio team,
We installed the new bcbio version, v1.2.0, and tried to run the gatk-variant pipeline. It gives the error below. We tried adding tools_on: gemini to the YAML file, but it still gives errors.
[2020-04-07T11:23Z] nsnode44: Not running gemini, not configured in tools_on: brain
[2020-04-07T11:23Z] nsnode44: Unexpected error
Traceback (most recent call last):
  File "BCBIO/v1.2.0_updated/anaconda/lib/python3.6/site-packages/bcbio/distributed/ipythontasks.py", line 54, in _setup_logging
    yield config
  File "BCBIO/v1.2.0_updated/anaconda/lib/python3.6/site-packages/bcbio/distributed/ipythontasks.py", line 463, in prep_gemini_db
    return ipython.zip_args(apply(population.prep_gemini_db, args))
  File "BCBIO/v1.2.0_updated/anaconda/lib/python3.6/site-packages/bcbio/distributed/ipythontasks.py", line 82, in apply
    return object(args, kwargs)
  File "BCBIO/v1.2.0_updated/anaconda/lib/python3.6/site-packages/bcbio/variation/population.py", line 42, in prep_gemini_db
    ann_vcf = run_vcfanno(gemini_vcf, data, decomposed)
  File "BCBIO/v1.2.0_updated/anaconda/lib/python3.6/site-packages/bcbio/variation/population.py", line 121, in run_vcfanno
    decomposed=decomposed)
  File "BCBIO/v1.2.0_updated/anaconda/lib/python3.6/site-packages/bcbio/variation/vcfanno.py", line 35, in run
    conffn = _combine_files(conf_fns, out_file, data, basepath is None)
  File "BCBIO/v1.2.0_updated/anaconda/lib/python3.6/site-packages/bcbio/variation/vcfanno.py", line 63, in _combine_files
    line = _fill_file_path(line, data)
  File "BCBIO/v1.2.0_updated/anaconda/lib/python3.6/site-packages/bcbio/variation/vcfanno.py", line 88, in _fill_file_path
    assert full_file, "Did not find vcfanno input file %s" % (orig_file)
AssertionError: Did not find vcfanno input file variation/dbsnp-151.vcf.gz
Our sample configuration YAML file is below:
details:
- algorithm:
    aligner: bwa
    recalibrate: gatk
    tools_on: [gvcf]
    tools_off: [vqsr]
    variantcaller: [gatk-haplotype]
  analysis: variant2
  description: 'brain'
  files:
Could you please let us know how to proceed further?
Thanks in advance, Fazulur Rehaman