Closed by gabeng 2 years ago
Hi Ben @gabeng!
Could you please clarify your setup? Are you running:
tools_on:
  - coverage_perbase
Sergey
Hi Sergey,
Yes, I am using coverage_perbase.
Hi Ben @gabeng !
I'm running:
details:
- algorithm:
    aligner: bwa
    effects: vep
    effects_transcripts: all
    ensemble:
      numpass: 2
      use_filtered: false
    mark_duplicates: true
    realign: false
    recalibrate: false
    save_diskspace: true
    tools_on:
    - gemini
    - svplots
    - qualimap
    - vep_splicesite_annotations
    - noalt_calling
    - coverage_perbase
    variantcaller:
    - gatk-haplotype
    - samtools
    - platypus
    - freebayes
    vcfanno:
    - /n/data1/cores/bcbio/naumenko/ashkenazim_trio/config/cre.vcfanno.conf
  analysis: variant2
  description: HG002_NA24385_son
  files:
  - /n/data1/cores/bcbio/naumenko/ashkenazim_trio/input/HG002_NA24385_son.chr22.bam
  genome_build: hg38
  metadata:
    batch: ashkenazi_fam
- algorithm:
    aligner: bwa
    effects: vep
    effects_transcripts: all
    ensemble:
      numpass: 2
      use_filtered: false
    mark_duplicates: true
    realign: false
    recalibrate: false
    save_diskspace: true
    tools_on:
    - gemini
    - svplots
    - qualimap
    - vep_splicesite_annotations
    - noalt_calling
    - coverage_perbase
    variantcaller:
    - gatk-haplotype
    - samtools
    - platypus
    - freebayes
    vcfanno:
    - /n/data1/cores/bcbio/naumenko/ashkenazim_trio/config/cre.vcfanno.conf
  analysis: variant2
  description: HG003_NA24149_father
  files:
  - /n/data1/cores/bcbio/naumenko/ashkenazim_trio/input/HG003_NA24149_father.chr22.bam
  genome_build: hg38
  metadata:
    batch: ashkenazi_fam
- algorithm:
    aligner: bwa
    effects: vep
    effects_transcripts: all
    ensemble:
      numpass: 2
      use_filtered: false
    mark_duplicates: true
    realign: false
    recalibrate: false
    save_diskspace: true
    tools_on:
    - gemini
    - svplots
    - qualimap
    - vep_splicesite_annotations
    - noalt_calling
    - coverage_perbase
    variantcaller:
    - gatk-haplotype
    - samtools
    - platypus
    - freebayes
    vcfanno:
    - /n/data1/cores/bcbio/naumenko/ashkenazim_trio/config/cre.vcfanno.conf
  analysis: variant2
  description: HG004_NA24143_mother
  files:
  - /n/data1/cores/bcbio/naumenko/ashkenazim_trio/input/HG004_NA24143_mother.chr22.bam
  genome_build: hg38
  metadata:
    batch: ashkenazi_fam
resources:
  default:
    cores: 7
    jvm_opts:
    - -Xms750m
    - -Xmx7000m
    memory: 7G
upload:
  dir: ../final
and work/coverage/HG002_NA24385_son has:
HG002_NA24385_son-variant_regions.mosdepth.region.dist.txt
HG002_NA24385_son-variant_regions.per-base.bed.gz
HG002_NA24385_son-variant_regions.quantized.bed.gz
HG002_NA24385_son-variant_regions.quantized-vrsubset.bed
HG002_NA24385_son-variant_regions.quantized-vrsubset-callableblocks.bed
HG002_NA24385_son-variant_regions.quantized-vrsubset-nblocks.bed
HG002_NA24385_son-variant_regions.regions.bed.gz
target-genome.bed
There are no .csi indices. Could you please elaborate? I did not understand what the issue was.
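For reference, a check like the one above can be scripted. This is only a minimal sketch: the helper name `find_unindexed` is made up for illustration and is not part of bcbio or mosdepth. It assumes each index sits next to its data file with `.csi` appended to the file name.

```python
from pathlib import Path


def find_unindexed(directory, index_suffix=".csi"):
    """Return the names of *.gz files in `directory` that have no companion index.

    Hypothetical helper: assumes the index is named <file>.gz + index_suffix
    and lives in the same directory as the data file.
    """
    missing = []
    for gz in sorted(Path(directory).glob("*.gz")):
        index = gz.with_name(gz.name + index_suffix)
        if not index.exists():
            missing.append(gz.name)
    return missing


if __name__ == "__main__":
    import sys

    # Directory to inspect, e.g. work/coverage/HG002_NA24385_son
    target = sys.argv[1] if len(sys.argv) > 1 else "."
    for name in find_unindexed(target):
        print(name)
```

Run against work/coverage/HG002_NA24385_son, it would print every .gz file for which no .csi file is present.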
Sergey
Hi Sergey,
Thanks for looking into this. The mosdepth manual states that a *.csi index file is always created for every *.gz file (see https://github.com/brentp/mosdepth#exome-example).
In particular, the .per-base.bed.gz files can be quite large, so an index is essential for downstream processing. For some reason, the *.csi files seem to get lost in bcbio. It would be very helpful if they were retained and stored alongside the *.bed.gz files.
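As a stopgap, the indices could be regenerated after the run. The following is only a sketch under a few assumptions: htslib's `tabix` is on the PATH, the files are bgzipped BED files, and the helper names `tabix_csi_command` and `reindex` are invented here for illustration (they are not bcbio functions). By default it only reports the commands it would run.

```python
import subprocess
from pathlib import Path


def tabix_csi_command(bed_gz):
    """Build a tabix call that writes a CSI index next to a bgzipped BED file."""
    # --csi requests a CSI index instead of the default TBI; -p bed sets the BED preset.
    return ["tabix", "--csi", "-p", "bed", str(bed_gz)]


def reindex(directory, dry_run=True):
    """Index every *.bed.gz in `directory` that has no .csi file yet.

    Returns the list of commands; they are only executed when dry_run is False.
    """
    commands = []
    for bed_gz in sorted(Path(directory).glob("*.bed.gz")):
        if bed_gz.with_name(bed_gz.name + ".csi").exists():
            continue  # already indexed, leave it alone
        cmd = tabix_csi_command(bed_gz)
        commands.append(cmd)
        if not dry_run:
            subprocess.run(cmd, check=True)
    return commands
```

Calling `reindex("work/coverage/HG002_NA24385_son", dry_run=False)` would then index the files in place. This works around the missing files but does not replace having bcbio keep the indices mosdepth already writes.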
Ben
It appears that mosdepth creates *.csi index files for all bgzipped files, such as per-base-coverage. However, I cannot find those index files in the bcbio output directory, and I cannot find any log message indicating that they are copied into the final directory. I am currently using bcbio 1.1.7, but I have not noticed any code changes affecting those QC files. How can I get those index files transferred to the final directory?