Closed by bunop 6 months ago
Some comments from the Nextflow community (Slack #help):
We are working to convert sarek to use CRAM files only (and save BAM files optionally). If you want to take a look, see the dev branch in sarek. Quite a few of the modules we need there in nf-core/modules now accept CRAM files. Generally we try to stick with one module regardless of whether BAM or CRAM is used. The reference is usually passed as an optional input. Off the top of my head, this covers some of the samtools modules, some of the gatk4 modules, strelka, freebayes, and manta. The meta.yml of each module should tell you whether CRAM is accepted as input. The exact step where you want to start using CRAM depends on your use case. I benchmarked it for sarek and found that sticking with BAM after mapping and letting duplicate marking take care of the conversion to CRAM is faster. But I think that really depends on where you want to use it.
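The "one module for BAM or CRAM, reference as optional input" idea can be sketched outside Nextflow as a small command builder. This is only an illustration, not the nf-core module itself: the function name `samtools_view_cmd` and the file names are hypothetical, and the only samtools options used are `-b`, `-C`, `-T` and `-o` of `samtools view`.

```python
from pathlib import Path
from typing import List, Optional


def samtools_view_cmd(src: str, dst: str, reference: Optional[str] = None) -> List[str]:
    """Build a `samtools view` command converting between BAM and CRAM.

    The output format is chosen from the extension of `dst`; the reference
    FASTA is optional and only added when the caller provides one.
    """
    suffix = Path(dst).suffix.lower()
    if suffix == ".cram":
        fmt_flag = "-C"  # write CRAM
    elif suffix == ".bam":
        fmt_flag = "-b"  # write BAM
    else:
        raise ValueError(f"unsupported output extension: {suffix!r}")

    cmd = ["samtools", "view", fmt_flag]
    if reference is not None:
        cmd += ["-T", reference]  # reference FASTA, used for CRAM encoding/decoding
    cmd += ["-o", dst, src]
    return cmd


# Hypothetical file names, for illustration only.
print(" ".join(samtools_view_cmd("sample.bam", "sample.cram", "genome.fa")))
print(" ".join(samtools_view_cmd("sample.cram", "sample.bam")))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` on a machine where samtools is installed. In practice a reference is normally needed whenever CRAM is written, so leaving it out is mainly useful for BAM-only steps.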
nf-core/sarek converts BAM into CRAM by default, and converts back into BAM only if the user needs BAM files. In this pipeline, the BAMADDRG step works only on BAM, and PICARD_MARKDUPLICATES seems to have problems with CRAM files. Given this, I could convert into CRAM only right before the freebayes call, which would not save much space.
Try to replace bamaddrg with picard AddOrReplaceReadGroups.
Also try to enforce CRAM files in each step.
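As a rough idea of what the replacement for bamaddrg could look like, here is a sketch that builds a Picard AddOrReplaceReadGroups command line. It assumes a `picard` wrapper script is on the PATH (as installed by bioconda); the function name and all read-group values are placeholders, not values taken from this pipeline.

```python
from typing import List


def add_read_groups_cmd(
    src: str,
    dst: str,
    sample: str,
    rgid: str = "1",
    library: str = "lib1",
    platform: str = "ILLUMINA",
    unit: str = "unit1",
) -> List[str]:
    """Build a Picard AddOrReplaceReadGroups command (legacy KEY=VALUE syntax)."""
    return [
        "picard",
        "AddOrReplaceReadGroups",
        f"I={src}",          # input alignment file
        f"O={dst}",          # output alignment file
        f"RGID={rgid}",      # read group ID
        f"RGLB={library}",   # read group library
        f"RGPL={platform}",  # sequencing platform
        f"RGPU={unit}",      # platform unit
        f"RGSM={sample}",    # sample name
    ]


# Hypothetical file and sample names, for illustration only.
print(" ".join(add_read_groups_cmd("sample.bam", "sample.rg.bam", sample="sample1")))
```

Whether this step can run directly on CRAM would still need to be checked against the module's meta.yml, so the sketch sticks to BAM input and output.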
Use samtools view to convert CRAM into BAM -> CRAM conversion with the SAMTOOLS/CRAM module:
freebayes using CRAM files
samtools view
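For the freebayes step, a minimal sketch of calling variants directly on a CRAM file might look like the following. It assumes a freebayes build that can read CRAM (the nf-core module lists CRAM as an accepted input) and uses only the `-f` and `--vcf` options; the function name and the file names are hypothetical.

```python
from typing import List


def freebayes_cmd(alignment: str, reference: str, vcf: str) -> List[str]:
    """Build a freebayes command that reads one alignment file (BAM or CRAM)."""
    return [
        "freebayes",
        "-f", reference,  # reference FASTA, also needed to decode CRAM
        "--vcf", vcf,     # write VCF to this file instead of stdout
        alignment,
    ]


# Hypothetical file names, for illustration only.
print(" ".join(freebayes_cmd("sample.cram", "genome.fa", "sample.vcf")))
```

Because the reference FASTA is required by freebayes anyway, switching the input from BAM to CRAM does not add any extra input to this step.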