cnr-ibba / nf-resequencing-mem

Nextflow resequencing pipeline with bwa-mem and freebayes
MIT License

:sparkles: deal with CRAM alignments #9

Closed bunop closed 6 months ago

bunop commented 2 years ago

Use samtools view to convert BAM -> CRAM, with the SAMTOOLS/CRAM module:
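As a rough illustration of the conversion itself, here is a minimal Python sketch that only builds the `samtools view` command line for either direction. The helper name `samtools_convert_cmd` and the file names are hypothetical and not part of this pipeline; the flags used (`-C` for CRAM output, `-b` for BAM output, `-T` for the reference FASTA, `-o` for the output file) are standard `samtools view` options.

```python
import shlex


def samtools_convert_cmd(src, dst, reference):
    """Build a `samtools view` command converting between BAM and CRAM.

    Hypothetical helper: the output format is picked from the extension
    of `dst`. The command is only built here, not executed.
    """
    if dst.endswith(".cram"):
        fmt_flag = "-C"  # write CRAM
    elif dst.endswith(".bam"):
        fmt_flag = "-b"  # write BAM
    else:
        raise ValueError(f"unsupported output extension: {dst}")
    return [
        "samtools", "view", fmt_flag,
        "-T", reference,  # reference FASTA used to encode/decode CRAM
        "-o", dst,
        src,
    ]


# Example with placeholder file names (assumptions, not pipeline outputs)
print(shlex.join(samtools_convert_cmd("sample.bam", "sample.cram", "genome.fasta")))
# prints: samtools view -C -T genome.fasta -o sample.cram sample.bam
```

The same reference FASTA has to be available wherever the CRAM is read back, which is why it is passed explicitly in both directions.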

bunop commented 2 years ago

Some comments from the nextflow community (slack #help):

We are working to convert sarek to use CRAM files only (and save BAM files optionally). If you want to take a look, see the dev branch in sarek. Quite a few modules that we need there now allow for CRAM files in nf-core/modules. Generally we try to stick with one module regardless of whether BAM or CRAM is used. The reference is usually passed as an optional input. Some modules off the top of my head would be some of the samtools modules, some of the gatk4 modules, strelka, freebayes, and manta. You should be able to find out whether CRAM is accepted as input in the meta.yml of each module. The exact step where you want to start using CRAM depends on your use case. I benchmarked it for sarek and found that sticking with BAM after mapping and letting duplicate marking take care of the conversion to CRAM is faster. But I think that really depends on where you want to use it.

bunop commented 2 years ago

nf-core/sarek converts BAM into CRAM by default, and converts back into BAM only if the user needs BAM files. In this pipeline, the BAMADDRG step works only on BAM, and PICARD_MARKDUPLICATES seems to have problems with CRAM files. Given this, I could convert into CRAM only just before the freebayes call, which would not save much space.

bunop commented 10 months ago

Try replacing bamaddrg with Picard AddOrReplaceReadGroups. Also try to enforce CRAM files in each step.
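For reference, a minimal Python sketch of what the Picard call might look like. It only assembles the command line, using Picard's legacy `KEY=VALUE` argument syntax. The helper name, the default read-group values, and the file names are all hypothetical; the `picard` wrapper executable is assumed to be on the PATH (as in a conda install). Whether AddOrReplaceReadGroups handles CRAM input and output reliably in this pipeline is untested here.

```python
import shlex


def add_or_replace_rg_cmd(src, dst, sample, reference=None,
                          rgid="1", library="lib1",
                          platform="ILLUMINA", unit="unit1"):
    """Build a Picard AddOrReplaceReadGroups command (not executed).

    Hypothetical helper: the read-group defaults are placeholders and
    should be replaced with real library/platform-unit values.
    """
    cmd = [
        "picard", "AddOrReplaceReadGroups",
        f"I={src}",
        f"O={dst}",
        f"RGID={rgid}",
        f"RGLB={library}",
        f"RGPL={platform}",
        f"RGPU={unit}",
        f"RGSM={sample}",
    ]
    if reference is not None:
        # Assumption: a reference FASTA is needed when reading/writing CRAM
        cmd.append(f"REFERENCE_SEQUENCE={reference}")
    return cmd


# Example with placeholder names (assumptions, not pipeline outputs)
print(shlex.join(add_or_replace_rg_cmd(
    "sample.cram", "sample.rg.cram", "sample", reference="genome.fasta")))
```

If Picard turns out to have the same CRAM problems noted above for PICARD_MARKDUPLICATES, the read-group step could stay on BAM and the conversion could happen afterwards.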