maxplanck-ie / snakepipes

Customizable workflows based on snakemake and python for the analysis of NGS data
http://snakepipes.readthedocs.io

error in samtools sort #1021

Open sunta3iouxos opened 1 month ago

sunta3iouxos commented 1 month ago

Hi all, I got an interesting and unexpected error. Unfortunately, I cannot retrieve the BAM file to see what it is about:

samtools sort: failed to read header from "-"

the log:

---- This analysis has been done using snakePipes version 2.8.0 ----
Building DAG of jobs...
Falling back to greedy scheduler because no default solver is found for pulp (you have to install either coincbc or glpk).
Using shell: /bin/bash
Provided cores: 24
Rules claiming more threads will be scaled down.
Provided resources: mem_mb=1000, disk_mb=1000
Select jobs to execute...

[Mon Jul 22 20:09:09 2024]
rule bwa:
    input: FASTQ_fastp/A006200402_224949_S7_L000_R1_001.fastq.gz, FASTQ_fastp/A006200402_224949_S7_L000_R2_001.fastq.gz
    output: bwa/A006200402_224949_S7_L000.bwa_summary.txt, bwa/A006200402_224949_S7_L000.sorted.bam
    log: bwa/logs/A006200402_224949_S7_L000.sort.log
    jobid: 0
    reason: Missing output files: bwa/A006200402_224949_S7_L000.bwa_summary.txt, bwa/A006200402_224949_S7_L000.sorted.bam
    wildcards: sample=A006200402_224949_S7_L000
    threads: 8
    resources: mem_mb=1000, disk_mb=1000, tmpdir=/scratch/tgeorgom_temp

            TMPDIR=/scratch/tgeorgom/temp/
            MYTEMP=$(mktemp -d ${TMPDIR:-/tmp}/snakepipes.XXXXXXXXXX);
            bwa mem             -t 8             -R '@RG\tID:A006200402_224949_S7_L000\tDS:A006200402_224949_S7_L000\tPL:ILLUMINA\tSM:A006200402_224949_S7_L000'               FASTQ_fastp/A006200402_224949_S7_L000_R1_001.fastq.gz FASTQ_fastp/A006200402_224949_S7_L000_R2_001.fastq.gz |             samtools view -Sb - |             samtools sort -m 2G -@ 2 -O bam - > bwa/A006200402_224949_S7_L000.sorted.bam 2> bwa/logs/A006200402_224949_S7_L000.sort.log;
            rm -rf $MYTEMP
            samtools flagstat bwa/A006200402_224949_S7_L000.sorted.bam

And this is the command:

nohup DNA-mapping -i /scratch/tgeorgom/bastet2.ccg.uni-koeln.de/downloads/NGS_CHN02_cnikopoulou_A006200402/ -o /scratch/tgeorgom/CH02/mm10_gencodeM19 --fastqc --trim --trimmer fastp --trimmerOptions "--trim_poly_g --trim_poly_x -Q -L --correction" --dedup --plotFormat "pdf" --mapq 2 -j 50 --dedup --aligner bwa --verbose --snakemakeOptions='--rerun-incomplete' --insertSizeMax 5000 mm10_gencodeM19_spikesTEST > nohup_CH02.txt &

sunta3iouxos commented 1 month ago

I tested the same command with bowtie2, and it no longer throws this error. So it appears to be a bwa-specific issue that might be related to -R '@RG\tID:A006200402_224949_S7_L000\tDS:A006200402_224949_S7_L000\tPL:ILLUMINA\tSM:A006200402_224949_S7_L000'

katsikora commented 1 month ago

Hi,

this error message from samtools means that the alignment has failed, i.e. no proper BAM was produced (and piped into samtools). Do you mean that running the workflow with --aligner bowtie2 didn't reproduce the error?

Best wishes,

Katarzyna
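
The explanation above can be illustrated without bwa or samtools. The following is a minimal Bash sketch, not the snakePipes rule itself: `failing_aligner` and `consumer` are hypothetical stand-ins for `bwa mem` and `samtools sort -`. It shows that when an upstream stage exits with an error and writes nothing to stdout, the downstream stage receives empty input, and the pipeline still reports success unless `pipefail` is set or `PIPESTATUS` is inspected.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for a failing aligner: nothing on stdout, non-zero exit.
failing_aligner() { echo "aligner: could not start" >&2; return 1; }

# Hypothetical stand-in for a tool reading stdin (like `samtools sort -`).
consumer() {
    local n
    n=$(wc -c | tr -d ' ')
    echo "consumer received ${n} bytes"
}

# Default behaviour: the pipeline's exit status is that of the LAST stage only.
failing_aligner | consumer
plain_status=$?
echo "exit status without pipefail: ${plain_status}"

# PIPESTATUS holds one exit status per stage; read it right after the pipeline.
failing_aligner | consumer
stage_status="${PIPESTATUS[*]}"
echo "per-stage exit statuses: ${stage_status}"

# With pipefail, a failure in any stage makes the whole pipeline fail.
set -o pipefail
pipefail_status=0
failing_aligner | consumer || pipefail_status=$?
set +o pipefail
echo "exit status with pipefail: ${pipefail_status}"
```

In the logged rule, `2>` redirects only the stderr of the final `samtools sort` into the `.sort.log` file, so any message printed by `bwa mem` itself would end up elsewhere, presumably in the Snakemake or nohup output.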

sunta3iouxos commented 1 month ago

Exactly, I used the same DNA-mapping command and only changed the --aligner entry to bowtie2. Then the whole thing worked.
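
One detail in the logged command may be worth checking, although nothing in this thread confirms it as the cause: `bwa mem` expects an index prefix as its first positional argument, followed by the FASTQ files, and the command shown in the log lists only the two FASTQ files after `-R`. The sketch below assumes the index prefix configured for the genome is known; `check_bwa_index` is a hypothetical helper, not part of snakePipes or bwa. It tests that the five files written by `bwa index` (`.amb`, `.ann`, `.bwt`, `.pac`, `.sa`) exist and are non-empty for a given prefix. The demo uses placeholder files in a temporary directory, not a real index.

```shell
#!/usr/bin/env bash
# Hypothetical helper: returns 0 if all files written by `bwa index` exist
# for the given prefix, 1 if any is missing or empty, 2 if the prefix is empty.
check_bwa_index() {
    local prefix=$1 ext missing=0
    if [ -z "$prefix" ]; then
        echo "empty index prefix" >&2
        return 2
    fi
    for ext in amb ann bwt pac sa; do
        if [ ! -s "${prefix}.${ext}" ]; then
            echo "missing or empty: ${prefix}.${ext}" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Demo with placeholder files (assumption: stands in for a real index directory).
demo_dir=$(mktemp -d)
for ext in amb ann bwt pac sa; do
    echo placeholder > "${demo_dir}/genome.${ext}"
done

complete_status=0
check_bwa_index "${demo_dir}/genome" || complete_status=$?

rm "${demo_dir}/genome.sa"
incomplete_status=0
check_bwa_index "${demo_dir}/genome" 2>/dev/null || incomplete_status=$?

empty_status=0
check_bwa_index "" 2>/dev/null || empty_status=$?

rm -rf "$demo_dir"
echo "complete=${complete_status} incomplete=${incomplete_status} empty=${empty_status}"
```

Running the same check against the real index prefix for the genome in question would show whether bwa has a usable index to work with.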